-
**Description**
Checkpoint/restore inside Docker is slow.
(Apologies if this is the wrong place to report, but even a closed bug report about this would have saved me quite some time.)
**Step…
-
Could you please provide the process for reproducing the training of the 'cousin_ckpt.pth' and 'twin_ckpt.pth' files? Thank you.
-
Hi! I get this issue by following the inference notebook:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for…
-
Hi, can you provide your checkpoints?
-
### System Info
- GPU: NVIDIA H100 80G
- TensorRT-LLM branch main
- TensorRT-LLM commit: 535c9cc6730f5ac999e4b1cb621402b58138f819
### Who can help?
@Tracin
### Information
- [x] The official e…
-
Thank you for your excellent work on this project! I was wondering if you could kindly release the trained model checkpoint. It would be incredibly helpful for further exploration and application of y…
-
I tried to load LoRA training adapters from a DeepSpeed checkpoint:
dir:
```
ls Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000
total 696M
-rw-r--r-- 1 schwan46494@gmail.c…
-
The workload is OpenSBI and Image, not bbl. The checkpoint is a kernel 6.12 (GCV) and SPEC CPU 2017 (GC) benchmark.
When executing the write system call, the kernel performs a vector copy and the following error …
-
Could you provide documentation on running inference with a finetuned CogVideoX-Fun model? My finetune produced output in a format other than diffusers, and it seems like inference only supports the ori…
-
I downloaded the checkpoint from Hugging Face and provided the path to it:
checkpoint = timesfm.TimesFmCheckpoint(
version="pytorch",
path=r"Downloads\checkpoint",
#huggingface_repo_id=…