-
Hello everyone, I'm encountering a memory issue while fine-tuning a 7B model (such as Mistral) using a repository I found. Despite having 6 H100 GPUs at my disposal, I run into out-of-memory errors wh…
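As a back-of-the-envelope sanity check (a sketch, not a measurement of any particular repository), the memory needed for full fine-tuning with Adam can be estimated; the byte counts below are the usual mixed-precision figures:

```python
# Rough memory estimate for fully fine-tuning a model with Adam in mixed
# precision (bf16 weights/gradients, fp32 optimizer states). These are the
# standard textbook byte counts, not measured values.

def full_finetune_gib(n_params, n_gpus=1, zero3=False):
    weights = 2 * n_params            # bf16 parameters
    grads = 2 * n_params              # bf16 gradients
    optim = (4 + 4 + 4) * n_params    # fp32 master copy + Adam m and v
    total = weights + grads + optim   # 16 bytes per parameter overall
    if zero3:
        total /= n_gpus               # ZeRO-3 shards all three across GPUs
    return total / 2**30

# 7B parameters on a single GPU: ~104 GiB, more than one 80 GB H100,
# which is why naive fine-tuning goes out of memory without sharding.
print(round(full_finetune_gib(7e9), 1))

# Sharded across 6 GPUs with ZeRO-3: ~17 GiB per GPU, before activations.
print(round(full_finetune_gib(7e9, n_gpus=6, zero3=True), 1))
```

Activations, CUDA context, and fragmentation come on top of this, so gradient checkpointing and a small per-device batch size are usually still needed.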
-
Hello, after I specify a local path, the model still tries to download from HuggingFace, and the run fails with an error. I'd like to know which setting is wrong. I want to train llava-llama3-8b.
- Command: `NPROC_PER_NODE=1 xtuner train llava_llama3_8b_instruct_quant_clip_vit_large_p14_336_e1_gpu1_pretrain --deepspeed deepspe…
-
Hello and many thanks for your ffmpeg library for Android.
After reading about the new 64-bit requirement for native libraries on Android, I noticed that your library is stored not in Libs but in Asse…
-
Full Flux support will take some time to develop, since the better way to load it may be through a separate UNET; that will complicate the UI, which will need to let the CLIP, T5, and VAE be defined separately,…
-
**Describe the bug**
Hello, can someone help? I'm using v0.14.3, installed from the source tar.gz: https://github.com/melMass/DeepSpeed/releases
I'm using DeepSpeed ZeRO-3 and training LLama Factory KT…
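For reference, a minimal ZeRO-3 config of the kind passed to DeepSpeed via `--deepspeed` (the field names are standard DeepSpeed options; the values here are illustrative, not the reporter's actual config):

```json
{
  "train_batch_size": "auto",
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```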
-
As reported in https://github.com/ggerganov/llama.cpp/issues/6944#issuecomment-2101577066
The llama.cpp tokenizers give different results than HF for old GGUF files.
This is a subtle footgun and…
-
## Describe the bug
When using a locally loaded GGUF model with mistral.rs in Rust applications, the memory allocated for the model is not being released properly after the model object goes out of s…
-
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f648' in position 0: character maps to &lt;undefined&gt;
During handling of the above exception, another exception occurred:
+-----------------…
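This error can be reproduced in isolation: the character `'\U0001f648'` (🙈) has no mapping in cp1252, the Windows "charmap" codec, so encoding it fails exactly as in the traceback above. A minimal sketch, with common workarounds:

```python
# U+1F648 has no entry in the cp1252 character map, so encoding raises
# the UnicodeEncodeError shown in the traceback.
emoji = "\U0001f648"
try:
    emoji.encode("cp1252")
except UnicodeEncodeError as exc:
    print(exc)  # 'charmap' codec can't encode character ... <undefined>

# Workaround 1: replace unmappable characters on output.
print(emoji.encode("cp1252", errors="replace"))  # b'?'

# Workaround 2: emit UTF-8 instead (e.g. set PYTHONIOENCODING=utf-8
# before launching Python so stdout uses UTF-8).
print(emoji.encode("utf-8"))
```

This typically surfaces when printing emoji to a Windows console whose default encoding is a legacy code page rather than UTF-8.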
-
### System Info
Since it is possible to have multiple Peft adapters in the same model, it should also be possible to resume training of such models from a checkpoint with [transformers.Trainer.train(…
-
### System Info
- `transformers` version: 4.38.2
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.20.3
- Safetensors version: 0.4.2
- Accele…