-
### Describe the issue
Issue:
I have fine-tuned the Mistral LLaVA model with a sample dataset and the training went well.
Here are the training commands:
```
deepspeed llava/train/train_mem.py -…
```
-
I am trying to finetune Qwen-2.5 Coder-7B-Instruct on my custom dataset but am getting the following error:
```
ValueError: Unsloth: Untrained tokens of [[]] found, but embed_tokens & lm_head not t…
```
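If it helps with triage: this Unsloth check fires when the tokenizer contains tokens whose embeddings were never trained while `embed_tokens` and `lm_head` are frozen. A minimal sketch of the commonly suggested workaround follows, assuming the Hugging Face repo id `Qwen/Qwen2.5-Coder-7B-Instruct` and placeholder LoRA hyperparameters; none of these values come from the report above.
```
# Sketch only: make embed_tokens and lm_head trainable so newly added tokens
# get real embeddings. All hyperparameters here are placeholders.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-Coder-7B-Instruct",  # assumed repo id
    max_seq_length=4096,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",  # include these so the untrained tokens can be learned
    ],
    use_gradient_checkpointing="unsloth",
)
```
Making the embedding and output matrices trainable raises memory use noticeably, so a smaller batch size (or removing the offending added tokens instead) may be needed.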
-
### Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find an [open issue](ht…
-
### What is the issue?
Mixtral 8x22b instruct outputs are either empty or gibberish.
I have tried various quantizations: q4, q4_k_m, q5, etc. All seem problematic.
Other models (e.g., llama3, com…
-
All fine-tunes for Mistral 7B using SageMaker JumpStart are currently failing with:
"ImportError: cannot import name 'insecure_hashlib' from 'huggingface_hub.utils' (/opt/conda/lib/python3.10/site-…
-
### Describe the bug
I'm experiencing unexpected behavior when trying to load the following model:
Model name: Mistral-Large-Instruct-2407-IMat-GGUF
Quantization: Q6_K, size 100.59GB
When…
-
It might be useful to have a workbook for TripleO users which does the following:
1. Uploads a directory of playbooks to a Swift container (and runs them from there)
2. Builds the Ansible Inventor…
-
**Is your feature request related to a problem? Please describe.**
Tried to run a custom 40B model whose weights can be loaded within two 80GB GPUs' VRAM.
lmcache is able to load small models within sin…
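For comparison, a minimal sketch of how a 40B checkpoint is normally sharded across the two GPUs with plain vLLM tensor parallelism; the model path, prompt, and sampling settings are placeholders, and the LMCache connector configuration that actually fails here is deliberately left out.
```
# Sketch only: shard the weights across both 80GB GPUs with vLLM tensor parallelism.
# Paths and parameters are placeholders, not taken from the report.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/path/to/custom-40b-model",  # hypothetical local checkpoint
    tensor_parallel_size=2,             # split the weights across the two GPUs
)

outputs = llm.generate(
    ["Describe what a KV cache stores."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```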
-
env lib detail:
inf2.24xlarge
ubuntu@ip-172-31-12-212:~/vllm$ pip list|grep -i neuron
aws-neuronx-runtime-discovery 2.9
libneuronxla 2.0.755
neuro…
-
First, thanks for your great work. I've tried to fine-tune the Yi-Coder-9B-Chat model on my own dataset, but I ran into the following problems.
## Problems
'grad_norm' becomes nan when I try to finetune t…
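Not a fix for this report, but a sketch of the mitigations usually tried first when grad_norm turns NaN during fine-tuning: prefer bf16 over fp16, lower the learning rate, and clip gradients. Every value below is a placeholder rather than the configuration used in the issue.
```
# Sketch of common NaN mitigations for fine-tuning; all values are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="yi-coder-9b-sft",  # placeholder output path
    bf16=True,                     # fp16 overflow is a frequent cause of nan grad_norm
    fp16=False,
    learning_rate=1e-5,            # assumed conservative value
    max_grad_norm=1.0,             # clip exploding gradients
    warmup_ratio=0.03,
    logging_steps=10,
)
```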