-
As discussed in issue [https://github.com/unslothai/unsloth/issues/154#issue-2119969174](https://github.com/unslothai/unsloth/issues/154#issue-2119969174), I am also working with an extended tokenizer to accommodate words of a new language. I've merged Llama 3.2 to…
-
### 🐛 Describe the bug
There is no fake implementation or meta kernel for the Communication Operator. If I want to contribute to this feature, what can I do? Are there any examples that I can refer…
-
Hyperparameters for full-parameter fine-tuning:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 NPROC_PER_NODE=8 swift sft \
--model_type qwen2-vl-7b-instruct \
--model_id_or_path /data/seeclick/pretrain_v3_qwen2vl/Qwen2-VL-7B-Instruct \
…
-
Hi, thank you for the amazing work! When I run `python train_nhp.py --experiment_id IntensityFree_train`, it only prints `loglike` and `num_events`, like this:
INFO: [ Epoch 450 (train) ]:…
-
I'm just going through the tutorial and this is not working for me:
`pip install ydf -U`
```
# Load libraries
import ydf # Yggdrasil Decision Forests
import pandas as pd # We use Pandas to load s…
-
![image](https://github.com/user-attachments/assets/ba16afd1-e0bf-420c-986d-ce89d97694bc)
-
Hi, I am using DeepSpeed ZeRO-3 to fine-tune the Flux model with the script `flux_train_network.py`.
```
flux = accelerator.unwrap_model(flux)
print(f"flux - {flux.state_dict()['single_blocks.7.linea…
-
Hello, thank you for sharing such an amazing code repository. I ran into a few issues while using it to train on my own dataset. My training set has ten thousand images, with only one tar…
-
# 🐛 Bug
Fantasizing (conditioning the model on new data points) makes the model impossible to export to TorchScript or trace with JIT. Models cannot be JIT-traced or exported to TorchScript once `get_fan…
-
After running the preprocessing steps described in the README.md file, when I run `bash train.sh` on my custom dataset, which is formatted in the mvtec-ad directory structure, I get the following resul…