-
here is my code:
from torch_ema import ExponentialMovingAverage
model = ...
optimizer = ...
scheduler = ...
ema_model = ExponentialMovingAv…
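The snippet above is cut off, but the idea behind `torch_ema`'s `ExponentialMovingAverage` is a shadow copy of the parameters updated as `shadow = decay * shadow + (1 - decay) * param`. A framework-free sketch of that update (the function name and decay values here are hypothetical, not from the post):

```python
# Minimal pure-Python sketch of an exponential moving average (EMA)
# over model parameters. This mirrors the update rule used by EMA
# helpers such as torch_ema; the decay values are illustrative only.

def ema_update(shadow, params, decay=0.995):
    """Return updated shadow values: decay * shadow + (1 - decay) * param."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]

params = [1.0, 2.0]   # current model parameters
shadow = [0.0, 0.0]   # EMA copy, initialized separately here
shadow = ema_update(shadow, params, decay=0.5)
print(shadow)  # [0.5, 1.0]
```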
-
@tomaarsen Hello Tom, I hope you are doing well.
I am trying to enable DeepSpeed in the Sentence Transformers training arguments via `deepspeed="deepspeed_config.json"`, and I have also tried an accelerate config, but it'…
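For context, a minimal `deepspeed_config.json` of the kind that HF-style trainers accept might look like the sketch below; the specific keys and the `"auto"` placeholders are assumptions about a typical ZeRO stage 2 setup, not taken from the post:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "fp16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true
  }
}
```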
-
Thanks for the excellent work! I have a question about the training hardware for text2image. The paper says it was trained on a single A100, but the settings in Table 15 seem to require more than 640GB of mem…
-
Hello,
I'm working on a multi-class classification model and the results look good. However, when I train the model after calling model.prune(), train_loss and test_loss easily become NaN, even if I set lr and ste…
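One common mitigation when a loss turns non-finite after pruning (independent of any particular framework) is to skip the optimizer step whenever the loss is NaN or inf. A hedged pure-Python sketch of that guard, with hypothetical names standing in for the real training-loop pieces:

```python
import math

def safe_step(loss, apply_update):
    """Apply the update only if the loss is finite; otherwise skip the step.

    `apply_update` stands in for optimizer.step() in a real training loop;
    both names here are hypothetical.
    """
    if not math.isfinite(loss):
        return False  # skip: loss is NaN or inf, e.g. right after pruning
    apply_update()
    return True

steps_taken = []
safe_step(0.37, lambda: steps_taken.append("step"))       # finite -> applied
safe_step(float("nan"), lambda: steps_taken.append("x"))  # NaN -> skipped
print(steps_taken)  # ['step']
```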
-
Hi, while executing:
`torchrun --nproc_per_node gpu -m sae meta-llama/Meta-Llama-3-8B --distribute_modules --batch_size 1 --layers 24 25 --grad_acc_steps 8 --ctx_len 2048 --k 192 --load_in_8bit --mic…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
Train
### Bug
…
-
I'm working on a multi-task classification model with DistilBERT and 4 labels, based on your repo, and I was wondering if you could help me, since I'm having a hard time trying to reach the Hugging F…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
code:
```python
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments
import pyreft
from huggingface_hub import login

login(token="***")
model_n…
```
-
This extension is much more efficient and simpler to use than kohya; I like it a lot!
However, I am having a frequent issue where it fills up memory right before training starts. After the following…