-
**Is your feature request related to a problem? Please describe.**
LAMB optimizer does not support BF16 training.
When I used the LAMB optimizer with BF16 training, I encountered the error
```
DeepSpeed …
```
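As a hedged workaround sketch (not a confirmed fix from the DeepSpeed team): while LAMB lacks BF16 support, training can still run in BF16 with a supported optimizer such as AdamW. The batch size and learning rate below are illustrative, not taken from the issue:

```python
import json

# Hypothetical DeepSpeed config enabling BF16 with AdamW instead of LAMB.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,          # illustrative value
    "bf16": {"enabled": True},                    # BF16 mixed precision
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# The dict can be dumped to a ds_config.json file for the launcher.
print(json.dumps(ds_config, indent=2))
```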
-
-
### Describe the bug
![Screenshot_11](https://user-images.githubusercontent.com/13344308/206963169-f052cb4f-00db-4dc9-a375-78a9292a1fd6.jpg)
**My launch_inpaint.sh**
```
export LD_LIBRARY_PATH=/u…
```
-
Dear @salman-h-khan ,
Thanks for your fantastic work on GeoChat; I am really interested in it, and the ckpt you provided works for me.
However, when I tried to reproduce it as a beginner of the …
-
Two questions regarding Llama 2 fine-tuning:
1. it seems the prompt template defaults to `vicuna` and cannot be overwritten, according to the following code:
https://github.com/lm-sys/FastChat/blob/…
-
Hi,
I have used a dataset similar to the one in image_classification_albumentations.ipynb and reused the notebook code completely, but model training fails with Target size (torch.Size([32, 224, 224, 3])) mus…
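A shape of `(32, 224, 224, 3)` suggests a channels-last (NHWC) batch, while PyTorch models and losses expect channels-first (NCHW). A minimal sketch of the mismatch and the fix, using a hypothetical zero-filled batch in place of the real dataset:

```python
import numpy as np

# Albumentations/PIL pipelines typically yield channels-last arrays:
# (batch, height, width, channels), matching the error's reported shape.
batch_hwc = np.zeros((32, 224, 224, 3), dtype=np.float32)

# Move the channel axis to position 1 before feeding a PyTorch model.
batch_chw = batch_hwc.transpose(0, 3, 1, 2)

print(batch_chw.shape)  # (32, 3, 224, 224)
```

In an albumentations pipeline, appending the `ToTensorV2` transform performs this HWC-to-CHW conversion automatically, so a missing `ToTensorV2` is a common cause of this error.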
-
Hi, I'm sincerely glad that you shared your great work!
I tried to reimplement the training logic of CAV but ran into some trouble.
Can you take a look at what might be the problem?
train.py:
```…
-
I get the error below when I run the training cell in Colab (FineTuning_colab.ipynb).
I also ran the "Training parameters" cell, and all parameters were parsed.
No LSB modules are available.
Description: Ubuntu 20.04.…
-
Specifically, sdxl_train vs. sdxl_train_network.
I have compared the trainable params; they are the same, and the training params are the same.
As a result, batch size 10 --> 4, otherwise a gpu memor…
-
### Feature request
Currently when we enable gradient checkpointing, e.g. in `LlamaModel`, we call `torch.utils.checkpoint.checkpoint` on every `LlamaDecoderLayer`. As per [Training Deep Nets with Su…
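The sublinear-memory schedule from the cited paper checkpoints groups of roughly sqrt(n) layers rather than every single layer, trading O(n) activation storage for O(sqrt(n)). A minimal sketch of the grouping logic (a hypothetical helper, not the Transformers API):

```python
import math

def checkpoint_segments(num_layers):
    """Split num_layers decoder layers into ~sqrt(num_layers) groups.

    Each group would be recomputed as one unit in the backward pass,
    so only one activation boundary per group needs to be stored.
    Hypothetical helper for illustration only.
    """
    n_seg = max(1, round(math.sqrt(num_layers)))
    size = math.ceil(num_layers / n_seg)
    return [list(range(i, min(i + size, num_layers)))
            for i in range(0, num_layers, size)]

print(checkpoint_segments(32))
```

In practice, each returned group would be wrapped in a single `torch.utils.checkpoint.checkpoint` call instead of one call per `LlamaDecoderLayer`.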