philschmid / deep-learning-pytorch-huggingface
MIT License · 570 stars · 133 forks
Issues (sorted: newest first)
| #   | Title | Author | Status | Comments |
|-----|-------|--------|--------|----------|
| #56 | Quantization question: | aptum11 | opened 2 weeks ago | 0 |
| #55 | Not able to run training/fsdp-qlora-distributed-llama3.ipynb | aasthavar | closed 2 weeks ago | 7 |
| #54 | St | philschmid | closed 1 month ago | 0 |
| #53 | Clean up some typos; simplify some code; update some comments | tomaarsen | closed 1 month ago | 0 |
| #52 | Deprecation warnings. | hohoCode | opened 1 month ago | 0 |
| #51 | Fine-tune-llm-in-2024-with-trl.ipynb not producing the outputs | scigeek72 | opened 1 month ago | 0 |
| #50 | Fsdp qlora | philschmid | closed 2 months ago | 0 |
| #49 | Out of Memory: Cannot reproduce T5-XXL run on 8xA10G. | slai-natanijel | opened 3 months ago | 3 |
| #48 | What's the use of "messages" in dpo step? | katopz | opened 3 months ago | 0 |
| #47 | question about DeepSpeedPeftCallback | mickeysun0104 | opened 4 months ago | 0 |
| #46 | Gemma | philschmid | closed 4 months ago | 0 |
| #45 | Re. fine-tune-llms-in-2024-with-trl.ipynb | andysingal | opened 4 months ago | 1 |
| #44 | Dpo | philschmid | closed 4 months ago | 0 |
| #43 | Target modules all-linear not found in the base model. | kassemsabeh | closed 5 months ago | 6 |
| #42 | Commit Version Bug Fix | YanSte | closed 5 months ago | 1 |
| #41 | Trl | philschmid | closed 5 months ago | 0 |
| #40 | flash attention error on instruction tune llama-2 tutorial on Sagemaker notebook | matthewchung74 | opened 8 months ago | 2 |
| #39 | Precision Issue | zihaohe123 | opened 9 months ago | 4 |
| #38 | Falcon-180B "forward() got an unexpected keyword argument 'position_ids'" | aittalam | opened 9 months ago | 0 |
| #37 | Does this work for Llama2 - Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA & Flash Attention? | ibicdev | opened 9 months ago | 11 |
| #36 | Ds lora | philschmid | closed 9 months ago | 0 |
| #35 | Instruction tuning of LLama2 is significantly slower compared to documented 3 hours fine-tuning time on A10G. | mlscientist2 | opened 9 months ago | 1 |
| #34 | Compute metrics while using SFT trainer | shubhamagarwal92 | opened 9 months ago | 1 |
| #33 | Cannot load tokenizer for llama2 | smreddy05 | closed 9 months ago | 1 |
| #32 | LLama 2 Flash Attention Patch Not Working For 70B | mallorbc | opened 10 months ago | 6 |
| #31 | Gptq | philschmid | closed 10 months ago | 0 |
| #30 | Fix Flash Attention forward for Llama-2 70b | davidmrau | closed 10 months ago | 8 |
| #29 | Is the DataCollator necessary in peft-flan-t5-int8-summarization.ipynb ? | brooksbp | opened 10 months ago | 0 |
| #28 | question about the llama instruction code | yeontaek | closed 10 months ago | 8 |
| #27 | improvements | philschmid | closed 11 months ago | 0 |
| #26 | Llama patch for FlashAttention support fails with use_cache | qmdnls | opened 11 months ago | 2 |
| #25 | How to create a json file for create_flan_t5_cnn_dataset.py | andysingal | opened 11 months ago | 1 |
| #24 | gcc/cuda used for training | danyaljj | opened 11 months ago | 1 |
| #23 | fix | philschmid | closed 11 months ago | 0 |
| #22 | Flash attention | philschmid | closed 11 months ago | 4 |
| #21 | Falcon int4 | philschmid | closed 11 months ago | 0 |
| #20 | added container image for training | philschmid | closed 1 year ago | 0 |
| #19 | CPU offload when not using offload deepspeed config file | siddharthvaria | opened 1 year ago | 3 |
| #18 | Error when training peft model example | Tachyon5 | opened 1 year ago | 6 |
| #17 | Colab notebook fails | TzurV | closed 1 year ago | 1 |
| #16 | CUDA OOM error while saving the model | aasthavar | closed 1 year ago | 10 |
| #15 | Does deepspeed partition the model to multi GPUs? | vikki7777 | opened 1 year ago | 4 |
| #14 | ValueError | Martok10 | opened 1 year ago | 4 |
| #13 | Inference on CNN validation set takes 2+ hours on p4dn.24xlarge machine with 8 A100s, 40GB each | sverneka | opened 1 year ago | 5 |
| #12 | FLAN-T5 XXL using DeepSpeed fits well for training but gives OOM error for inference. | irshadbhat | opened 1 year ago | 2 |
| #11 | Sample inference script for FLAN-T5 XXL using DeepSpeed & Hugging Face. | irshadbhat | closed 1 year ago | 7 |
| #10 | Error when finetuning Flan-T5-XXL on custom dataset | ngun7 | opened 1 year ago | 1 |
| #9  | Peft flan | philschmid | closed 1 year ago | 0 |
| #8  | fix small bugs of deepseed-flan-t5-summarization.ipynb | yao-matrix | closed 1 year ago | 2 |
| #7  | Error (return code -7) when finetuning FLANT5-xxl on 8* A100 | scofield7419 | opened 1 year ago | 3 |