jina-ai / jerboa

LLM finetuning
Apache License 2.0
41 stars 4 forks

Add support for deep speed #119

Closed alaeddine-13 closed 1 year ago

JohannesMessner commented 1 year ago

DeepSpeed fails to run on our GPU machine because the fused_adam op cannot be compiled, either in JIT mode or in pre-compiled mode. I tried various versions of DeepSpeed and various versions of PyTorch without success. The only remaining variable I can think of at this point is the CUDA/NVVM version installed on our machine.
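For anyone picking this up later, here is a minimal sketch of how the CUDA toolkit hypothesis could be checked. It is not what was run on our machine. It assumes `nvcc` is on the `PATH` and that comparing major versions is a reasonable first test; the helper names (`parse_nvcc_release`, `same_major`, `installed_nvcc_release`) are made up for illustration.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'X.Y' toolkit version from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def same_major(version_a: str, version_b: str) -> bool:
    """Return True if two 'X.Y' version strings share the same major version."""
    return version_a.split(".")[0] == version_b.split(".")[0]


def installed_nvcc_release() -> Optional[str]:
    """Ask the local nvcc for its release version; None if nvcc is not found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    try:
        import torch

        # CUDA version PyTorch was built against; None for CPU-only builds.
        torch_cuda = torch.version.cuda
    except ImportError:
        torch_cuda = None

    nvcc_release = installed_nvcc_release()
    print(f"PyTorch built with CUDA: {torch_cuda}")
    print(f"nvcc toolkit release:    {nvcc_release}")

    if torch_cuda is None or nvcc_release is None:
        print("Could not determine one of the versions.")
    elif same_major(torch_cuda, nvcc_release):
        print("Major versions match.")
    else:
        print("Major versions differ; extension builds may fail.")
```

A matching major version does not guarantee that fused_adam will compile; it only rules out one possible cause. DeepSpeed also ships a `ds_report` command that prints an environment and op compatibility summary, which may be a quicker first check.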

Since we can currently train on an A100 GPU without DeepSpeed, we are putting this issue on hold.