fabianlim opened this issue 1 month ago (Open)
We found there is really no need to upper bound this torch dependency; for us, we were getting stuck only because of this commit: https://github.com/pytorch/pytorch/pull/121635 (`nvidia-nccl-cu12==2.19.3`).

Update: this turned out to be due to a wrong `NCCL_BUFFSIZE` setting. Also, for transformers we just have to be wary of the SDPA sliding window mask issue, and keep track of it to see when it will be fixed.
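For reference, a minimal sketch of how one might set `NCCL_BUFFSIZE` before initializing the distributed process group; the value shown is just the NCCL default (4 MiB), not necessarily the setting that fixed it for us:

```python
import os

import torch.distributed as dist

# NCCL reads NCCL_BUFFSIZE (in bytes) at initialization, so it must be set
# before the process group is created. 4194304 (4 MiB) is the NCCL default;
# the value here is only illustrative.
os.environ.setdefault("NCCL_BUFFSIZE", "4194304")

dist.init_process_group(backend="nccl")
```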
FMS has fixed the TRL issue https://github.com/foundation-model-stack/fms-hf-tuning/pull/213
I think we need a lower limit on the `bitsandbytes` version that supports `quant_storage`. I have encountered that `0.41` didn't work, but `0.43` is ok.
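For context, a minimal sketch of the kind of usage that needs `quant_storage` support; the `bnb_4bit_quant_storage` argument of `BitsAndBytesConfig` is what requires a recent `bitsandbytes`, and the model id and dtypes below are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch only: bnb_4bit_quant_storage stores the packed 4-bit weights in a
# dtype that FSDP can shard; this argument is the feature that needs a
# sufficiently new bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,  # placeholder dtype for illustration
)

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",  # placeholder model id
    quantization_config=bnb_config,
)
```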
Currently the `torch` dependency in framework is upper bounded as `"< 2.3"`; however, newer `accelerate` versions have problems supporting `torch` `2.2`. The latest `numpy` versions (`>= 2.0`) also have incompatibilities with the current `torch` version and are bounded in #42. Hence, we should consider lifting the upper bound soon. We can also consider lifting the upper limits on `transformers` and `accelerate`.
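As a rough sanity check while the pins are being relaxed, here is a minimal sketch that compares installed versions against the bounds discussed above; the bounds in the dict are illustrative only, and the real constraints live in the package metadata:

```python
from importlib.metadata import version

from packaging.version import Version

# Illustrative bounds only, mirroring the discussion above.
BOUNDS = {
    "torch": (Version("2.2.0"), Version("2.3.0")),   # current upper bound "< 2.3"
    "numpy": (Version("1.24.0"), Version("2.0.0")),  # numpy >= 2.0 is incompatible
    "bitsandbytes": (Version("0.43.0"), None),       # needs quant_storage support
}

for pkg, (lower, upper) in BOUNDS.items():
    installed = Version(version(pkg))
    within = installed >= lower and (upper is None or installed < upper)
    bound_str = f">= {lower}" + (f", < {upper}" if upper else "")
    print(f"{pkg}: installed {installed}, expected {bound_str} -> {'OK' if within else 'MISMATCH'}")
```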