AnswerDotAI / fsdp_qlora

Training LLMs with QLoRA + FSDP
Apache License 2.0

Bugs when fine-tuning with FSDP multi-node #26

Open · batman-do opened this issue 7 months ago

batman-do commented 7 months ago

[screenshot of the error] How do I fix this?

KeremTurgutlu commented 7 months ago

Can you share the training command you used with its full arguments, and also provide the versions of the following libraries:

accelerate
bitsandbytes
datasets
hqq
hqq-aten
huggingface-hub
llama-recipes
peft
safetensors
tokenizers
torch
transformers

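(Not part of the original reply:) one quick way to gather exactly these versions is a small snippet using importlib.metadata; the names below follow the list above and are assumed to be the PyPI distribution names.

```python
# Minimal sketch for collecting the requested library versions.
from importlib.metadata import version, PackageNotFoundError

packages = [
    "accelerate", "bitsandbytes", "datasets", "hqq", "hqq-aten",
    "huggingface-hub", "llama-recipes", "peft", "safetensors",
    "tokenizers", "torch", "transformers",
]
for name in packages:
    try:
        print(f"{name}=={version(name)}")
    except PackageNotFoundError:
        print(f"{name}: not installed")
```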
You are likely using an older version of bitsandbytes; the quant_storage argument was introduced in https://github.com/TimDettmers/bitsandbytes/commit/dcfb6f81433e37a8546f7dab3f648eaf858b29ff.
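As a sanity check (again, not from the original reply), you can confirm whether your installed bitsandbytes already exposes quant_storage by inspecting Linear4bit's signature:

```python
import inspect
import bitsandbytes as bnb

print("bitsandbytes version:", bnb.__version__)

# Newer bitsandbytes releases add a `quant_storage` parameter to Linear4bit;
# if it is missing here, the install predates the commit linked above.
params = inspect.signature(bnb.nn.Linear4bit.__init__).parameters
print("has quant_storage:", "quant_storage" in params)
```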

Try pip install -U bitsandbytes and retry. Also, for multi-node training, make sure every node has the up-to-date bitsandbytes version, ideally by using the same environment on all nodes.
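For context, a minimal sketch of how quant_storage is typically passed once a recent bitsandbytes is installed; the layer sizes and dtypes here are illustrative assumptions, not taken from the repo's training script.

```python
import torch
import bitsandbytes as bnb

# `quant_storage` controls the dtype used to hold the packed 4-bit weights.
# When sharding quantized layers with FSDP it is usually set to match the
# compute / flat-parameter dtype (e.g. bfloat16).
layer = bnb.nn.Linear4bit(
    4096, 4096,                        # illustrative in/out features
    bias=False,
    compute_dtype=torch.bfloat16,
    quant_type="nf4",
    quant_storage=torch.bfloat16,      # requires a bitsandbytes version with this argument
)
```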