Closed · Hisarlik closed this 3 months ago

Hello, you have done an incredible job. I've read in the README file that QLoRA and LoRA integration is not well-tested. I tried the getting-started example for embeddings, adding QLoRA and LoRA, and the result is always:

TypeError: GemmaModel.forward() got an unexpected keyword argument 'labels'

I've tried other models and the output is the same.
Does it work without QLoRA/LoRA? This seems like a problem with the labels kwarg being passed to forward. Which attn setting are you using?
I've tried cccc, bb and cc, with tiny-mistral as well as other models like Gemma 2B. This is the command I'm running:
torchrun --nproc_per_node 1 -m training.run --output_dir test_path --model_name_or_path openaccess-ai-collective/tiny-mistral --train_data training/toy_data/toy_data_embedding.jsonl --learning_rate 1e-5 --num_train_epochs 5 --per_device_train_batch_size 2 --dataloader_drop_last True --normalized True --temperature 0.02 --query_max_len 32 --passage_max_len 128 --train_group_size 2 --mode embedding --attn cccc --lora True
> Does it work without QLoRA/LoRA?
Yes, sorry. It works without LoRA.
I think it is because LoRA rewraps the model, and the labels kwarg is then not passed through correctly, or something along those lines. I don't have time to debug it at the moment, but it shouldn't be too complicated. It would be amazing if you opened a PR in case you fix it :)
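Roughly what I suspect, as an untested sketch (using the tiny test checkpoint from the command above, not the actual gritlm code path; the LoRA values are placeholders):

```python
# Untested sketch of the suspected failure mode: the error message shows the
# bare GemmaModel (not a ForCausalLM head) is receiving a `labels` kwarg.
# If the LoRA config sets task_type=CAUSAL_LM, PEFT picks the causal-LM
# wrapper, whose forward() defines labels=None and forwards it to the wrapped
# model, which has no such parameter.
import torch
from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModel.from_pretrained("openaccess-ai-collective/tiny-mistral")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # suspect: selects the causal-LM wrapper
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base, lora_config)

# Expected to raise:
# TypeError: MistralModel.forward() got an unexpected keyword argument 'labels'
model(input_ids=torch.tensor([[1, 2, 3]]))
```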
Of course, I'll give it a try. Congratulations again on your work.
I have created a PR: https://github.com/ContextualAI/gritlm/pull/21
I have tested it with the embedding example. I haven't found the root cause of this behaviour in the PEFT library, but removing the specific task parameter makes the training work.
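For reference, assuming the "task parameter" here is `task_type` on PEFT's LoraConfig (the other values below are placeholders, not gritlm's exact settings), the change is along these lines:

```python
# Sketch of the change, assuming the culprit is `task_type` on LoraConfig.
from peft import LoraConfig, TaskType

# Before: task_type=CAUSAL_LM makes PEFT use the PeftModelForCausalLM
# wrapper, whose forward() always passes a `labels` kwarg down to the
# wrapped model, even when labels is None.
lora_config_before = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
)

# After: with task_type omitted, PEFT falls back to the generic PeftModel
# wrapper, which only forwards the kwargs it actually receives, so the
# bare GemmaModel never sees an unexpected `labels` argument.
lora_config_after = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
)
```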