ContextualAI / gritlm

Generative Representational Instruction Tuning
https://arxiv.org/abs/2402.09906
MIT License

QLoRA + Mistral #29

Closed: raghavlite closed this issue 2 months ago

raghavlite commented 2 months ago

I was trying QLoRA with Mistral and hit the error shown below.

    -m training.run \
    --output_dir /usr/project/xtmp/rt195/Sentence_Embedding/F5/gritlm/data/m7_temp \
    --model_name_or_path mistralai/Mistral-7B-v0.1 \
    --train_data /usr/project/xtmp/rt195/Sentence_Embedding/F5/gritlm/data/MEDI2/allnli.jsonl \
    --learning_rate 2e-5 \
    --lr_scheduler_type linear \
    --warmup_ratio 0.03 \
    --max_steps 1253 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 5 \
    --dataloader_drop_last \
    --normalized \
    --temperature 0.02 \
    --train_group_size 2 \
    --negatives_cross_device \
    --query_max_len 256 \
    --passage_max_len 2048 \
    --mode embedding \
    --logging_steps 1 \
    --bf16 \
    --pooling_method mean \
    --attn cccc \
    --attn_implementation sdpa \
    --save_steps 5000 \
    --gradient_checkpointing \
    --qlora
[Screenshot: error traceback, 2024-04-19]

If I change the command to --lora instead of --qlora, it works.
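
For context, my understanding is that a --qlora path typically follows the standard peft + bitsandbytes recipe, roughly as sketched below. This is my own minimal reconstruction, not the repo's actual code; the model name matches my command, but the LoRA hyperparameters and target modules are illustrative guesses.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # 4-bit NF4 quantization config, the usual QLoRA setup.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # quantize base weights to 4-bit
        bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA default
        bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
        bnb_4bit_compute_dtype=torch.bfloat16,  # matches --bf16 in the command above
    )

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1",
        quantization_config=bnb_config,
    )

    # Needed before training a k-bit model (casts norms to fp32,
    # enables input grads so --gradient_checkpointing works).
    model = prepare_model_for_kbit_training(model)

    # Illustrative LoRA settings; the repo's defaults may differ.
    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)

So my guess is the failure is somewhere in this quantized loading/preparation step, since the plain LoRA path (no bitsandbytes) runs fine.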

Any help is appreciated. Also, which versions of peft and bitsandbytes are you using?
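
For comparison, here is the quick snippet I use to print the installed versions on my end (assumes Python 3.8+ for importlib.metadata):

    from importlib.metadata import PackageNotFoundError, version

    # Print versions of the packages most relevant to the QLoRA path.
    for pkg in ("peft", "bitsandbytes", "transformers", "accelerate"):
        try:
            print(pkg, version(pkg))
        except PackageNotFoundError:
            print(pkg, "not installed")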

Muennighoff commented 2 months ago

Hm, I haven't tried the LoRA & QLoRA integrations.

raghavlite commented 2 months ago

Thanks