FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

Am I fine-tuning gemma-2b or bge-reranker-v2-gemma? #1019

Open dayuyang1999 opened 3 months ago

dayuyang1999 commented 3 months ago

Dear authors,

Great work, thanks for sharing.

I am trying to fine-tune bge-reranker-v2-gemma using my own dataset.

However, according to the official fine-tuning command provided:

```shell
torchrun --nproc_per_node {number of gpus} \
-m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
--output_dir {path to save model} \
--model_name_or_path google/gemma-2b \
--train_data ./toy_finetune_data.jsonl \
--learning_rate 2e-4 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--dataloader_drop_last True \
--query_max_len 512 \
--passage_max_len 512 \
--train_group_size 16 \
--logging_steps 1 \
--save_steps 2000 \
--save_total_limit 50 \
--ddp_find_unused_parameters False \
--gradient_checkpointing \
--deepspeed stage1.json \
--warmup_ratio 0.1 \
--bf16 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--use_flash_attn True \
--target_modules q_proj k_proj v_proj o_proj
```

Why is model_name_or_path set to google/gemma-2b instead of bge-reranker-v2-gemma? Which model am I actually fine-tuning with this command?

To be clear, I want my fine-tuned model to further improve performance on a specific reranking task on top of bge-reranker-v2-gemma, not to train gemma-2b from scratch.

545999961 commented 3 months ago

This will fine-tune google/gemma-2b. If you want to fine-tune bge-reranker-v2-gemma, just set model_name_or_path to bge-reranker-v2-gemma.
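For example, here is a minimal sketch of the same command with only the starting checkpoint swapped out. The Hugging Face model ID BAAI/bge-reranker-v2-gemma and the output path are assumptions to adjust for your own setup; every other flag is copied from the command above.

```shell
# Sketch (not an official recipe): continue LoRA fine-tuning from the released
# reranker checkpoint instead of the raw gemma-2b base model.
# BAAI/bge-reranker-v2-gemma and the output directory are assumed values; edit as needed.
torchrun --nproc_per_node {number of gpus} \
-m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
--output_dir ./bge-reranker-v2-gemma-finetuned \
--model_name_or_path BAAI/bge-reranker-v2-gemma \
--train_data ./toy_finetune_data.jsonl \
--learning_rate 2e-4 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--dataloader_drop_last True \
--query_max_len 512 \
--passage_max_len 512 \
--train_group_size 16 \
--logging_steps 1 \
--save_steps 2000 \
--save_total_limit 50 \
--ddp_find_unused_parameters False \
--gradient_checkpointing \
--deepspeed stage1.json \
--warmup_ratio 0.1 \
--bf16 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--use_flash_attn True \
--target_modules q_proj k_proj v_proj o_proj
```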

dayuyang1999 commented 3 months ago

> This will fine-tune google/gemma-2b. If you want to fine-tune bge-reranker-v2-gemma, just set model_name_or_path to bge-reranker-v2-gemma.

Nice, I'll set model_name_or_path to bge-reranker-v2-gemma.

What about the other arguments? Should I keep all of them the same?

545999961 commented 3 months ago

> This will fine-tune google/gemma-2b. If you want to fine-tune bge-reranker-v2-gemma, just set model_name_or_path to bge-reranker-v2-gemma.
>
> Nice, I'll set model_name_or_path to bge-reranker-v2-gemma.
>
> What about the other arguments? Should I keep all of them the same?

If you have specific requirements, such as a larger batch size or more negatives, you can modify the other arguments accordingly. Alternatively, you can leave all other arguments at their default settings.
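As a rough illustration of that advice, the sketch below changes only the flags most directly tied to batch size and the number of negatives. The values are placeholders rather than recommended settings, the data path is hypothetical, and any flag not shown keeps the value from the command earlier in the thread.

```shell
# Illustrative values only. train_group_size is the number of passages scored per
# query (one positive plus negatives), so raising it assumes your data file provides
# enough negatives per example; the batch-size numbers are likewise placeholders.
torchrun --nproc_per_node {number of gpus} \
-m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
--output_dir ./bge-reranker-v2-gemma-finetuned \
--model_name_or_path BAAI/bge-reranker-v2-gemma \
--train_data ./your_finetune_data.jsonl \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 32 \
--train_group_size 32 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--bf16 \
--gradient_checkpointing \
--deepspeed stage1.json
```

Note that the effective batch size is per_device_train_batch_size × gradient_accumulation_steps × the number of GPUs, so either flag (or the GPU count) can be used to scale it up.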

adol001 commented 3 months ago

@545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?

545999961 commented 3 months ago

> @545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?

bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code for it in the future.

dayuyang1999 commented 3 months ago

> @545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?
>
> bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code for it in the future.

An additional question about how bge-reranker-v2-gemma itself was trained:

Was bge-reranker-v2-gemma fine-tuned with LoRA on top of google/gemma-2b?

545999961 commented 3 months ago

> @545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?
>
> bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code for it in the future.
>
> An additional question about how bge-reranker-v2-gemma itself was trained:
>
> Was bge-reranker-v2-gemma fine-tuned with LoRA on top of google/gemma-2b?

Yes.

Ted8000 commented 3 months ago

> @545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?
>
> bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code for it in the future.

Hello, may I ask when this fine-tuning code will be released? Thanks.

aisen-x commented 3 months ago

I am fine-tuning bge-reranker-v2-gemma on a dataset of 300k query-document (QD) pairs from papers, and the loss is decreasing very slowly. I currently have num_train_epochs=1. Do I need to train for more epochs? At about 60% of training progress the loss has dropped from 1.5 to 0.98, and I expect it will still be fairly high, around 0.7, once this epoch finishes. Should I train for one more epoch?

ffxmm commented 3 months ago

> @545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?
>
> bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code for it in the future.
>
> Hello, may I ask when this fine-tuning code will be released? Thanks.

I have the same question too. Please post a note here when it is released, thanks.