CASIA-LM / MoDS


a Chinese language question #3

Open · nuoma opened this issue 6 months ago

nuoma commented 6 months ago

Hi, if I want to apply the diversity selection methodology to a Chinese SFT dataset (say, Alpaca-cn-gpt4), can I simply change the model to a Chinese BERT (https://huggingface.co/bert-base-chinese/tree/main or https://huggingface.co/hfl/chinese-bert-wwm-ext/tree/main)? Is this the correct way?
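For later readers, here is a minimal sketch of what swapping in bert-base-chinese for the diversity-selection embeddings could look like. The mean-pooling choice and the `encode_texts` helper are illustrative assumptions, not the exact MoDS code.

```python
# Illustrative sketch (not the MoDS implementation): encode Chinese instructions
# with bert-base-chinese and mean-pool the last hidden state into one vector each.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-chinese"  # or "hfl/chinese-bert-wwm-ext"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def encode_texts(texts, max_length=512):
    """Return one mean-pooled embedding per input text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    hidden = model(**batch).last_hidden_state             # (batch, seq, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (batch, hidden)

vectors = encode_texts(["把下面的句子翻译成英文。", "写一首关于春天的诗。"])
print(vectors.shape)  # torch.Size([2, 768])
```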

stainswei commented 6 months ago

I tried to use moss-rlhf-reward-model-7B-zh as the reward model for the quality evaluation on a Chinese dataset, but I found the results were not good. May I ask which Chinese reward model works well?

nuoma commented 6 months ago

I'm not quite sure about the quality eval part. I know this is incorrect, but I still ran the original English reward model and got results like this. I'm assuming moss is not a very high-quality Chinese reward model.

[image attachment: reward model results]
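For reference, a minimal sketch of how one could score an (instruction, response) pair with the English reward model mentioned here, assuming it is OpenAssistant/reward-model-deberta-v3-large-v2 (as the later comments suggest) and following the pairwise usage from its model card; the `score_pair` helper is just for illustration.

```python
# Illustrative sketch: score an (instruction, response) pair with the English
# reward model OpenAssistant/reward-model-deberta-v3-large-v2.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "OpenAssistant/reward-model-deberta-v3-large-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def score_pair(instruction: str, response: str) -> float:
    """Higher logit means the reward model rates the response as better."""
    inputs = tokenizer(instruction, response, truncation=True,
                       max_length=512, return_tensors="pt")
    return model(**inputs).logits[0].item()

print(score_pair("What is the capital of France?",
                 "The capital of France is Paris."))
```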

MAxx8371 commented 4 months ago

> I'm not quite sure about the quality eval part. I know this is incorrect, but I still ran the original English reward model and got results like this. I'm assuming moss is not a very high-quality Chinese reward model.

@nuoma The max_length of reward-model-deberta-v3-large-v2 is 512, but some samples in Alpaca-cn-gpt4 are much longer than that. Does that have an impact on the quality eval part?

nuoma commented 4 months ago

> @nuoma The max_length of reward-model-deberta-v3-large-v2 is 512, but some samples in Alpaca-cn-gpt4 are much longer than that. Does that have an impact on the quality eval part?

My guess is yes, it will definitely impact the quality eval.
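To make the length concern concrete, here is a small check one could run before scoring: count how many Alpaca-cn-gpt4 samples exceed the 512-token limit, assuming the pair is fed to the tokenizer as (instruction, response); the field names and the `exceeds_limit` helper are assumptions for illustration. Anything over the limit is silently cut off when truncation is enabled, so the reward score only reflects the visible prefix.

```python
# Illustrative sketch: count how many (instruction, response) pairs exceed the
# 512-token limit of reward-model-deberta-v3-large-v2 and would be truncated.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "OpenAssistant/reward-model-deberta-v3-large-v2")

def exceeds_limit(instruction: str, response: str, limit: int = 512) -> bool:
    ids = tokenizer(instruction, response, truncation=False)["input_ids"]
    return len(ids) > limit

# `samples` stands in for records loaded from Alpaca-cn-gpt4 (field names assumed).
samples = [{"instruction": "解释什么是机器学习。", "output": "机器学习是一种..."}]
n_over = sum(exceeds_limit(s["instruction"], s["output"]) for s in samples)
print(f"{n_over}/{len(samples)} samples exceed 512 tokens and would be truncated")
```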

MAxx8371 commented 4 months ago

@nuoma Thank you for your response. So in your experiment, you ran the reward model without any handling of the input length? Does this method (MoDS) work well in your scenario?

nuoma commented 4 months ago

> @nuoma Thank you for your response. So in your experiment, you ran the reward model without any handling of the input length? Does this method (MoDS) work well in your scenario?

My apologies, but I didn't run a detailed experiment on this. The main reason is that there aren't many Chinese reward models to choose from, only https://huggingface.co/Ablustrund/moss-rlhf-reward-model-7B-zh, and a quick experiment with it didn't give me satisfying results (subjective evaluation, no supporting evidence).

Haozhe-Xing commented 3 months ago

@stainswei Same question here: have you found a better Chinese reward model?

Go4miii commented 2 months ago

> Hi, if I want to apply the diversity selection methodology to a Chinese SFT dataset (say, Alpaca-cn-gpt4), can I simply change the model to a Chinese BERT (https://huggingface.co/bert-base-chinese/tree/main or https://huggingface.co/hfl/chinese-bert-wwm-ext/tree/main)? Is this the correct way?

After you tried these Chinese-capable models, how well did they work for you?

Go4miii commented 2 months ago

Or have you found a suitable Chinese-capable model?

nuoma commented 1 month ago

Subjectively, the results from those two are OK and meet the diversity requirement. Switching to one of the currently popular embedding models is also a good choice, though I haven't tried that myself. Anything that can accurately convert the text into vectors will do.
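For completeness, a minimal sketch of the "popular embedding model" alternative using sentence-transformers; the specific model (BAAI/bge-large-zh-v1.5) and settings are assumptions, not something tested in this thread.

```python
# Illustrative sketch: sentence vectors for diversity selection via a
# general-purpose Chinese embedding model (model choice is an assumption).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-zh-v1.5")
texts = ["把下面的句子翻译成英文。", "写一首关于春天的诗。"]
vectors = model.encode(texts, batch_size=32, normalize_embeddings=True)
print(vectors.shape)  # e.g. (2, 1024) for this model
```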
