X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

Why are the results not the same? #78

Closed kkeyong closed 1 year ago

kkeyong commented 1 year ago

I tried giving the same input to three different platforms:

1. ModelScope: https://modelscope.cn/studios/damo/mPLUG-Owl-Bilingual/summary
2. A local demo using the weights from https://huggingface.co/MAGAer13/mplug-owl-bloomz-7b-multilingual
3. The HF demo: https://huggingface.co/spaces/MAGAer13/mPLUG-Owl

but the results are not the same. The performance ranking is 1 > 3 > 2. What is the difference between the weights?

Update: I set all the generation settings to the same values: `"top_k": 1, "top_p": 0.1, "num_beams": 1, "no_repeat_ngram_size": 2, "length_penalty": 1, "do_sample": false, "temperature": 0.1, "max_new_tokens": 512`
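A side note on those settings: with `do_sample: false` and `num_beams: 1`, generation reduces to greedy argmax decoding, so `temperature`, `top_p`, and `top_k` should not change the output at all; any remaining differences must come from the weights or the preprocessing, not the sampler. A minimal pure-Python sketch of why:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits, temperature=1.0):
    """Greedy decoding: take the argmax token, as when do_sample=False
    and num_beams=1."""
    probs = softmax(logits, temperature)
    return max(range(len(probs)), key=probs.__getitem__)

logits = [2.0, 0.5, -1.0, 1.9]
# The argmax is invariant to (positive) temperature scaling,
# so greedy decoding ignores temperature entirely:
assert greedy_pick(logits, 0.1) == greedy_pick(logits, 1.0) == 0
```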

MAGAer13 commented 1 year ago

Can you describe this in more detail?

kkeyong commented 1 year ago

> Can you describe this in more detail?

I found that the No. 2 language model is BLOOM, while the No. 1 and No. 3 language models are LLaMA. How can I use the No. 1 model on Hugging Face?

MAGAer13 commented 1 year ago

The demo on Hugging Face (No. 3) is trained on video data. The weights in the ModelScope demo (No. 1) are identical to the ones we provide on the Hugging Face model hub; see 'MAGAer13/mplug-owl-llama-7b'.

And the BLOOM checkpoints (No. 1 and No. 2) are the same.

kkeyong commented 1 year ago

> The demo on Hugging Face (No. 3) is trained on video data. The weights in the ModelScope demo (No. 1) are identical to the ones we provide on the Hugging Face model hub; see 'MAGAer13/mplug-owl-llama-7b'.
>
> And the BLOOM checkpoints (No. 1 and No. 2) are the same.

Oh, I made a mistake: the No. 1 link is https://modelscope.cn/studios/damo/mPLUG-Owl/summary. How can I use the No. 1 model on Hugging Face?
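For reference, a minimal sketch of fetching the checkpoint the maintainer pointed to ('MAGAer13/mplug-owl-llama-7b') from the Hugging Face hub. This assumes the `huggingface_hub` package is installed and network access is available; actual inference then goes through the model code in this repository, not plain `transformers`.

```python
# Sketch: download the mPLUG-Owl (LLaMA-7B) checkpoint that, per the
# maintainer's reply, matches the ModelScope demo weights.
REPO_ID = "MAGAer13/mplug-owl-llama-7b"

def fetch_checkpoint(repo_id: str = REPO_ID) -> str:
    """Return a local directory containing the checkpoint files.

    Assumption: the `huggingface_hub` package is installed; the import
    is deferred so this module loads even without it.
    """
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id)

if __name__ == "__main__":
    # The download is several GB, so only run this deliberately.
    print(f"Would fetch {REPO_ID} via huggingface_hub.snapshot_download")
```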