zhuyiche / llava-phi


The link to the model weights is now unavailable. #9

Open BeachWang opened 9 months ago

BeachWang commented 9 months ago

Hi,

I tried to reproduce your work with the released code. The losses in pretraining and finetuning meet expectations, but I get ridiculous answers in eval. For example:

Q: Is there a snowboard in the image? Answer the question using a single word or phrase.

A: 50

So could you upload the model weights to help me debug? Thank you very much!

clarencerat commented 9 months ago

Yeah, the model weights are not available. Authors of the repo, please release them; I am excited to see this. (screenshot from 2024-02-06 attached)

BeachWang commented 9 months ago

I find that the Hugging Face link is available, but when I test the weights on MME I only get a score of 1110.96.

(screenshot of the MME results, 2024-02-06, attached)
BeachWang commented 9 months ago

The bug may be in preprocess_v0 in train.py. It should be

# instruction_len = len(tokenizer(parts[0]).input_ids)
instruction_len = len(tokenizer(parts[0]).input_ids) - 1

since the trailing space gets its own token when parts[0] is tokenized on its own, but it is merged with the following content when the full conversation (target) is tokenized.
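
To make the off-by-one concrete, here is a minimal, self-contained sketch. It is not the actual preprocess_v0 code: the conversation string, the " ASSISTANT: " separator, and the microsoft/phi-1_5 tokenizer are stand-ins chosen only for illustration. It shows why tokenizing parts[0] on its own yields one extra token for the trailing space, and why subtracting 1 keeps the label mask from also covering the first answer token.

import torch
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")  # stand-in tokenizer

conversation = "USER: Is there a snowboard in the image? ASSISTANT: No"
sep = " ASSISTANT: "

# The full conversation is tokenized once; the labels (target) are built from it.
input_ids = tokenizer(conversation).input_ids
target = torch.tensor(input_ids)

# Split off the instruction part and re-attach the separator, so that
# parts[0] ends with a trailing space.
parts = conversation.split(sep)
parts[0] += sep

# Tokenized alone, the trailing space becomes its own token, so this
# over-counts the instruction length by one:
wrong_len = len(tokenizer(parts[0]).input_ids)

# Inside the full conversation the space merges with the first answer token
# ("No"), so the correct count is one less:
instruction_len = len(tokenizer(parts[0]).input_ids) - 1

# Mask the instruction tokens so the loss is computed only on the answer.
# With wrong_len, the first answer token would be masked out as well.
target[:instruction_len] = IGNORE_INDEX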

yhcao6 commented 8 months ago

I also got 1110.96 on MME. Have you fixed the performance, sir?