GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
https://huggingface.co/spaces/ethanchern/Anole

The effect is very poor after training #31

Open coder4nlp opened 3 months ago

coder4nlp commented 3 months ago

I want to give the model Chinese dialogue ability, so I continued pre-training on top of Anole with Chinese caption data, lr=1e-5. The loss drops normally, but at inference time the answers are complete nonsense. Where should I look for the problem?

wxliii commented 2 months ago

Hi! Have you solved this problem?

coder4nlp commented 2 months ago

> Hi! Have you solved this problem?

No, I don't know why.

wxliii commented 2 months ago

> Hi! Have you solved this problem?
>
> No, I don't know why.

Have you experimented with different learning rates? I noticed that the learning rate you are using differs from the one set in the code.

xinlong-yang commented 2 months ago

Are you updating the full parameter set? I find that train.py in the repo does full fine-tuning, which means model.vqmodel is also updated. But when converting the .bin checkpoint to .pth for inference, the conversion ignores model.vqmodel, so the vqmodel weights differ between training and inference.
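A minimal sketch of one possible fix for the mismatch described above, assuming the checkpoints are plain state dicts keyed by parameter name (the key names below are illustrative, not the repo's actual ones): when building the inference .pth, carry over every fine-tuned tensor, including the `model.vqmodel.*` entries, instead of dropping them.

```python
def convert_checkpoint(finetuned: dict, base_pth: dict) -> dict:
    """Build the inference state dict from the base .pth, then overwrite
    every tensor with its fine-tuned value -- deliberately NOT skipping
    keys under "model.vqmodel", so training and inference see the same
    VQ weights."""
    converted = dict(base_pth)
    for key, tensor in finetuned.items():
        converted[key] = tensor
    return converted
```

The complementary approach is to freeze `model.vqmodel` during fine-tuning (set `requires_grad = False` on those parameters) so the original conversion script's behavior of ignoring vqmodel becomes harmless; either way, the point is that the vqmodel seen at inference must match the one the language backbone was trained against.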
