Validation loss on pretraining? [Feature request]

james20141606 commented 1 week ago

feature

Hi, I am trying to reproduce the pretraining step as described in the README. The training loss converges quite fast. I checked the logs in wandb, and they contain only the training loss. Could you add other metrics, such as validation loss and perplexity?

Thanks a lot!

mu-cai commented 4 days ago

Thanks for the question. However, I did not incorporate a validation dataset during training. Feel free to try it on your own!
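
For anyone who wants to try this, here is a minimal sketch of how a validation loss and perplexity could be aggregated over a held-out set. It is not part of the ViP-LLaVA codebase: the function name `aggregate_validation_metrics` and the `(mean_loss, num_tokens)` input format are assumptions made for illustration. It assumes each validation batch yields a mean cross-entropy loss (in nats) over its target tokens, and it uses the standard relation that perplexity is the exponential of the mean per-token loss.

```python
import math


def aggregate_validation_metrics(batch_stats):
    """Combine per-batch losses into a validation loss and perplexity.

    batch_stats: iterable of (mean_loss, num_tokens) pairs, where mean_loss
    is the mean cross-entropy (in nats) over that batch's target tokens.
    This input format is a hypothetical convention, not a ViP-LLaVA API.
    """
    total_nll = 0.0
    total_tokens = 0
    for mean_loss, num_tokens in batch_stats:
        # Weight each batch by its token count so short batches do not
        # count as much as long ones.
        total_nll += mean_loss * num_tokens
        total_tokens += num_tokens
    if total_tokens == 0:
        raise ValueError("no target tokens in the validation set")
    val_loss = total_nll / total_tokens
    # Perplexity is exp of the mean per-token negative log-likelihood.
    return val_loss, math.exp(val_loss)
```

Weighting by token count matters when batches differ in length; a plain average of batch losses would give short batches too much influence. The resulting pair could then be logged to wandb alongside the training loss.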

james20141606 commented 4 days ago

Thanks for your reply! By the way, do you have validation data in the finetuning stage?

james20141606 commented 4 days ago

I also have two additional questions that I am confused about:

1. I tried to pretrain ViP-LLaVA using either your provided data or my custom data, and both converge very fast. The loss plateaus within 5 hours on a single A100. Did that happen in your experiments as well?
2. To create a ViP-LLaVA model for a specific domain, for example satellite data, do you think we should pretrain ViP-LLaVA on satellite data and then finetune with instructions? Or is it enough to load your pretrained checkpoint and finetune on custom data? Do you have any intuitions about this? I would appreciate it a lot if you could answer my questions. Thanks!

mu-cai commented 1 day ago

Yes, LLM loss decreases very fast.
I think either works; it all depends on the quality and quantity of your data!

WisconsinAIVision / ViP-LLaVA

Validation loss on pretraining?[Feature request] #20
