haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0
19.15k stars · 2.1k forks

[Question] DAPT(Domain-Adaptation Pretraining) method #777

Open unmo opened 10 months ago

unmo commented 10 months ago

Question

Is it possible to do Domain Adaptation instead of Task Adaptation? Specifically, I want to use LLaVA as a starting checkpoint and continue training on the language and images of a new domain. Thank you.

unmo commented 9 months ago

Is it impossible, or is there some way to do DAPT?

haotian-liu commented 9 months ago

I guess LLaVA-Med can be considered as DAPT?

unmo commented 9 months ago

Stage 1 is probably what that corresponds to.

So can we use LLaVA 1.5 (not vanilla Vicuna) as a starting point for pre-training? If so, would the procedure be to generate mm_projector.bin and then use it to run finetune.sh?
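For reference, here is a minimal sketch of one way to produce an mm_projector.bin from an existing LLaVA-1.5 checkpoint, which could then be passed to finetune.sh through its pretrained-projector argument (--pretrain_mm_mlp_adapter in the 1.5 scripts, if I recall correctly). The checkpoint directory, the shard naming, and the assumption that the projector weights sit under keys containing "mm_projector" are guesses about a typical local setup, not official repo tooling.

```python
# Hypothetical sketch: extract the multimodal projector weights from a full
# LLaVA-1.5 checkpoint so they can be fed to finetune.sh as mm_projector.bin.
# Paths and key prefixes are assumptions; newer checkpoints may ship as
# safetensors instead of .bin shards.
import glob
import torch

ckpt_dir = "checkpoints/llava-v1.5-7b"              # assumed local checkpoint dir
out_path = f"{ckpt_dir}/mm_projector.bin"

projector = {}
for shard in glob.glob(f"{ckpt_dir}/pytorch_model*.bin"):
    state = torch.load(shard, map_location="cpu")
    # keep only the tensors whose names mention the projector
    projector.update({k: v for k, v in state.items() if "mm_projector" in k})

torch.save(projector, out_path)
print(f"saved {len(projector)} projector tensors to {out_path}")
```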

unmo commented 9 months ago

This example appears to start from pre-training on a large amount of domain data.

I only have a small amount of domain data, and I am not sure how training should be done in that case.

Can you please tell me what to do?
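One possible route with only a little domain data is to skip further pre-training and instead build a small instruction-style dataset in the conversation JSON format that LLaVA's fine-tuning scripts consume, then fine-tune on it (the LoRA variant of the script is usually the safer choice on a few hundred samples, since full fine-tuning tends to overfit). The sketch below follows the field layout described in the repo's custom-data documentation, but the file names, questions, and answers are placeholders.

```python
# Hypothetical sketch: turn a handful of domain (image, question, answer)
# triples into LLaVA's conversation JSON format for fine-tuning.
import json

records = [
    ("domain_images/sample_001.jpg", "What abnormality is visible?",
     "A small defect in the upper-left region."),
    ("domain_images/sample_002.jpg", "Describe this image.",
     "A normal sample with no visible defects."),
]

data = []
for idx, (image, question, answer) in enumerate(records):
    data.append({
        "id": f"domain_{idx:05d}",
        # path relative to the --image_folder passed to the training script
        "image": image,
        "conversations": [
            {"from": "human", "value": f"<image>\n{question}"},
            {"from": "gpt", "value": answer},
        ],
    })

with open("domain_finetune.json", "w") as f:
    json.dump(data, f, indent=2)
```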