Open jiinhui opened 3 months ago
For the ocrVQA training data, see this issue: https://github.com/mlpc-ucsd/BLIVA/issues/12. The paper should mention that we used the prompt "OCR tokens: {}" to add the OCR tokens directly after the question. As for bliva_llava_150k, I think it is the version of llava150k converted to single-turn chat history. Check the paper for details.
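For anyone trying to rebuild the two files locally, here is a rough Python sketch of the two steps described above. It is not the BLIVA preprocessing code. The helper names, the output keys (`question`, `answer`), the comma separator between OCR tokens, and the choice to reuse the same image for every split turn are all assumptions; only the "OCR tokens: {}" prompt and the idea of single-turn conversion come from the reply. The input layout (`conversations` entries with `from` and `value`) follows the public LLaVA instruction JSON.

```python
def add_ocr_tokens(question, ocr_tokens):
    """Append OCR tokens directly after the question.

    Uses the "OCR tokens: {}" prompt mentioned in the reply; joining the
    tokens with ", " is an assumption.
    """
    return question + " " + "OCR tokens: {}".format(", ".join(ocr_tokens))


def to_single_turn(sample):
    """Split one multi-turn LLaVA-style record into single-turn records.

    Hypothetical helper: each (human, gpt) pair becomes its own record
    that points at the same image.
    """
    turns = sample["conversations"]
    records = []
    # Walk the conversation two entries at a time: human question, gpt answer.
    for human, gpt in zip(turns[0::2], turns[1::2]):
        if human["from"] != "human" or gpt["from"] != "gpt":
            continue  # skip pairs that do not follow the expected order
        # Drop the image placeholder that LLaVA puts in the first question.
        question = human["value"].replace("<image>", "").strip()
        records.append(
            {"image": sample["image"], "question": question, "answer": gpt["value"]}
        )
    return records
```

Running `to_single_turn` over every record of the original llava150k JSON and concatenating the results would give one question/answer pair per entry, which may or may not match the layout of bliva_llava_150k.json.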
I can't find the details about converting llava150k to single-turn chat in your paper. I will try to review InstructBLIP for more details about the dataset.
I can't find the training data files "BLIVA/bliva/data/llava/bliva_llava_150k.json" and "BLIVA/bliva/data/ocrVQA/cleaned_train_dataset.json". Can you tell me how to download them? Thanks!