Check out our datasets; I think they might be useful for training models like this.

wendlerc commented 8 months ago

We created some large-scale multimodal datasets that contain OCR annotations. For some of them, we ran PaddleOCR over LAION images.

Do you think those might be useful for tuning your method?

Best, Chris

HAWLYQ commented 3 months ago

Hi @wendlerc, great work! We will consider utilizing these datasets in our next work!

wendlerc commented 3 months ago

Let me know when you need access to the LAION datasets. I have set them to private for now.
