Closed YangYangGirl closed 1 year ago
Thank you for your interest in our work. We find that the original captions generated by various vision foundation models can be noisy (which is why we use human annotation), so we may not release the original captions for this version.
Congrats on the excellent work. Can you release the original captions from which you generate questions? It will be helpful to the community.