Closed by ZCMax 6 months ago
To evaluate on datasets such as Multi3DRefer and SQA3D, we add their train splits to the training data. In v2.1, we train the model in a single joint-training stage, so we removed some data that was only needed for the object-level/scene-level alignment proposed in our paper. This leaves a lot of room to explore including more high-quality datasets during training.
So you removed the generated datasets in v2.1 and only use the existing human-annotated 3D-VL datasets?
Yes.
Thanks for your great work! May I know whether you made any dataset updates for version 2.1?