VITA-Group / Diffusion4D

"Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models", Hanwen Liang*, Yuyang Yin*, Dejia Xu, Hanxue Liang, Zhangyang Wang, Konstantinos N. Plataniotis, Yao Zhao, Yunchao Wei
https://vita-group.github.io/Diffusion4D/

Questions on curated high-quality subsets and their captions #11

Closed 1Konny closed 2 months ago

1Konny commented 2 months ago

Hi! Thanks for your great work and contribution! The list and metadata you provided for Objaverse-XL are very helpful, and I'm confident they will save a lot of time for researchers building projects on Objaverse-XL.

I have two questions regarding the curated high-quality subsets and their corresponding captions.

The paper mentions that 54K high-quality animated assets were ultimately chosen for training the models. Where can I get the list of these curated objects? As far as I can tell, meta_xl_animation_tot.csv and meta_xl_tot.csv list, respectively, all animated objects from Objaverse-XL and all successfully rendered objects from Objaverse-XL's GitHub subset. I also found that this file (https://github.com/VITA-Group/Diffusion4D/blob/main/rendering/src/ObjV1_curated.txt) is the curated object list from Objaverse, consisting of 11K objects.

The paper also states that captions from Cap3D are used for training. However, the uids in Cap3D are 32 characters long, whereas the identifiers in your csv files are 64-character SHA256 strings, so the two do not match. Could you point me to a good way to get the right captions for your curated lists?

Thanks!

(This is a duplicate post from https://huggingface.co/datasets/hw-liang/Diffusion4D/discussions/13)

hw-liang commented 2 months ago

Hi, thank you very much for your interest and your recognition of our work.

Your understanding of meta_xl_animation_tot.csv and meta_xl_tot.csv is correct. They contain the ids of all the successfully rendered assets from the obj-xl dataset. The curated v1 list is saved at rendering/src/ObjV1_curated.txt and at https://huggingface.co/datasets/hw-liang/Diffusion4D/tree/main/objaverse1.0_curated. The curated xl list can be found at https://huggingface.co/datasets/hw-liang/Diffusion4D/tree/main/objaverseXL_curated. We also did not find an id correspondence for the obj-xl dataset in Cap3D. Thus, for text-to-4D, we trained on assets from obj-v1 with their captions from Cap3D.
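
For readers who want to pair the curated obj-v1 list with Cap3D captions as described above, here is a minimal sketch. It is not the authors' pipeline. It assumes the curated list holds one uid per line and that the caption file is a headerless CSV with rows of the form uid,caption; both layouts are assumptions and should be checked against the actual files. The function names (id_kind, load_captions, match_captions) and the sample ids and captions are hypothetical.

```python
import csv
import io
import re

# Assumption: Cap3D-style uids are 32 hex characters, while the ids in the
# obj-xl csv files are 64-character SHA256 hex strings (as noted in the thread).
HEX32 = re.compile(r"^[0-9a-f]{32}$")
HEX64 = re.compile(r"^[0-9a-f]{64}$")


def id_kind(identifier):
    """Classify an id by its length and character set."""
    s = identifier.strip().lower()
    if HEX32.match(s):
        return "uid32"
    if HEX64.match(s):
        return "sha256"
    return "unknown"


def load_captions(csv_lines):
    """Build a uid -> caption dict from headerless (uid, caption) rows."""
    captions = {}
    for row in csv.reader(csv_lines):
        if len(row) >= 2:
            captions[row[0].strip()] = row[1].strip()
    return captions


def match_captions(curated_lines, captions):
    """Split curated uids into those with a caption and those without."""
    matched, missing = {}, []
    for line in curated_lines:
        uid = line.strip()
        if not uid:
            continue  # skip blank lines
        if uid in captions:
            matched[uid] = captions[uid]
        else:
            missing.append(uid)
    return matched, missing


if __name__ == "__main__":
    # Made-up sample data; replace with open(...) on the real files.
    caption_file = io.StringIO("a" * 32 + ",a walking robot\n")
    curated_file = io.StringIO("a" * 32 + "\n" + "b" * 32 + "\n")
    matched, missing = match_captions(curated_file, load_captions(caption_file))
    print(len(matched), "matched;", len(missing), "without caption")
```

Running id_kind over a column of ids is a quick way to confirm whether a file uses the 32-character uids or the 64-character SHA256 strings before attempting a join; a SHA256 id will simply end up in the missing list.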

1Konny commented 2 months ago

Thank you very much for the clarification!