MoyGcc / vid2avatar

Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR2023)
https://moygcc.github.io/vid2avatar/

Setting up training dataset #51

Open leobcc opened 1 year ago

leobcc commented 1 year ago

I am having trouble understanding how to set up the dataset, in particular how the training data should be laid out in the folders. Specifically, I would like to reproduce the results on the 3DPW dataset that you mention in your paper. I have downloaded the dataset and extracted the imageFiles and the sequenceFiles. I have been trying to work out from the code how the dataloader is set up, but I still have some doubts. Before spending time on a wrong approach to training the model on this dataset, I thought I would ask for some clarification.

Would you mind giving some more insight on how to proceed? In particular: where exactly to put the folders, whether to keep them separated into train, validation and test, and whether the preprocessing needs to be run for this dataset.

Thank you in advance, I hope the question is not too obvious or out of place.

MoyGcc commented 1 year ago

Hi, the 3DPW split doesn't matter here. Our method is not a generalizable human reconstruction method but a purely video-based one, meaning that it needs a full video sequence as input. Because of the computational time involved, we evaluated only the outdoors_fencing_01 sequence and compared it with the other baselines.

No special steps are needed: just treat the video as an internet video and follow the steps listed at https://github.com/MoyGcc/vid2avatar#play-on-custom-video to run training. Note that the reference poses provided in 3DPW are not always accurate, so it's recommended to do the evaluation on the provided SynWild dataset instead.
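As a rough illustration of treating one 3DPW sequence as a single video regardless of its split, the sketch below gathers the frames of one sequence in order. It assumes the extracted imageFiles folder contains one subfolder per sequence holding zero-padded frame images; the function name `collect_sequence_frames` and its parameters are hypothetical and not part of the vid2avatar code. The resulting frame list would still have to go through the custom-video steps linked above.

```python
from pathlib import Path


def collect_sequence_frames(image_root, sequence_name,
                            suffixes=(".jpg", ".jpeg", ".png")):
    """Return the frame paths of one sequence, sorted by file name.

    Assumes (not confirmed by the thread) a layout like
    <image_root>/<sequence_name>/<zero-padded frame name>.jpg,
    so that lexicographic order equals temporal order.
    """
    seq_dir = Path(image_root) / sequence_name
    if not seq_dir.is_dir():
        raise FileNotFoundError(f"No frame folder found at {seq_dir}")

    # Keep only image files and sort them by name.
    frames = sorted(
        p for p in seq_dir.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    if not frames:
        raise ValueError(f"No image frames found in {seq_dir}")
    return frames
```

For example, `collect_sequence_frames("imageFiles", "outdoors_fencing_01")` would return the ordered frames of the sequence mentioned above, provided the folder layout matches the assumption.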

leobcc commented 11 months ago

Hello, thank you for the clarification. As you suggested, I have obtained access to the SynWild dataset from the official page in order to run a quantitative evaluation on it. I have been given 5 different videos. To be consistent with the evaluation you presented, I am guessing that you report each metric as an average over the different videos. Is that correct?

MoyGcc commented 11 months ago

Yes, it's an average number over all sequences.
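A minimal sketch of that averaging, assuming each sequence has already been evaluated separately and yields one value per metric. The helper `average_over_sequences` and the metric and sequence names in the usage note are hypothetical. It takes an unweighted mean over sequences, which is one reading of "an average number over all sequences"; the thread does not say whether sequences are weighted by frame count.

```python
from statistics import fmean


def average_over_sequences(per_sequence):
    """Average each metric over sequences (unweighted mean).

    per_sequence maps a sequence name to a dict of metric name -> value,
    where each value is that sequence's already-computed metric.
    """
    if not per_sequence:
        raise ValueError("No sequences to average over")

    # Every sequence must report the same set of metrics.
    metric_sets = [set(metrics) for metrics in per_sequence.values()]
    names = metric_sets[0]
    if any(s != names for s in metric_sets):
        raise ValueError("All sequences must report the same metrics")

    return {
        name: fmean(metrics[name] for metrics in per_sequence.values())
        for name in sorted(names)
    }
```

With made-up inputs such as `{"seq_a": {"metric": 1.0}, "seq_b": {"metric": 3.0}}`, the function returns `{"metric": 2.0}`.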