SamsungLabs / rome

Realistic mesh-based avatars. ECCV 2022

Possible to get the list of video IDs that were used during training? #12

Closed · nlml closed 1 year ago

nlml commented 2 years ago

Hi there,

Thanks for sharing this great work!

I am looking into re-implementing the training part of your paper. Regarding VoxCeleb2, you say that:

To address these well-known limitations, we process this dataset using an off-the-shelf image quality analysis model [49] and a 3D face-alignment network [50]. We then filter out the data which has poor quality and non-diverse head rotations. Our final training dataset has ≈ 15000 sequences. We note that filtering/pruning does not fully solve the problem of head rotation bias, and our method still works best in frontal views. For more details, please refer to the supplementary materials.

Would it be possible to have a list of the ~15k sequences that were used for training? This would really make replication much easier!

If not, would you be able to share more specific details on how the training dataset was filtered with these two models? I couldn't find anything about this in the supplementary materials.
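For concreteness, the head-rotation pruning I currently have in mind looks roughly like the sketch below. Everything in it is my own assumption, not something from the paper: the `face_alignment` package standing in for your 3D face-alignment network, the jaw-based yaw heuristic, and the spread threshold.

```python
# Sketch of head-rotation pruning (my guess at the pipeline, not the paper's code).
import cv2
import numpy as np
import face_alignment  # pip install face-alignment; enum name is for v1.4+

fa = face_alignment.FaceAlignment(
    face_alignment.LandmarksType.THREE_D, flip_input=False)

def yaw_from_landmarks(lms3d: np.ndarray) -> float:
    """Crude yaw estimate (radians): angle of the jaw-to-jaw axis in the x-z plane."""
    right, left = lms3d[0], lms3d[16]  # outermost jaw landmarks
    return float(np.arctan2(left[2] - right[2], left[0] - right[0]))

def has_diverse_rotation(frame_paths, min_spread_deg=20.0, n_samples=8):
    """Keep a sequence only if its sampled frames span a reasonable yaw range."""
    idx = np.linspace(0, len(frame_paths) - 1, n_samples).astype(int)
    yaws = []
    for i in idx:
        img = cv2.imread(frame_paths[i])
        if img is None:
            continue
        preds = fa.get_landmarks(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        if preds:  # None or empty when no face is detected
            yaws.append(np.degrees(yaw_from_landmarks(preds[0])))
    return len(yaws) >= 2 and max(yaws) - min(yaws) >= min_spread_deg
```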

Thanks again!

khakhulin commented 1 year ago

Hi! Sorry, we can't share the IDs, but the dataset itself is similar to the ones used in predecessor works. I should also correct the paper here: the head-rotation filtering didn't really work, since the network we used for it has poor quality. I only discovered this quite recently, sorry for misleading you. If you want diversity in head rotation, I suggest using DECA for this purpose instead. Regarding the IQA, we just use hyperIQA with a customizable threshold. You may find more details in related works: BiLayer and MegaPortraits.
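To make the IQA part concrete, the pruning is roughly like the sketch below. hyperIQA is a research repo rather than a pip package, so the `hyperiqa_score` wrapper is hypothetical (you would write it around the repo's demo code), and the threshold shown is illustrative, not our exact value:

```python
# Sketch of IQA-based pruning. `hyperiqa_score` is a hypothetical wrapper around
# the hyperIQA demo code (github.com/SSL92/hyperIQA); the threshold is a guess.
from typing import Callable, List, Sequence

import numpy as np

def filter_by_quality(
    sequences: Sequence[List[str]],          # each sequence is a list of frame paths
    hyperiqa_score: Callable[[str], float],  # hypothetical: frame path -> quality score
    threshold: float = 40.0,                 # hyperIQA predicts roughly 0-100, higher is better
    n_samples: int = 8,
) -> List[List[str]]:
    """Keep sequences whose sampled frames score above the quality threshold on average."""
    kept = []
    for frames in sequences:
        idx = np.linspace(0, len(frames) - 1, min(n_samples, len(frames))).astype(int)
        if np.mean([hyperiqa_score(frames[i]) for i in idx]) >= threshold:
            kept.append(frames)
    return kept
```

For the DECA route, you would instead read the yaw out of the head pose it regresses per frame; I won't guess its exact API here.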

nlml commented 1 year ago

Thanks!!