wyhsirius / g3an-project

[CVPR 20] G3AN: Disentangling Appearance and Motion for Video Generation
https://wyhsirius.github.io/G3AN/
MIT License

Pre/post processing for Weizmann dataset #7

Closed artikeshari closed 2 years ago

artikeshari commented 2 years ago

Hi,

Can you share the details of the pre/post-processing you applied to the Weizmann dataset for the qualitative results in the paper? It seems you cropped the frames to enlarge the person in the center.

wyhsirius commented 2 years ago

@sampriti111 Hi, I used the preprocessed data from MoCoGAN.

artikeshari commented 2 years ago

MoCoGAN says they resized frames to 96 × 96 for the Weizmann dataset, while you use 64 × 64. I applied the preprocessing mentioned in your paper: scale each frame to 85 × 64 and crop the central 64 × 64 region. But the processed frames do not look like the frames you included in the paper. I just wanted to know whether you applied any post-processing to the frames before putting them in the paper.
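
For reference, here is roughly what I tried, as a minimal sketch. It assumes OpenCV, treats the paper's 85 × 64 as width × height, and uses bilinear resizing; none of these choices are confirmed in the paper, so the helper name and details are just my guesses.

```python
import cv2


def preprocess_frame(frame):
    """Scale a frame to 85 x 64 (width x height), then center-crop 64 x 64."""
    # cv2.resize expects the target size as (width, height).
    resized = cv2.resize(frame, (85, 64), interpolation=cv2.INTER_LINEAR)
    # Crop the central 64 columns; the height is already 64.
    left = (85 - 64) // 2  # 10 pixels dropped on the left, 11 on the right
    return resized[:, left:left + 64]
```

Whether you cropped along the width this way, or used a different interpolation, is exactly what I am trying to confirm.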

wyhsirius commented 2 years ago

@sampriti111 I didn't use any post-processing. I used the data provided by MoCoGAN on its GitHub.

artikeshari commented 2 years ago

I misunderstood the preprocessing part of the paper. Thanks anyway.

artikeshari commented 2 years ago

Hey, I have one more question. You mention that "The Weizmann Action dataset consists of videos of 9 subjects, performing 10 actions such as wave and bend." But the GitHub repo of MoCoGAN has only 4 categories (bending, jumping jack, one-hand waving, and two-hands waving) performed by 9 subjects, which means 36 original + 36 flipped videos.

If you downloaded the data from somewhere else, please let me know.

wyhsirius commented 2 years ago

@sampriti111 Hi, this is true. In our experiments, we used only the same 4 categories as MoCoGAN, to allow a fair comparison.