AliaksandrSiarohin / monkey-net

Animating Arbitrary Objects via Deep Motion Transfer

What's the difference between vox-full.yaml and vox.yaml? #16

Open subin6 opened 4 years ago

subin6 commented 4 years ago

I want to train your model on the VoxCeleb dataset. Which of the two configurations should I use?

Also, could you provide a VoxCeleb pretrained model?

Thank you

AliaksandrSiarohin commented 4 years ago

Hi. The difference between vox-full.yaml and vox.yaml is that vox.yaml runs the keypoint detector and dense motion network on downscaled 64x64 images, while vox-full.yaml trains all networks at 256x256. I suggest using vox.yaml because it is faster. You can also download the VoxCeleb dataset with my preprocessing (the differences from the original are that the face aspect ratio is preserved, the background is not moving, and all small-resolution videos are removed): https://github.com/AliaksandrSiarohin/video-preprocessing.
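As a side note for anyone choosing between the two, you can list every setting that actually differs by loading both configs and diffing them. This is just a sketch, assuming PyYAML is installed and that it is run from the repo root where config/vox.yaml and config/vox-full.yaml live:

```python
# Minimal sketch: print every config key whose value differs between
# vox.yaml and vox-full.yaml. Assumes PyYAML and the repo's config/ layout.
import yaml

def diff_configs(a, b, path=''):
    """Recursively print keys whose values differ between two config dicts."""
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            diff_configs(va, vb, f'{path}{key}.')
        elif va != vb:
            print(f'{path}{key}: {va!r} -> {vb!r}')

with open('config/vox.yaml') as f:
    vox = yaml.safe_load(f)
with open('config/vox-full.yaml') as f:
    vox_full = yaml.safe_load(f)

diff_configs(vox, vox_full)
```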

Here is a checkpoint which I use to compare with my newer model: https://yadi.sk/d/A0Jq_01xiXky3g. The corresponding config is vox.yaml. The movement there is quite limited compared to the newer model; here is an example: id10289#Rn0Z_lIiL1w#00001.txt#000.mp4-id10286#9K2YB1d8BqY#00008.txt#000.mp4.png.mp4.zip

By columns: source, driving, new model, monkey-net, x2face.

kashi211 commented 4 years ago

I'm sorry, but is the "new" model you're referring to vox-full.yaml? And could you provide me with the ckpt file?

AliaksandrSiarohin commented 4 years ago

No, it is https://github.com/AliaksandrSiarohin/first-order-model. But I cannot provide checkpoints yet because of some privacy-related issues.

JialeTao commented 3 years ago

Hi, I noticed that the datasets used by monkey-net and your new model are different, including different versions of taichi and vox. I'm comparing monkey-net and FOMM on the dataset provided in FOMM. Could you provide the monkey-net checkpoints for vox, taichi, and bair that you used to compare with your new model? Thanks a lot.

AliaksandrSiarohin commented 3 years ago

Probably they are here: https://drive.google.com/file/d/1IJ6sU5ynTKga4YdaG9PaW7lz-Zn4BBjh/view?usp=sharing
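If it helps, a quick way to check what a downloaded checkpoint contains before wiring it into an evaluation is to load it and list its top-level keys. A minimal sketch; the filename below is hypothetical, so point torch.load at whichever file the archive actually contains:

```python
# Minimal sketch: inspect a downloaded monkey-net checkpoint on CPU.
# The filename is hypothetical; substitute the real file from the archive.
import torch

checkpoint = torch.load('monkey-net-vox.pth.tar', map_location='cpu')
# Checkpoints like this are typically a dict of state_dicts (e.g. generator,
# keypoint detector, discriminator) plus optimizer states; listing the keys
# shows which networks are available to restore.
print(list(checkpoint.keys()))
```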