ashawkey / RAD-NeRF

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
MIT License
878 stars 153 forks

about side face #4

Closed StawEndl closed 1 year ago

StawEndl commented 1 year ago

Hi, I trained with my own data, which contains some side-face frames, and then found some odd output in the results folder.
When the person shows a front face, the artifacts do not appear, but when the person turns to the side, they do.

img_v2_9db5c9ce-eb68-4c28-9fd2-136fa236a97g img_v2_e74f33d6-ad88-467e-846d-78085d4d032g img_v2_4c736732-f790-4ab2-9f6e-32a1b783203g img_v2_860c6d79-8996-45e5-b58d-9399eecb732g

Is there something wrong with my dataset? I found nothing wrong in the images extracted from the video. What should I do next?

ashawkey commented 1 year ago

@StawEndl Hi, this method doesn't perform well on side faces (they are rarely observed in the monocular training video). However, the shadow-like artifacts in the second example are abnormal. How long is your training video?

StawEndl commented 1 year ago

My whole video is 2 minutes long; this is part of my data video:

https://user-images.githubusercontent.com/55580043/208333147-420f8958-0c7f-412a-8325-60879d68f0d9.mp4

and this is my output video:

https://user-images.githubusercontent.com/55580043/208333310-30e0954d-1058-44c6-a07c-7f569bf8c8b8.mp4

Is there something wrong with my data video? Thanks for your help.

StawEndl commented 1 year ago

One other thing: the README mentions an obama.json, but I just used the transforms_train.json produced by preprocessing for inference. Is that OK?

ashawkey commented 1 year ago

@StawEndl The data seems OK, though it's a little short.

  1. Could you check whether the semantic segmentation of the face is correct? The test video quality is below expectation...
  2. You could also open the GUI and turn the face to the side (as with obama). To me it seems the depth (z-axis) is not deep enough and the face looks flat, but I can't be sure.
  3. Yes, obama.json is in fact the same as transforms_train.json.
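
Since obama.json should be identical to transforms_train.json, a quick sanity check is to compare the parsed contents of the two files. This is a hypothetical helper (not part of the repo); the paths are examples:

```python
import json

def transforms_match(path_a, path_b):
    # Parse both NeRF-style transforms files and compare their contents.
    # Comparing parsed JSON ignores whitespace/key-order differences.
    with open(path_a) as fa, open(path_b) as fb:
        return json.load(fa) == json.load(fb)

# Example (paths are illustrative):
# transforms_match("data/obama/transforms_train.json", "data/obama/obama.json")
```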
StawEndl commented 1 year ago

> @StawEndl The data seems OK, though it's a little short.
>
> 1. Could you check whether the semantic segmentation of the face is correct? The test video quality is below expectation...
> 2. You could also open the GUI and turn the face to the side (as with obama). To me it seems the depth (z-axis) is not deep enough and the face looks flat, but I can't be sure.
> 3. Yes, obama.json is in fact the same as transforms_train.json.

Nice, thanks. I used your model as a pretrained model, and that works better. I'd like to know: should I run `python main.py data/obama/ --workspace trial_obama/ -O --iters 200000` and then `python main.py data/obama/ --workspace trial_obama/ -O --iters 250000 --finetune_lips`, or just run `python main.py data/obama/ --workspace trial_obama/ -O --iters 250000 --finetune_lips`?

ashawkey commented 1 year ago

@StawEndl The two commands should be run sequentially; the second command is a finetuning phase for another 50k steps.
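
For clarity, the two-stage schedule above can be sketched in Python. `build_cmd` is a hypothetical helper just for illustration; in practice you simply run the two shell commands in order:

```python
def build_cmd(data_dir, workspace, iters, finetune_lips=False):
    # Assemble the argument list for one training stage of main.py.
    cmd = ["python", "main.py", data_dir,
           "--workspace", workspace, "-O", "--iters", str(iters)]
    if finetune_lips:
        cmd.append("--finetune_lips")
    return cmd

# Stage 1: base training to 200k iters.
# Stage 2: resumes from the same workspace and finetunes lips to 250k iters.
stages = [
    build_cmd("data/obama/", "trial_obama/", 200000),
    build_cmd("data/obama/", "trial_obama/", 250000, finetune_lips=True),
]
# Running the stages in order (e.g. subprocess.run(cmd, check=True))
# reproduces the base-then-lip-finetune schedule.
```

Because both stages share the same `--workspace`, the second command picks up the 200k-iter checkpoint and only trains the remaining 50k steps with lip finetuning.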

StawEndl commented 1 year ago

> @StawEndl The two commands should be run sequentially; the second command is a finetuning phase for another 50k steps.

Yes, thanks. I ran those two commands without a pretrained model, and the inference output video is nice; there is nothing wrong with the face like what I described in my first question. It would be even better if you wrote this down in the README; otherwise someone might run only one command and get bad output like I did. T^T

ashawkey commented 1 year ago

Of course! I have updated the README. Thanks for the suggestion!