ashawkey / RAD-NeRF

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
MIT License

Real time - random audio #18

Open deema-A opened 1 year ago

deema-A commented 1 year ago

Hi, I trained the model, then:

python test.py --pose data/obama.json --ckpt pretrained/obama_eo.pth --aud data/intro_eo.npy --workspace trial_obama/ -O --torso

With new, unseen audio, the synthesized video does not lip-sync correctly. Best,

ashawkey commented 1 year ago

@deema-A Hi, have you changed the --aud to your processed audio features? Could you post a video showing the new audio and synthesized lips?
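A quick way to rule out a malformed feature file is to load the `.npy` passed to `--aud` and inspect its shape before running inference. The snippet below is a minimal, hypothetical pre-flight check; the exact feature layout RAD-NeRF expects is an assumption here, not something stated in this thread.

```python
import numpy as np

def check_audio_features(path):
    """Load an .npy audio-feature file and return its shape.

    Hypothetical sanity check before passing the file to --aud.
    The assumption that features should have rank >= 2 (frames along
    axis 0) is illustrative, not documented RAD-NeRF behaviour.
    """
    feats = np.load(path)
    if feats.ndim < 2:
        raise ValueError(f"unexpected feature rank {feats.ndim} for {path}")
    return feats.shape
```

If the shape or dtype differs from the features produced for the training audio, the mismatch would explain poor lip-sync on new clips.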

iboyles commented 1 year ago

Do you mind sharing how this can be real-time, given that the audio features have to be processed each time before running inference? Also, when using test.py for inference, it takes about 2x real time to run the step before displaying. Is really beefy hardware needed?