Fictionarry / ER-NeRF

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
https://fictionarry.github.io/ER-NeRF/
MIT License

Audio feature extraction is so slow. #143

Open iioSnail opened 6 months ago

iioSnail commented 6 months ago

Audio features are extracted with a pretrained DeepSpeech model, but this step is too slow: a 5-second wav file takes 10 seconds to produce the npy file.

Is there any way to make it faster?
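
To put a number on the slowdown, here is a minimal sketch of how the ratio of processing time to audio length could be measured. It uses only the Python standard library. `extract_fn` is a hypothetical placeholder for whatever callable runs the DeepSpeech extraction on one wav file; it is not a function from this repo. With the timings reported above (10 seconds for a 5-second clip), the ratio would come out at 2.0.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM wav file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def real_time_factor(extract_fn, wav_path):
    """Time extract_fn(wav_path) and divide by the audio duration.

    extract_fn is a hypothetical stand-in for the feature extraction step.
    A result above 1.0 means extraction is slower than real time.
    """
    audio_s = wav_duration_seconds(wav_path)
    start = time.perf_counter()
    extract_fn(wav_path)
    elapsed = time.perf_counter() - start
    return elapsed / audio_s
```

Running this on clips of different lengths would show whether the cost is mostly a fixed startup overhead (such as model loading) or grows with the audio duration.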

hnsywangxin commented 2 months ago

I have the same problem.