Open mjbooo opened 1 year ago
Sorry, I have no idea about this; the problem is strange. So far I have tested the code on sm75 and sm86 machines and it works well on both. As far as I know, no similar problem has been reported in other repos that use gridencoder. Maybe you can run a test to find which step is limiting the speed.
Could you take a look at the tqdm bar above? Are you saying your machine processes approximately 351 frames in about 10 to 11 seconds, which is equivalent to a frame rate of roughly 35 FPS?
Normally, inference with the head only can reach 34 FPS (the value we reported in the paper, on a 3080 Ti). In situations with the torso, it drops a little, but not much.
Replacing gridencoder with the tiny-cuda-nn encoding should help if the encoder is the problem. Otherwise, I guess something must be wrong outside the code.
Thank you for your kind help! I'll give it a try
@Fictionarry
The inference speed increased after I allocated more CPU cores and memory!
Also, for anyone struggling with gridencoder, the following bash command may be helpful:
export TORCH_CUDA_ARCH_LIST="[YOUR_COMPUTE_CAPABILITY]"
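If you are unsure which compute capability your GPU has, you can query it with PyTorch before setting the variable (a small sketch, assuming a working CUDA-enabled torch install):

```shell
# Print the compute capability of GPU 0, e.g. "8.0" for an A100 or "8.6" for an RTX 3090
python -c "import torch; print('.'.join(map(str, torch.cuda.get_device_capability(0))))"

# Use the printed value when rebuilding the extension, for example:
export TORCH_CUDA_ARCH_LIST="8.0"
pip install ./gridencoder
```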
Thank you again for your kind help!!!
@mjbooo Hi! Did you finally fix this?
Hello! I want to express my appreciation for your excellent work. I have a question regarding inference speed.
I recently conducted a test using a 14-second audio clip (equivalent to 351 frames) with the Obama video you provided. However, the inference process took approximately 2 minutes, which translates to around 3 frames per second (FPS).
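As a quick sanity check on that number, using only the figures quoted above:

```shell
# 351 frames rendered in roughly 120 seconds
python -c "print(f'{351 / 120:.1f} FPS')"   # prints 2.9 FPS
```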
I'm using an A100 GPU, and I've included a list of the installed packages below. However, someone mentioned that they were able to achieve an inference speed of 17 FPS using just an RTX 3090.
Furthermore, I followed your instructions to install the packages, but I encountered an issue with gridencoder. I addressed this separately by using the following command to enable support for the sm80 CUDA architecture:
TORCH_CUDA_ARCH_LIST=8.0 pip install ./gridencoder
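One thing I also tried to rule out (an assumption on my part, not something confirmed above): if gridencoder was previously built for a different architecture, pip may reuse a cached build, so forcing a clean rebuild can eliminate that possibility:

```shell
# Force a clean rebuild for the A100's sm80 architecture, bypassing any cached build
TORCH_CUDA_ARCH_LIST=8.0 pip install --force-reinstall --no-cache-dir ./gridencoder
```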
Do you have any suggestions or insights on how to improve the inference speed?