EvelynFan / FaceFormer

[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
MIT License
790 stars 134 forks

Question about inference speed #66

Open JSHZT opened 1 year ago

JSHZT commented 1 year ago

Excellent work! Using VOCASET and our own FLAME-based training set, I got better results. What confuses me is the training speed. Your code reads the entire dataset into memory before training, which risks running out of memory when the dataset is very large. I therefore changed the dataloader so that each batch re-reads its data from the file paths. According to my profiling, however, the forward pass is now much slower than before. I train on an RTX 4090: with your original code a batch takes about 2 s on average, but with my change it takes more than 8 s.

Did you choose to read all the data before training in order to avoid this problem? And have you looked into this issue before? I look forward to your reply and any suggestions for improvement.
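For reference, here is a minimal sketch of the kind of per-batch lazy loading I mean. It is not the FaceFormer code or my exact change: `LazySequenceDataset`, `prefetching_batches`, `make_demo_files`, and the pickle file layout are all hypothetical names made up for illustration, and it uses only the Python standard library. It keeps just the file paths in memory and lets a background thread read the next batches while the current one is being consumed, which is one way the disk reads might be overlapped with the training step. In PyTorch, `torch.utils.data.DataLoader` with `num_workers > 0` plays a similar role.

```python
import os
import pickle
import queue
import tempfile
import threading


class LazySequenceDataset:
    """Hypothetical dataset: holds only file paths, reads each sample on access."""

    def __init__(self, paths):
        self.paths = list(paths)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # The sample is read from disk every time it is requested.
        with open(self.paths[index], "rb") as f:
            return pickle.load(f)


def prefetching_batches(dataset, batch_size, max_prefetch=2):
    """Yield batches in order while a background thread reads ahead.

    At most `max_prefetch` batches are buffered, so memory use stays bounded.
    """
    buffer = queue.Queue(maxsize=max_prefetch)
    sentinel = object()

    def producer():
        try:
            for start in range(0, len(dataset), batch_size):
                stop = min(start + batch_size, len(dataset))
                buffer.put([dataset[i] for i in range(start, stop)])
            buffer.put(sentinel)
        except Exception as exc:
            # Hand read errors to the consumer instead of leaving it waiting.
            buffer.put(exc)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is sentinel:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def make_demo_files(directory, count):
    """Write small placeholder samples (assumed layout, for the demo only)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"seq_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump({"index": i, "frames": [float(i)] * 4}, f)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = LazySequenceDataset(make_demo_files(tmp, 5))
        for batch in prefetching_batches(dataset, batch_size=2):
            print([sample["index"] for sample in batch])
```

A thread only helps when the time is dominated by file I/O, since decoding in Python still holds the interpreter lock. If the extra time per batch is spent on parsing or preprocessing rather than reading, separate worker processes would likely be the better fit.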