fudan-generative-vision / hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
https://fudan-generative-vision.github.io/hallo/
MIT License

VRAM usage stays constant at around 9 GB #165

Open Tanwei99 opened 2 months ago

Tanwei99 commented 2 months ago

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.179                Driver Version: 535.179      CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:65:00.0 Off |                  Off |
| 65%   85C    P2             228W / 230W |   9721MiB / 24564MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     13160      C   python3                                    9716MiB |
+---------------------------------------------------------------------------------------+

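In this state, each second of output video (about 30 frames on average) takes roughly 2 minutes to generate, so a 10-second video takes more than 20 minutes. Would raising VRAM utilization reduce the inference time?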

piwawa commented 2 months ago

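Inference on my A100 also sits at a steady 9 GB of VRAM. The output video is 25 fps at 512 resolution, and each second of audio takes 2 minutes to infer, so 15 s takes half an hour. Extremely slow!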
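The figures quoted in this thread can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the cost scales linearly with clip length at about 2 minutes per second of output, as both reports above suggest, and the function names are made up for illustration (they are not part of hallo).

```python
# Back-of-the-envelope estimate based on the numbers reported in this thread.
# Assumption: inference time grows linearly with clip length.
# Both helper names are hypothetical, not hallo APIs.

def estimated_inference_minutes(clip_seconds: float,
                                minutes_per_output_second: float = 2.0) -> float:
    """Total wall-clock minutes to generate a clip of the given length."""
    return clip_seconds * minutes_per_output_second


def seconds_per_frame(minutes_per_output_second: float = 2.0,
                      fps: float = 25.0) -> float:
    """Wall-clock seconds spent on each generated frame."""
    return minutes_per_output_second * 60.0 / fps


if __name__ == "__main__":
    print(f"10 s clip: {estimated_inference_minutes(10):.1f} min")  # 20.0 min
    print(f"15 s clip: {estimated_inference_minutes(15):.1f} min")  # 30.0 min
    print(f"per frame at 25 fps: {seconds_per_frame(2.0, 25):.1f} s")  # 4.8 s
```

At roughly 4 to 5 seconds per frame with the GPU already at 100% utilization, the run looks compute-bound, so a larger VRAM footprint alone would probably not shorten it.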

jsyqrt commented 5 days ago

Too slow.