warmshao / FasterLivePortrait

Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!

How To Enable Full Precision FP32 in Web Demo app.py? #29

Open MarsEverythingTech opened 2 months ago

MarsEverythingTech commented 2 months ago

Hello,

I have successfully installed the one-click installer and would like to use full-precision FP32 in the web demo app.py. How can I do it?

Thanks in advance

warmshao commented 2 months ago

FP32 uses more VRAM and runs relatively slower. If you really want to use FP32, append -p fp32 to the end of every command line in the all_onnx2trt.bat file, then run it again to redo the conversion with FP32 precision.
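
Editing every line by hand is error-prone, so here is a minimal Python sketch of that edit. It assumes each conversion command in all_onnx2trt.bat calls scripts\onnx2trt.py; the helper name append_precision_flag and the sample file contents are hypothetical, and the real .bat file may look different.

```python
def append_precision_flag(bat_text: str, flag: str = "-p fp32") -> str:
    """Append `flag` to every line that calls onnx2trt.py, unless it is already there."""
    out = []
    for line in bat_text.splitlines():
        # Only touch conversion commands; leave echo, pause, blank lines, etc. unchanged.
        if "onnx2trt.py" in line and not line.rstrip().endswith(flag):
            line = f"{line.rstrip()} {flag}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents for illustration only.
    sample = (
        "@echo off\n"
        ".\\venv\\python.exe scripts\\onnx2trt.py"
        " -o .\\checkpoints\\liveportrait_onnx\\warping_spade-fix.onnx\n"
        "pause\n"
    )
    print(append_precision_flag(sample))
```

To apply it to the real file, read all_onnx2trt.bat as text, pass the contents through the function, and write the result back after keeping a backup copy. Running the function twice does not add the flag a second time.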

MarsEverythingTech commented 2 months ago

Alright, thanks. I am using an RTX 3070 Ti with 8 GB of VRAM. I have seen that FP32 is better because it is more precise and produces more realistic results. I am also running it on CUDA 11.8. Do I need CUDA 12.2? Thanks again.

UPDATE: While a video was processing, my PC shut down by itself. Why?

warmshao commented 2 months ago

I'm not sure whether it works with CUDA 11.8. You can give it a try; if it doesn't work, you might need to install CUDA 12.2. The shutdown may have happened because FP32 filled up your VRAM. By the way, have you tried running with fp16?
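
One way to check the VRAM theory is to poll the GPU while a video is processing. The sketch below is only an illustration: it assumes nvidia-smi is on the PATH, and the helper names parse_vram and query_vram are hypothetical. It uses the nvidia-smi query options memory.used and memory.total with CSV output.

```python
import subprocess


def parse_vram(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' pairs in MiB, one line per GPU."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def query_vram() -> list[tuple[int, int]]:
    """Ask nvidia-smi for used and total memory of every GPU (values in MiB)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram(result.stdout)
```

Calling query_vram() every few seconds during a run and logging the result to a file would show whether usage approaches the card's total before the shutdown.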

Echolink50 commented 2 months ago

Where has it been shown that fp32 gives better results?

MarsEverythingTech commented 2 months ago

Well, it was working fine with CUDA 11.8 and fp16, which is the default installation (by just clicking the .bat file to convert). However, when I edited it by appending -p 32 to each line as you suggested, my PC shut down by itself halfway through processing a video. My VRAM usage was very low, so I don't think that was the cause.

MarsEverythingTech commented 2 months ago

The difference is minimal, but it can be noticeable if you keep replaying the same generated result and compare it to the fp16 one. For example, the teeth alignment is more realistic with fp32, whereas with fp16 the teeth alignment changes from frame to frame.
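
For context on the numeric side only, here is a small Python sketch that rounds a value to half precision (fp16) and single precision (fp32) using the standard struct module. It shows that fp16 keeps fewer significant bits and uses half the bytes per value; it does not show how that affects FasterLivePortrait's output, and the helper names to_fp16 and to_fp32 are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    value = 0.1
    # The rounding error is larger in fp16 than in fp32.
    print("fp16 error:", abs(to_fp16(value) - value))
    print("fp32 error:", abs(to_fp32(value) - value))
    # Bytes per stored value: 2 for fp16, 4 for fp32.
    print("bytes:", struct.calcsize("<e"), struct.calcsize("<f"))
```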

warmshao commented 2 months ago

Run this command and show me what happens: .\venv\python.exe scripts\onnx2trt.py -o .\checkpoints\liveportrait_onnx\warping_spade-fix.onnx -p fp32