ZhenglinZhou / STAR

[CVPR 2023] STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

Update to use python 3.10 #11

Closed: TimSC closed this 3 months ago

TimSC commented 1 year ago

Update to use Python 3.10 and recent torch libraries, plus minor fixes.

ZhenglinZhou commented 1 year ago

Hi @TimSC, many thanks for your PR!

However, I am worried that changing the torch version could affect performance. Would you mind evaluating the new version on WFLW?

If you have any questions, feel free to leave a comment or email me (zhouzhenglincs@gmail.com).

TimSC commented 1 year ago

Good point. I didn't evaluate the speed on either PyTorch version. I might also check that my fixes work on Python 3.8/PyTorch 1.6 using Anaconda. Do you have a way to properly evaluate the speed? Are we talking about training speed, testing speed, or both?

My immediate suspicion is that my change to _covars.symeig(eigenvectors=True) was not right. Several related decomposition functions exist, and the right choice depends on the matrix properties. I just picked the recommended one rather than the fastest one.
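For context, torch.symeig was deprecated in favor of torch.linalg.eigh and later removed. Below is a minimal sketch of a version-tolerant wrapper, assuming the covariance matrices are real and symmetric. The helper name symeig_compat is hypothetical and is not code from this PR. Note that symeig read the upper triangle by default while linalg.eigh reads the lower one, so the sketch passes UPLO="U" to keep the old behaviour.

```python
import torch


def symeig_compat(covars: torch.Tensor):
    """Eigen-decompose symmetric matrices of shape (..., n, n).

    Hypothetical helper: returns (eigenvalues, eigenvectors) with the
    eigenvalues in ascending order, on both old and new torch releases.
    """
    linalg = getattr(torch, "linalg", None)
    if linalg is not None and hasattr(linalg, "eigh"):
        # Newer torch: UPLO="U" matches symeig's default of upper=True.
        eigenvalues, eigenvectors = torch.linalg.eigh(covars, UPLO="U")
    else:
        # Older torch without torch.linalg.eigh.
        eigenvalues, eigenvectors = torch.symeig(covars, eigenvectors=True)
    return eigenvalues, eigenvectors


if __name__ == "__main__":
    a = torch.tensor([[2.0, 1.0], [1.0, 2.0]])
    w, v = symeig_compat(a)
    # Rebuild the matrix from its decomposition as a sanity check.
    print(torch.allclose(v @ torch.diag(w) @ v.T, a, atol=1e-6))
```

For an exactly symmetric input, both code paths should give the same eigenvalues; eigenvectors may differ by sign.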

TimSC commented 11 months ago

I ran my branch with torch 1.7 and torch 2.0.1. (I could not run torch 1.6 because there was no prebuilt version for CUDA 11, which my GPU requires.) It looks like there is a performance difference.

One epoch takes 10 minutes with torch 1.7 and 33 minutes with torch 2.0.1. I was using the WFLW dataset with batch_size=16. Any idea why that might be? (Possibly _covars.symeig or possibly not.)
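A minimal sketch of how the per-epoch wall-clock time could be measured and compared, assuming the epoch is wrapped in a callable. The names time_epoch, slowdown, and run_epoch are hypothetical and do not come from the STAR code base. On a GPU, passing torch.cuda.synchronize as sync would make sure queued kernels are included in the timing.

```python
import time
from typing import Callable, Optional


def time_epoch(run_epoch: Callable[[], None],
               sync: Optional[Callable[[], None]] = None) -> float:
    """Return the wall-clock seconds taken by one call to run_epoch."""
    if sync is not None:
        sync()  # flush work queued before the measurement starts
    start = time.perf_counter()
    run_epoch()
    if sync is not None:
        sync()  # wait for asynchronous work started by the epoch
    return time.perf_counter() - start


def slowdown(baseline_seconds: float, candidate_seconds: float) -> float:
    """Ratio of candidate time to baseline time (above 1 means slower)."""
    return candidate_seconds / baseline_seconds


if __name__ == "__main__":
    # Figures reported above: 10 minutes vs 33 minutes per epoch.
    print(f"{slowdown(10 * 60, 33 * 60):.1f}x")  # prints "3.3x"
```

With the numbers quoted above, torch 2.0.1 comes out roughly 3.3 times slower per epoch than torch 1.7.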

log-py3.7-torch1.7.txt log-py3.10-torch2.0.1.txt

TimSC commented 11 months ago

I tried various versions of torch and found that performance is consistent between 1.7 and 1.13. However, there is a performance drop when moving to torch 2.0. This may be because torch 2 models need torch.compile to be fast, but attempting model compilation hits a different error. As far as this PR is concerned, we may as well stick with torch 1.13.

I'm now thinking _covars.symeig is not the cause of the performance problem.

log-py3.7-torch1.7.0-batchsize8-oldeig.txt log-py3.7-torch1.9.0-batchsize8-neweig.txt log-py3.7-torch1.9.0-batchsize8-oldeig.txt log-py3.7-torch1.13.0-batchsize8.txt log-py3.10-torch1.13.1-batchsize8.txt log-py3.10-torch2.0.1-batch8.txt

I switched to a batch size of 8 because some versions of torch were running out of GPU memory.

ZhenglinZhou commented 3 months ago

Hi @TimSC, thank you very much! I believe this update may make STAR more convenient.

ZhenglinZhou commented 3 months ago

Hi, @TimSC. Thanks again! However, I noticed that you are not listed as a contributor to STAR, which is strange. I hope you can become a collaborator on this project. A collaborator invitation has been sent; please accept it. :)