mseitzer / pytorch-fid

Compute FID scores with PyTorch.
Apache License 2.0
3.34k stars, 506 forks

time-consuming of the FID computation #67

Closed: LonglongaaaGo closed this issue 3 years ago

LonglongaaaGo commented 3 years ago

Hi, I want to know why the FID computation is so slow. When I calculate the FID for 28000 images, it sometimes gets stuck and takes almost a day or more for a single run! Do you have any idea how I can fix this problem? Thanks!

LonglongaaaGo commented 3 years ago

BTW, I want to know: if the number of fake images and the number of real images are not equal, will it affect the FID calculation?

mseitzer commented 3 years ago

It is hard to debug this from afar because it can depend on many different factors. In general, I don't see an intrinsic reason why it should get stuck at some point (the computation should be equally fast over time). You can install tqdm to get a better overview of how the computation progresses. You could also try to profile the code using cProfile.

I don't understand your second question, can you clarify?
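A minimal sketch of the cProfile suggestion, using only the standard library. The function `slow_step` is a hypothetical stand-in for the actual FID call, and `profile_call` is a helper name invented for this illustration; in practice you would pass your own entry point instead.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the slowest call chains appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_step(n):
    # Hypothetical placeholder for the real FID computation.
    return sum(i * i for i in range(n))


result, report = profile_call(slow_step, 100_000)
print(report)
```

If one function dominates the cumulative time column, that is where the run is spending its time.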

LonglongaaaGo commented 3 years ago

Sorry for the late reply. When I run the code, I found that it gets stuck in "calculate_frechet_distance". Can you give me some insight into how to solve this problem? Thanks again for your consideration!

LonglongaaaGo commented 3 years ago

I am running this code on a SLURM cluster.

Totoro97 commented 3 years ago

@LonglongaaaGo Hi, I faced a similar problem and found that the program got stuck in scipy.linalg.sqrtm. Reinstalling SciPy with the newest version worked for me.
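For context on why a matrix square root appears at all: the Fréchet distance between two Gaussians fitted to the real and generated features has the standard form below, where the means and covariances of the two feature sets are written as mu and Sigma. The square-root term is the part computed with scipy.linalg.sqrtm. Its cost depends on the feature dimension (2048 with the default Inception features, as far as I know), not on the number of images.

```latex
d^2 = \lVert \mu_1 - \mu_2 \rVert_2^2
      + \operatorname{Tr}\!\left( \Sigma_1 + \Sigma_2 - 2\,(\Sigma_1 \Sigma_2)^{1/2} \right)
```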

cientgu commented 2 years ago

> @LonglongaaaGo Hi, I faced a similar problem and found that the program got stuck in scipy.linalg.sqrtm. Reinstalling SciPy with the newest version worked for me.

I updated it to 1.7.1, and it's still very slow ...

Cerf-Volant425 commented 2 years ago

I ran into the same problem too ...

Cerf-Volant425 commented 2 years ago

Oh, then I solved it by switching to a different virtual environment ...

chensjtu commented 2 years ago

I wonder which version of scipy is the right one.

Cerf-Volant425 commented 2 years ago

> I wonder which version of scipy is the right one.

scipy 1.5.4 worked for me.

yuxu915 commented 2 years ago

> > I wonder which version of scipy is the right one.
>
> scipy 1.5.4 worked for me.

It also worked for me. Thanks!
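Since the reports above differ by version and environment, it may help to print what is actually installed in the environment that runs the FID computation. This is a small sketch using only the standard library; `installed_version` is a helper name made up for this example, and the package list is just a guess at what is relevant.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages that plausibly matter here; adjust the list as needed.
for name in ("scipy", "numpy", "torch", "pytorch-fid"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this inside each virtual environment makes it easier to compare a slow setup with a working one.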

aykborstelmann commented 1 year ago

If anyone still has the problem after installing the newest scipy version: in https://github.com/scipy/scipy/issues/14594#issuecomment-906145042 the issue was resolved by setting the environment variable MKL_NUM_THREADS=1, i.e. by running export MKL_NUM_THREADS=1
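If exporting the variable in the shell is inconvenient, the same limit can presumably be applied from inside the script. A minimal sketch, assuming the variable has to be set before numpy or scipy is imported for the first time (thread-pool settings are typically read when the library loads); `limit_threads` is a hypothetical helper name.

```python
import os


def limit_threads(names=("MKL_NUM_THREADS",), n=1):
    """Set each thread-count variable to n; call before importing numpy/scipy."""
    for name in names:
        os.environ[name] = str(n)
    return {name: os.environ[name] for name in names}


applied = limit_threads()
print(applied)

# Import numerical libraries only after the variables are set, e.g.:
# import scipy.linalg
```

Other BLAS backends read different variables, so `names` can be extended if MKL is not the backend in use.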

j-cyoung commented 1 month ago

I ran into the same issue. I have a server with 4 GPUs, and I ran 4 programs in parallel on the GPUs. Each program computes FID. I found that all CPUs are fully occupied when multiple processes compute FID at the same time, whereas a single process runs fine. I think there might be thread conflicts between torch, scipy.linalg.sqrtm (in FID), and multi-processing/threading.

Setting "export MKL_NUM_THREADS=1" didn't help. But I found that setting "export OPENBLAS_NUM_THREADS=1" or "export OMP_NUM_THREADS=1" does help for me. So anyone who has the same issue can give it a try.

Related Issue: https://github.com/numpy/numpy/issues/8120#issuecomment-252058015
Related Blog: https://pythonspeed.com/articles/concurrency-control/
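For the multi-process case, one way to apply these settings is to give each worker its own environment when launching it. The sketch below assumes one worker per GPU and uses a placeholder child command that only prints its environment; `worker_env`, `launch_all`, and `CHILD` are names invented for this example, and the real command would be your own FID script.

```python
import os
import subprocess
import sys


def worker_env(gpu_index, threads=1):
    """Copy the current environment and pin one GPU plus a thread limit."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    env["OMP_NUM_THREADS"] = str(threads)
    env["OPENBLAS_NUM_THREADS"] = str(threads)
    return env


# Placeholder child process: replace with the real FID command.
CHILD = [
    sys.executable,
    "-c",
    "import os; print(os.environ['CUDA_VISIBLE_DEVICES'], os.environ['OMP_NUM_THREADS'])",
]


def launch_all(num_gpus=4):
    """Start one worker per GPU in parallel and collect their output."""
    procs = [
        subprocess.Popen(CHILD, env=worker_env(i), stdout=subprocess.PIPE, text=True)
        for i in range(num_gpus)
    ]
    return [p.communicate()[0].strip() for p in procs]


outputs = launch_all(4)
print(outputs)
```

With a limit of 1 thread per worker, four workers should use roughly four CPU cores instead of each trying to use all of them.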