RasmussenLab / vamb

Variational autoencoder for metagenomic binning
MIT License
243 stars, 44 forks

Thread control #45

Open rhysnewell opened 3 years ago

rhysnewell commented 3 years ago

Hi there!

Congrats on your recent Nature Biotech publication! Very exciting to see VAMB published. I was just wondering if you ever got around to figuring out how to control the number of threads/cores that VAMB uses?

You mention near the bottom of this issue, https://github.com/RasmussenLab/vamb/issues/5, that you were having some difficulty with PyTorch, but that was in 2018, so perhaps it has been fixed by now? I only ask because there does not seem to be an option for it in VAMB's command line interface.

Cheers, Rhys

jakobnissen commented 3 years ago

Looks like it was fixed on the PyTorch side. Vamb already calls torch.set_num_threads just for good measure, so that part should work by now. NumPy is another problem, see https://github.com/numpy/numpy/issues/11826. The TL;DR is basically that NumPy can use a lot of backends, and, unbelievably, they didn't really think to include a way to control the number of threads that works across all backends. The workaround is to set a bunch of environment variables before importing NumPy and hope you use a backend that reads these variables. I'll try to reorganize the code to do that.
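As a rough illustration of the environment-variable workaround described above, here is a minimal sketch. It is not Vamb's actual code: the helper name `limit_backend_threads` and the exact list of variables are assumptions, chosen to cover the backends NumPy is commonly built against (OpenMP, OpenBLAS, MKL, Apple Accelerate) plus numexpr. A backend that does not read any of these variables will ignore the setting.

```python
import os

# Variables read by common NumPy backends. This list is an assumption for
# illustration and may not cover every backend.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",         # OpenMP-based builds
    "OPENBLAS_NUM_THREADS",    # OpenBLAS
    "MKL_NUM_THREADS",         # Intel MKL
    "VECLIB_MAXIMUM_THREADS",  # Apple Accelerate / vecLib
    "NUMEXPR_NUM_THREADS",     # numexpr
)


def limit_backend_threads(n_threads):
    """Set the thread-count variables (hypothetical helper, not part of Vamb)."""
    value = str(n_threads)
    for name in THREAD_ENV_VARS:
        os.environ[name] = value


# Must run before NumPy is imported anywhere in the process, because the
# backends typically read these variables only once, when they are loaded.
limit_backend_threads(4)
# import numpy as np  # import NumPy only after this point
```

The ordering is the fragile part: if any other module has already imported NumPy, setting the variables afterwards will most likely have no effect.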

Finally, you control the number of threads with -p - I can see it's not very clear that this should also control the number of Torch and Numpy threads. I'll change the description of that flag.

rhysnewell commented 3 years ago

Thanks! That's great to know.

I've had a similar problem with NumPy in the past when performing matrix algebra operations. You have probably tried this, but just in case you haven't: I've had success using the threadpoolctl library and wrapping any NumPy operation that I know uses a multithreaded backend in a block like this:

import threadpoolctl

with threadpoolctl.threadpool_limits(limits=threads, user_api='blas'):
    ...  # use numpy here

This likely won't catch everything, though; NumPy can be annoying like that sometimes.
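A slightly fuller sketch of the threadpoolctl approach, treating the library as an optional dependency. The wrapper name `blas_thread_limit` is hypothetical, and the fallback to a no-op context manager when threadpoolctl is not installed is a design assumption, not something proposed in this thread.

```python
from contextlib import nullcontext

try:
    from threadpoolctl import threadpool_limits
except ImportError:
    # Assumption for this sketch: threadpoolctl is optional.
    threadpool_limits = None


def blas_thread_limit(threads):
    """Return a context manager that caps BLAS threads inside its block.

    Falls back to a no-op context manager if threadpoolctl is unavailable,
    in which case the thread count is NOT limited.
    """
    if threadpool_limits is None:
        return nullcontext()
    return threadpool_limits(limits=threads, user_api="blas")


# Intended usage (NumPy import left to the caller):
#     import numpy as np
#     with blas_thread_limit(2):
#         product = np.ones((500, 500)) @ np.ones((500, 500))
```

As noted above, this only limits thread pools that threadpoolctl can detect, so operations that use other threading mechanisms may still run with more threads.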

SilasK commented 3 years ago

Do I understand this correctly: is Vamb still single-threaded?

simonrasmu commented 2 years ago

@jakobnissen