cggh / scikit-allel

A Python package for exploring and analysing genetic variation data
MIT License

Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" #285

Open ddjukic opened 5 years ago

ddjukic commented 5 years ago

Hi,

allel.vcf_to_hdf5() fails to convert any .vcf file to .h5, throwing error: 'Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)'; seems to be numexpr related?

Installed through conda in a fresh environment; only the dependencies (and their dependencies) are also installed. The code is supposed to run on a cluster, if that matters. (I noticed that issue #125 removed the numexpr dependency, so I'm not sure what's going on here; I am on version 1.2.1 of scikit-allel.)

I tried adjusting the number of threads with 'os.environ['NUMEXPR_MAX_THREADS'] =' with no results. Any idea what might be causing the problem?

Thanks, Best

alimanfoo commented 5 years ago

Hi Dejan, that is strange, haven't seen that before. Do you get the same error when trying vcf_to_zarr()? Unfortunately about to go on leave for a couple of weeks so apologies in advance for any radio silence.

ddjukic commented 5 years ago

Hi Alistair,

Thanks for the very prompt reply. Actually, it seems that the error is triggered on importing the package itself. [Screenshot: Capture]

The error aborts the execution of my nextflow script but the functionality of the module seems to be preserved when I work with it interactively; I'll try to find a workaround or somehow catch the error message (which seems to get displayed upon a successful import somehow?).

Yeah, this probably doesn't need your full attention either way; thank you for letting me know, hope you have a good vacation (if it is a vacation you are going for)!

alimanfoo commented 5 years ago

Thanks Dejan. FWIW I found this issue https://github.com/pydata/numexpr/issues/322 which suggests numexpr may be being imported via pandas, although that issue is still open.

robbmcleod commented 5 years ago

@ddjukic have you tried the release of NumExpr 2.7.0?

silastittes commented 5 years ago

I did as suggested here. I still get the error when setting the variable in the shell, but avoided it by setting the environment variable inside the Python script:

import os
os.environ["NUMEXPR_MAX_THREADS"] = "272"
import allel

robbmcleod commented 5 years ago

Which version of NumExpr are you running?

silastittes commented 5 years ago

Numexpr version: 2.7.0

hermannschwaerzlerUIBK commented 3 years ago

Hi everybody,

this issue is quite old but as we are encountering it as well I want to share the workaround we found.

As far as I can tell the problem seems to be that allel "tells" numexpr to use all visible cores of the computer it's running on. If the computer has more than NUMEXPR_MAX_THREADS (which defaults to 64) cores, numexpr will raise this error. In our case we try to use allel on a machine with more than 1700 cores of which not all are usable for us (as this system is using a batch scheduler).
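The explanation above can be sketched in a few lines, assuming the thread count requested at import time equals the number of cores the OS reports. The helper name `raise_numexpr_cap` is hypothetical, and the default of 64 is simply the value quoted in this thread. The idea is to lift the cap to at least the visible core count before anything imports numexpr:

```python
import os


def raise_numexpr_cap(default_cap=64):
    """Set NUMEXPR_MAX_THREADS to at least the visible core count.

    Hypothetical helper: it must run before numexpr (or any package
    that imports it) is imported. Returns the cap now in the environment.
    """
    visible = os.cpu_count() or 1  # os.cpu_count() may return None
    current = int(os.environ.get("NUMEXPR_MAX_THREADS", default_cap))
    cap = max(current, visible)
    os.environ["NUMEXPR_MAX_THREADS"] = str(cap)
    return cap


cap = raise_numexpr_cap()
# import allel  # import only after the variable has been set
```

This only raises the limit; it does not reduce the number of threads that get requested, so on a shared machine it may still ask for more cores than the scheduler allows.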

Our workaround is using the sysconfcpus utility to "trick" allel into "thinking" there is a certain lower number of cores. E.g. to make 4 cores visible we run

sysconfcpus -n 4 python

Importing os, numexpr and allel in the resulting Python session now works without an error message.

Maybe this helps to narrow down the search for the underlying problem?

Regards, Hermann

hermannschwaerzlerUIBK commented 3 years ago

I took the time to look deeper into this and found the underlying problem and a (quick and dirty) solution:

The real problem is that during the initialisation steps of import allel, the detect_number_of_cores function of bcolz is called. This function ignores CPUSets (and any other way of limiting the cores available to a process). As CPUSets are used on our system, I applied this change to bcolz:

--- bcolz/toplevel.py.orig      2021-03-19 16:32:12.262185884 +0100
+++ bcolz/toplevel.py   2021-03-19 16:28:57.402327136 +0100
@@ -63,6 +63,9 @@

     """
     # Linux, Unix and MacOS:
+    if hasattr(os, "sched_getaffinity"):
+        ncpus = len(os.sched_getaffinity(0))
+        return ncpus
     if hasattr(os, "sysconf"):
         if "SC_NPROCESSORS_ONLN" in os.sysconf_names:
             # Linux & Unix:

Now allel can be imported without getting an error.
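For anyone who wants to check what the patched function would return on their own machine without editing bcolz, the same test can be run standalone. This is a minimal sketch, not the bcolz code itself: `usable_cpu_count` is a hypothetical name, and the fallback to `os.cpu_count()` is an assumption for platforms that lack `os.sched_getaffinity`.

```python
import os


def usable_cpu_count():
    """Return the number of cores this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects the CPU affinity mask (e.g. set by cpusets or taskset).
        return len(os.sched_getaffinity(0))
    # Assumed fallback where sched_getaffinity is unavailable.
    return os.cpu_count() or 1


n_usable = usable_cpu_count()
print("cores reported by the OS:", os.cpu_count())
print("cores usable by this process:", n_usable)
```

If the two numbers differ, a core count that ignores affinity (as described above) will be larger than what the process can actually use.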