libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0
Default num_workers is incompatible with SLURM #357

Closed (aldakata closed this 1 month ago)

aldakata commented 7 months ago

I encountered a bug in the code. If I run

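from ffcv.loader import Loader, OrderOption

# path and PIPELINES are defined elsewhere; num_workers is left at its default
Loader(path,
       batch_size=1,
       order=OrderOption.SEQUENTIAL,
       pipelines=PIPELINES)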

I get

ValueError: The number of threads must be between 1 and 2

Why? After this request, numba documents numba.config.NUMBA_DEFAULT_NUM_THREADS as:

The number of usable CPU cores on the system (as determined by len(os.sched_getaffinity(0)), if supported by the OS, or multiprocessing.cpu_count() if not). This is the default value for numba.config.NUMBA_NUM_THREADS unless the NUMBA_NUM_THREADS environment variable is set.

BUT here, the default number of workers is set to multiprocessing.cpu_count(). On the HPC cluster I am working on, multiprocessing.cpu_count() = 64 but len(os.sched_getaffinity(0)) = 2, so the default asks for 64 threads while numba allows at most 2.
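
A possible workaround, as a minimal sketch: compute the worker count the same way the numba documentation quoted above describes, and pass it explicitly instead of relying on the default. The helper name usable_cpu_count is hypothetical (it is not part of FFCV or numba), and the Loader call in the final comment assumes the same path and PIPELINES as in the snippet above.

```python
import multiprocessing
import os


def usable_cpu_count() -> int:
    """Return the number of CPU cores this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects the CPU set granted to the process (e.g. by a job scheduler).
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity (e.g. macOS, Windows).
    return multiprocessing.cpu_count()


if __name__ == "__main__":
    print("multiprocessing.cpu_count():", multiprocessing.cpu_count())
    print("usable_cpu_count():", usable_cpu_count())
    # Hypothetical usage, not run here:
    # Loader(path, batch_size=1, num_workers=usable_cpu_count(),
    #        order=OrderOption.SEQUENTIAL, pipelines=PIPELINES)
```

On a node where the job is restricted to 2 of 64 cores, the two printed values should differ in the same way as reported above; on an unrestricted machine they are normally equal.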