benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

BLAS : program is terminated. because you tried to allocate too many memory regions. #689

Closed zhuqunyan closed 1 year ago

zhuqunyan commented 1 year ago

I tried setting the following, but the error still occurs:

    os.environ['OPENBLAS_NUM_THREADS'] = '1'
    os.environ['GOTO_NUM_THREADS'] = '1'
    os.environ['OMP_NUM_THREADS'] = '1'

This is how I fixed the error:

    from pprint import pprint
    from threadpoolctl import threadpool_info, threadpool_limits

    pprint(threadpool_info())
    threadpool_limits(limits=1, user_api='blas')
    threadpool_limits(limits=1, user_api='openmp')

zhuqunyan commented 1 year ago

    import os
    from pprint import pprint
    from threadpoolctl import threadpool_info, threadpool_limits

    os.environ['OPENBLAS_NUM_THREADS'] = '1'
    os.environ['GOTO_NUM_THREADS'] = '1'
    os.environ['OMP_NUM_THREADS'] = '1'

    print('before threadpool_limits:')
    pprint(threadpool_info())

    threadpool_limits(limits=1, user_api='blas')
    threadpool_limits(limits=1, user_api='openmp')
    print('after threadpool_limits:')
    pprint(threadpool_info())

The following is the stdout:

    before threadpool_limits:
    [{'architecture': 'Haswell',
      'filepath': '/home/pai/lib/python3.6/site-packages/numpy/.libs/libopenblasp-r0-34a18dc3.3.7.so',
      'internal_api': 'openblas',
      'num_threads': 64,
      'prefix': 'libopenblas',
      'threading_layer': 'pthreads',
      'user_api': 'blas',
      'version': '0.3.7'},
     {'filepath': '/home/pai/lib/libgomp.so.1.0.0',
      'internal_api': 'openmp',
      'num_threads': 104,
      'prefix': 'libgomp',
      'user_api': 'openmp',
      'version': None},
     {'architecture': 'Haswell',
      'filepath': '/home/pai/lib/python3.6/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so',
      'internal_api': 'openblas',
      'num_threads': 64,
      'prefix': 'libopenblas',
      'threading_layer': 'pthreads',
      'user_api': 'blas',
      'version': '0.3.9'}]
    after threadpool_limits:
    [{'architecture': 'Haswell',
      'filepath': '/home/pai/lib/python3.6/site-packages/numpy/.libs/libopenblasp-r0-34a18dc3.3.7.so',
      'internal_api': 'openblas',
      'num_threads': 1,
      'prefix': 'libopenblas',
      'threading_layer': 'pthreads',
      'user_api': 'blas',
      'version': '0.3.7'},
     {'filepath': '/home/pai/lib/libgomp.so.1.0.0',
      'internal_api': 'openmp',
      'num_threads': 1,
      'prefix': 'libgomp',
      'user_api': 'openmp',
      'version': None},
     {'architecture': 'Haswell',
      'filepath': '/home/pai/lib/python3.6/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so',
      'internal_api': 'openblas',
      'num_threads': 1,
      'prefix': 'libopenblas',
      'threading_layer': 'pthreads',
      'user_api': 'blas',
      'version': '0.3.9'}]

benfred commented 1 year ago

Doing os.environ['OPENBLAS_NUM_THREADS'] = '1' won't work if you've imported any code that uses numpy before setting it (the environment variable needs to be set when the BLAS library is imported, not just before the code that uses it runs). Can you try setting it as an environment variable before running, to see if this also fixes the problem?
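To illustrate the ordering issue, here is a minimal sketch that sets the variable at the very top of a script, before anything pulls in numpy. The helper name set_openblas_threads and the sys.modules check are assumptions made for this example; they are not part of implicit or numpy. Only OPENBLAS_NUM_THREADS is set, since OMP_NUM_THREADS would also limit OpenMP.

```python
import os
import sys
import warnings


def set_openblas_threads(num_threads=1):
    """Set OPENBLAS_NUM_THREADS; return False if it is probably too late."""
    # Heuristic (an assumption of this sketch): if numpy or scipy is already
    # in sys.modules, their bundled OpenBLAS has most likely been loaded and
    # will not re-read the environment variable.
    too_late = any(name in sys.modules for name in ("numpy", "scipy"))
    if too_late:
        warnings.warn(
            "numpy/scipy already imported; OPENBLAS_NUM_THREADS may be ignored"
        )
    os.environ["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return not too_late


set_openblas_threads(1)
# Import numpy, scipy, and implicit only after this point.
```

Setting the variable in the shell before launching Python avoids the ordering problem altogether, because it is already in place when the interpreter starts.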

The threadpoolctl library looks neat! thanks for sharing.

However, implicit uses OpenMP for multithreading, and doing threadpool_limits(limits=1, user_api='openmp') might limit implicit to running on a single thread. Setting the BLAS limits with threadpoolctl should also work, though.
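A minimal sketch of limiting only the BLAS pools, leaving OpenMP untouched, could look like the following. The helper single_threaded_blas is a hypothetical name for this example, and the fallback to a no-op context when threadpoolctl is not installed is an assumption of the sketch, not something implicit does.

```python
from contextlib import nullcontext

try:
    from threadpoolctl import threadpool_limits
except ImportError:  # treated as an optional dependency in this sketch
    threadpool_limits = None


def single_threaded_blas():
    """Return a context manager that caps BLAS pools at one thread.

    OpenMP pools are deliberately not limited.
    """
    if threadpool_limits is None:
        return nullcontext()
    return threadpool_limits(limits=1, user_api="blas")


with single_threaded_blas():
    # Train the model inside this block.
    pass
```

Using threadpool_limits as a context manager restores the previous thread counts on exit, so the limit applies only to the code inside the block.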

benfred commented 1 year ago

I've switched over to using threadpoolctl to detect threadpool usage in BLAS, instead of relying on the environment variables (#692).

Let me know if you're still having any problems here.
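For readers curious what detecting BLAS threadpool usage can look like, here is a rough sketch that filters the kind of records threadpool_info() returns. It is not the code from #692; the function name multithreaded_blas_pools is hypothetical, and the sample records are abbreviated from the output posted above.

```python
def multithreaded_blas_pools(pools):
    """Return the BLAS pool records that use more than one thread."""
    return [
        pool
        for pool in pools
        if pool.get("user_api") == "blas" and pool.get("num_threads", 1) > 1
    ]


# Abbreviated records in the shape returned by threadpoolctl.threadpool_info().
sample = [
    {"user_api": "blas", "internal_api": "openblas",
     "prefix": "libopenblas", "num_threads": 64},
    {"user_api": "openmp", "internal_api": "openmp",
     "prefix": "libgomp", "num_threads": 104},
]

flagged = multithreaded_blas_pools(sample)
for pool in flagged:
    print(f"{pool['prefix']} is using {pool['num_threads']} threads")
# prints "libopenblas is using 64 threads"
```

In a real session the list would come from threadpool_info() rather than a hard-coded sample. The OpenMP entry is ignored because OpenMP threads are needed for multithreaded training.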