benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License
3.57k stars 612 forks

Can't limit the number of CPUs, NUM_THREADS doesn't help #580

Closed zakirullin closed 2 years ago

zakirullin commented 2 years ago

Hi there!

I've built a job that calculates recommendations (AlternatingLeastSquares, CPU version). On my host machine it takes ~8 minutes to finish. In my K8S cluster it takes ~100 minutes to finish. The resources are comparable. In both cases native extensions are used.

The only problem I can see is that the job uses all available cores in K8S, so it gets throttled (cpu_limits=20).

How can I restrict the job to a specific number of cores? Do I have to play with OpenMP settings or something? Setting NUM_THREADS didn't work; the job still uses all available cores.

I believe that's the reason for such an enormous slowdown.

The line that is taking so long is this:

model.recommend(user_id, user_item_data[user_id], N=10)

There are a few million users.

zakirullin commented 2 years ago

OMP_NUM_THREADS did help 🦆
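
For anyone landing here with the same problem, below is a minimal sketch of one way to apply the fix from inside the job. It is not code from this thread or from implicit: the helper names `cgroup_cpu_limit` and `configure_threads` are hypothetical, and it assumes the container's CPU limit is exposed through the usual cgroup files under `/sys/fs/cgroup`. It also assumes that `OMP_NUM_THREADS` has to be set before the OpenMP runtime starts, which in practice usually means before `import implicit`.

```python
import os
from pathlib import Path


def cgroup_cpu_limit(root="/sys/fs/cgroup"):
    """Return the container CPU limit in whole cores, or None if no limit is found.

    Hypothetical helper: assumes the standard cgroup v2 / v1 file layout.
    """
    root = Path(root)
    try:
        # cgroup v2: cpu.max contains "<quota> <period>" or "max <period>".
        quota, period = (root / "cpu.max").read_text().split()
        if quota == "max":
            return None
        return max(1, int(quota) // int(period))
    except (OSError, ValueError):
        pass
    try:
        # cgroup v1: a quota of -1 means "no limit".
        quota = int((root / "cpu" / "cpu.cfs_quota_us").read_text())
        period = int((root / "cpu" / "cpu.cfs_period_us").read_text())
        if quota > 0 and period > 0:
            return max(1, quota // period)
    except (OSError, ValueError):
        pass
    return None


def configure_threads(limit=None):
    """Set OMP_NUM_THREADS to an explicit limit, the cgroup limit, or the core count."""
    n = limit or cgroup_cpu_limit() or os.cpu_count() or 1
    os.environ["OMP_NUM_THREADS"] = str(n)
    return n
```

Calling `configure_threads()` at the very top of the job script, before any numeric library is imported, should keep the OpenMP thread pool within the pod's CPU limit instead of the node's core count. Setting `OMP_NUM_THREADS` directly in the container's environment achieves the same thing without any code.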