Can't limit the number of CPUs, NUM_THREADS doesn't help

Hi there!

I've built a job that calculates recommendations (AlternatingLeastSquares, CPU version). On my host machine it takes ~8 minutes to finish; in my K8S cluster it takes ~100 minutes. The resources are comparable, and native extensions are used in both cases.

The only problem I can see is that the job consumes all available cores in K8S, which is where throttling kicks in (cpu_limits=20).
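To illustrate the mismatch: inside a container, `os.cpu_count()` usually reports the node's cores rather than the CPU limit, so a thread pool sized from it can exceed the quota and get throttled. Below is a minimal sketch for checking this, assuming a cgroup v2 node where the limit is exposed at `/sys/fs/cgroup/cpu.max`. The helper names `parse_cpu_max` and `effective_cpus` are hypothetical, not part of any library.

```python
import math
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>".

    Returns the allowed CPUs as a float, or None when no limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_cpus(path="/sys/fs/cgroup/cpu.max"):
    """Return a thread count that respects the container CPU limit.

    Falls back to the host core count when the file is missing or
    unreadable (for example on cgroup v1, which this sketch does not cover).
    """
    host = os.cpu_count() or 1
    try:
        limit = parse_cpu_max(Path(path).read_text())
    except (OSError, ValueError):
        limit = None
    if limit is None:
        return host
    return max(1, min(host, math.floor(limit)))


print("os.cpu_count():", os.cpu_count())
print("effective CPUs:", effective_cpus())
```

If the two printed numbers differ a lot inside the pod, a worker count derived from the host core count would explain the throttling.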

How can I restrict the job to a specific number of cores? Do I have to play with OpenMP settings or something? Setting NUM_THREADS didn't help; the job still consumes all available cores.
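For reference, OpenMP runtimes read `OMP_NUM_THREADS` rather than a bare `NUM_THREADS`, and OpenBLAS and MKL have their own variables. The sketch below sets all three before any numeric library is imported. Whether the native extension in this job honors them is an assumption: a library that passes an explicit thread count to its parallel loops will ignore the environment. The helper name `cap_native_threads` is hypothetical.

```python
import os

# Environment variables read by common native threading runtimes.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def cap_native_threads(n):
    """Set the thread-count variables to n and return what was set.

    Must run before numpy, scipy, or the recommender library is imported,
    because the runtimes typically read these values once at load time.
    """
    value = str(int(n))
    for name in THREAD_ENV_VARS:
        os.environ[name] = value
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# Match the pod's CPU limit (20 in this case).
print(cap_native_threads(20))

# Only import the numeric libraries after this point.
# If the model class exposes its own thread setting (for example a
# num_threads constructor argument), set that too; check the signature
# in the installed version rather than relying on this comment.
```

The same values can be set in the pod spec as container environment variables instead, which avoids depending on import order.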

I believe that's the reason for such an enormous slowdown.

The line that is taking so long is this:

model.recommend(user_id, user_item_data[user_id], N=10)

There are a few million users.
