This fixes a large performance regression in many of the CPU models.
For instance, when training on the ml100k dataset with a build compiled with Cython 3.0.0:
Without this change
DEBUG:implicit:trained model 'bpr' in 69.49404716491699
DEBUG:implicit:trained model 'lmf' in 42.24160981178284
DEBUG:implicit:trained model 'als' in 120.11413407325745
With this change
DEBUG:implicit:trained model 'bpr' in 0.18895816802978516
DEBUG:implicit:trained model 'lmf' in 0.06523633003234863
DEBUG:implicit:trained model 'als' in 1.8809657096862793
Fixes #678