benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

SegmentationFault in AlternatingLeastSquares on model.fit() on newest 0.70.0 version #672

Closed (svagier closed this issue 1 year ago)

svagier commented 1 year ago

Hi, this is my simplified code:

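    import numpy as np
    from implicit.cpu.als import AlternatingLeastSquares

    # `factors` (int) and `products_per_customer_csr` (sparse CSR matrix)
    # are not shown in this simplified snippet.
    regularization = 0.01
    iterations = 50
    confidence = 40  # scaling factor applied to the interaction matrix
    model = AlternatingLeastSquares(factors=factors,
                                    regularization=regularization,
                                    dtype=np.float32,
                                    iterations=iterations)
    print('Model created. Starting model.fit()')
    model.fit(confidence * products_per_customer_csr)
    print('Model fitted.')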

I am getting a segfault during model.fit():

    Model created. Starting model.fit()
      0%|          | 0/50 [00:00<?, ?it/s]Segmentation fault (core dumped)

This is all I see in the logs. I am running it on AWS CodeBuild with Linux, 145 GB memory, 72 vCPUs. There are 130 000 customers in the input to the model - I believe that with these specs it should work.

I read in PR #662 that this error occurred in previous versions of the implicit library. I updated to 0.70.0 and it still happens. Any help appreciated.
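
Since the crash happens in native code on the very first iteration, it may be worth ruling out a malformed input matrix before calling `model.fit()`. The sketch below is an assumption about what could be checked, not a confirmed cause of this segfault. `check_interactions` is a hypothetical helper written for illustration and is not part of implicit.

```python
import numpy as np
from scipy import sparse


def check_interactions(matrix):
    """Return a list of problems found in a sparse user-item matrix.

    Hypothetical helper for illustration only; an empty list means none
    of the checks below failed.
    """
    # Anything other than a SciPy sparse matrix in CSR layout is reported
    # immediately, because the remaining checks rely on CSR attributes.
    if not sparse.issparse(matrix) or matrix.format != "csr":
        return ["matrix is not in CSR format"]

    problems = []
    n_rows, n_cols = matrix.shape

    # A CSR matrix needs exactly one row pointer per row, plus one.
    if matrix.indptr.shape[0] != n_rows + 1:
        problems.append("indptr length does not match the number of rows")

    # Every stored column index must lie inside [0, n_cols).
    if matrix.indices.size and (
        matrix.indices.min() < 0 or matrix.indices.max() >= n_cols
    ):
        problems.append("column index out of bounds")

    # NaN or infinite confidence values cannot be factorized meaningfully.
    if not np.all(np.isfinite(matrix.data)):
        problems.append("data contains NaN or infinite values")

    return problems
```

With the names from the snippet above, this could be called as `check_interactions(products_per_customer_csr)` right before `model.fit(...)`. An empty result does not prove the input is fine; it only rules out these specific issues.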

benfred commented 1 year ago

@svagier Can you get the stack trace? It's a bit hard to say what's happening without it.

    gdb --args python your_script_name.py

    # type 'run' in gdb to execute the script
    # once it segfaults, type 'bt' to get a backtrace
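
If attaching gdb is awkward in a CI environment, Python's standard `faulthandler` module can at least record which Python line was executing when the process crashed. This is a minimal sketch: `enable_crash_traceback` and the log path are hypothetical names chosen for illustration. It captures only the Python-level traceback, so the gdb backtrace is still the more informative of the two.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write Python tracebacks of all threads to log_path on a fatal signal.

    Hypothetical helper for illustration. The returned file object must
    stay open for as long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    # On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, faulthandler dumps
    # the Python traceback of every thread to the given file.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `crash_log = enable_crash_traceback("segfault_traceback.log")` at the top of the script, before `model.fit(...)`, would leave the traceback in that file after a crash. The same handler can be enabled without code changes by running `python -X faulthandler your_script_name.py`, which writes to stderr instead.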

benfred commented 1 year ago

@svagier please re-open if this is still a problem for you