Segmentation fault (fresh install)

Hi all, I'm not sure how to debug this. I have a fresh install of Anaconda with Python 3.9.13:

Python 3.9.13 (main, Aug 25 2022, 18:29:29) [Clang 12.0.0 ] :: Anaconda, Inc. on darwin

When I try to initialize a KeyBERT model, I get a segmentation fault. I'm well-versed in R but still new to Python, so I don't have a toolbox of debugging steps yet. Any thoughts about where to look first?

>>> from keybert import KeyBERT
>>> kw_model = KeyBERT()
zsh: segmentation fault  python

I've run it in a Jupyter notebook with the same result. The first time I ran it, the models were downloaded (successfully, I assume), but initializing the model still ends in this segfault.

Edited to add the faulthandler output:

(base) user@place ~ % python -q -X faulthandler
>>> from keybert import KeyBERT
>>> kw_model = KeyBERT()
Fatal Python error: Segmentation fault

Thread 0x00000002054742c0 (most recent call first):
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1568 in _load_from_state_dict
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 469 in load
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 473 in load
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 473 in load
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 475 in _load_state_dict_into_model
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2517 in _load_pretrained_model
  File "/Users/user/opt/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2326 in from_pretrained
zsh: segmentation fault  python -q -X faulthandler
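
For anyone else debugging something similar, here is a minimal sketch of running the failing snippet in a child interpreter, so the segfault doesn't take down your shell session or Jupyter kernel. The helper name `run_isolated` is made up for illustration; the `-X faulthandler` flag is the same one used above. On macOS/Linux, a negative return code means the child was killed by that signal number.

```python
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). Hypothetical helper, not part of KeyBERT.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # The snippet from this issue; swap in whatever crashes for you.
    snippet = "from keybert import KeyBERT\nkw_model = KeyBERT()\n"
    returncode, stderr = run_isolated(snippet)
    print("return code:", returncode)
    print(stderr)
```

If the child crashes, the faulthandler traceback ends up in the returned stderr text, which is easier to paste into a bug report than a dead terminal.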

I solved my issue: I reinstalled PyTorch using conda, and in my script I now import PyTorch (import torch) before importing KeyBERT.

>>> import torch
>>> from keybert import KeyBERT
>>> kw_model = KeyBERT()
>>> kw_model
<keybert._model.KeyBERT object at 0x7ff358210d00>
>>> doc = """
... Supervised learning is the machine learning task of learning a function that
...          maps an input to an output based on example input-output pairs. It infers a
...          function from labeled training data consisting of a set of training examples.
...          In supervised learning, each example is a pair consisting of an input object
...          (typically a vector) and a desired output value (also called the supervisory signal).
...          A supervised learning algorithm analyzes the training data and produces an inferred function,
...          which can be used for mapping new examples. An optimal scenario will allow for the
...          algorithm to correctly determine the class labels for unseen instances. This requires
...          the learning algorithm to generalize from the training data to unseen situations in a
...          'reasonable' way (see inductive bias).
... """
>>> keywords = kw_model.extract_keywords(doc)
>>> keywords
[('supervised', 0.6676), ('labeled', 0.4896), ('learning', 0.4813), ('training', 0.4134), ('labels', 0.3947)]
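
In script form, the workaround above might look roughly like this. It is only a sketch: `import_in_order` and `report` are hypothetical helpers, not part of KeyBERT or PyTorch, and I'm assuming the import order is what mattered on my machine. It imports the packages in a fixed order and prints their versions, which is handy to include when reporting a crash.

```python
import importlib


def import_in_order(names):
    """Import modules in the given order.

    Returns a dict mapping each name to its module, or None if not installed.
    """
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError:
            loaded[name] = None
    return loaded


def report(loaded):
    """Build one 'name: version' line per module for a bug report."""
    lines = []
    for name, module in loaded.items():
        if module is None:
            lines.append(f"{name}: not installed")
        else:
            version = getattr(module, "__version__", "unknown version")
            lines.append(f"{name}: {version}")
    return lines


if __name__ == "__main__":
    # torch first, then keybert, matching the order that worked above.
    for line in report(import_in_order(["torch", "keybert"])):
        print(line)
```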

MaartenGr / KeyBERT

Segmentation fault (fresh install) #146