Open FranValero97 opened 1 year ago
Which version of BERTopic are you currently using? Did you try installing BERTopic from the main branch? I believe there was a fix for this a while back. Also, although spaCy is supported as an embedding model, it is not something I would generally recommend. The models here are generally advised.
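To answer the version question, something like the following minimal sketch could be run in the same environment. It uses only the standard library; the helper name `installed_version` is made up for illustration, and the distribution names `bertopic` and `spacy` are assumed to match what pip installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing.

    `installed_version` is a hypothetical helper name, not part of BERTopic or spaCy.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions that matter for this issue.
for name in ("bertopic", "spacy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If the reported BERTopic version is an older release, installing from the main branch (for example with `pip install git+https://github.com/MaartenGr/BERTopic.git`, assuming that is the repository in use) would be one way to pick up the fix mentioned above.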
Good morning. The code below is taken from the following page: https://spacy.io/universe/project/bertopic After running it, I get the following error: Can't retrieve unregistered extension attribute 'trf_data'. Did you forget to call the set_extension method?
How can I solve this error?
# Install the required libraries
!pip install spacy
!pip install bertopic
!pip install scikit-learn

# Download the English spaCy model (medium)
!python -m spacy download en_core_web_md

# Load the libraries and the model
import spacy
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Load the documents from the 20 Newsgroups dataset
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']

# Load the English spaCy model (medium), excluding unneeded components
nlp = spacy.load('en_core_web_md', exclude=['tagger', 'parser', 'ner', 'attribute_ruler', 'lemmatizer'])

# Create the BERTopic model with spaCy
topic_model = BERTopic(embedding_model=nlp)
topics, probs = topic_model.fit_transform(docs)
I have tried changing the spaCy version to several releases between 3.3.0 and 3.4.0, and I still get the same error with every spaCy model I tried (sm, md, lg, trf).
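One possible workaround, not confirmed anywhere in this thread, is to compute the document vectors with spaCy directly and pass them to `fit_transform` as precomputed embeddings, so that BERTopic's spaCy backend is not used for the documents at all. The sketch below assumes a pipeline with static word vectors such as `en_core_web_md` (it would not suit `en_core_web_trf`), and the helper names `spacy_doc_vectors` and `fit_with_precomputed_embeddings` are made up for illustration.

```python
from typing import Iterable, List


def spacy_doc_vectors(nlp, docs: Iterable[str]) -> List:
    """Return one vector per document, taken from spaCy's Doc.vector.

    Hypothetical helper: `nlp` is any loaded spaCy pipeline that provides
    word vectors (assumption: en_core_web_md or en_core_web_lg).
    """
    return [doc.vector for doc in nlp.pipe(docs)]


def fit_with_precomputed_embeddings(docs, model_name="en_core_web_md"):
    """Fit BERTopic on embeddings computed outside of BERTopic (a sketch)."""
    # Heavy imports are kept local so the helper above can be used on its own.
    import numpy as np
    import spacy
    from bertopic import BERTopic

    nlp = spacy.load(
        model_name,
        exclude=["tagger", "parser", "ner", "attribute_ruler", "lemmatizer"],
    )
    # Stack the per-document vectors into an (n_docs, dim) array.
    embeddings = np.vstack(spacy_doc_vectors(nlp, docs))

    # No embedding_model is passed; the precomputed embeddings are used instead.
    topic_model = BERTopic()
    topics, probs = topic_model.fit_transform(docs, embeddings)
    return topic_model, topics, probs
```

With the `docs` list from the code above, this would be called as `topic_model, topics, probs = fit_with_precomputed_embeddings(docs)`. Whether it avoids the `trf_data` error in a given BERTopic version is an assumption that would need checking.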