MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License
6.1k stars, 761 forks

AttributeError: 'BERTopic' object has no attribute 'diversity' #411

Closed. SankulR closed this issue 2 years ago

SankulR commented 2 years ago

I am getting this error when trying to fit an updated list of documents to the pre-trained model. Can you please help me resolve it? :)

/usr/local/lib/python3.7/dist-packages/bertopic/_bertopic.py in _extract_words_per_topic(self, words, c_tf_idf, labels)
   1650         # Extract word embeddings for the top 30 words per topic and compare it
   1651         # with the topic embedding to keep only the words most similar to the topic embedding
-> 1652         if self.diversity is not None:
   1653             if self.embedding_model is not None:
   1654

AttributeError: 'BERTopic' object has no attribute 'diversity'

MaartenGr commented 2 years ago

Hmmm, that is a strange error! Could you share the entire code for training and updating the list of documents? Also, which BERTopic version are you currently using?

SankulR commented 2 years ago

Hello Maarten! Here is the list of documents:

    temp_list = [
        "When working as a team, we can make a big impact for those who need it the most. ",
        "As many of us were sidelined by COVID-19, we saw the real heroes jump to action our essential healthcare workers on the frontline. ",
        "Find out how Airbus is supporting airlines and cargo carriers in the unprecedented logistical challenge posed by the transportation of COVID19 vaccines - at a time when aviation is set to play a major role in this life saving mission.",
        "2020 may have been challenging, but it didn't stop us from supporting efforts against COVID19, whilst making flying a safe experience for all. We'll be releasing two of our highlights every weekend of December, inviting you to reflect on what each one means to us. Airbus2020",
        "Thanks to 3D-printed parts, Airbus Humanity Lab is helping to transform Easybreath masks into protective gear for healthcare professionals and ventilation masks for patients during the covid19 pandemic. Read more:",
        "At Airbus, we are doing everything we can to keep our employees safe while we keep fighting against the COVID19 pandemic.Watch the message from Airbus CEO .",
        "Digitization is already changing the economy immensely now the corona pandemic is forcing many companies to move even faster. Digitization expert Prof. Irene Bertschek explains which digital solutions will be long-lasting. WeNotMe",
        "The pandemic should above all be a wakeup call that our wellbeing is closely tied to the health of the planet, writes , Spokesperson of our Sustainability Council, . WeNotMe",
        "RT : Coronavirus pandemic is challenging societies worldwide, putting health into focus for all of us. Thus, we are in close",
        "Infectious Disease specialist Dr. Aronoff-Spencer and other experts from the fields of pulmonology and radiology share their approach to the COVID19 pandemic. Watch the free webinar.",
        "How does COVID19 affect the future of radiology? And how can the latest innovations help to deal with the impact of the pandemic? ECR2020",
        "Why is it that some supplychains didn t fail as badly as others during the pandemic? Find out here:",
        "ASCO2020 While we serve the world s needs for testing tools in the COVID-19 pandemic, QIAGEN also continues to deliver cutting-edge molecular solutions for cancer research and improving patient outcomes.Thierry Bernard, CEO at QIAGEN",
        "RT : Thank you for highlighting the coronavirus test-making facility in Germantown, MD. We appreciate the sacrifices",
        "Webinar on SARS-CoV-2 research: Hear from expert virologists about their coronavirus research using QIAGEN Digital Insights. Sign up now:",
        "A message from our production team in Hilden, Germany: We are here for you! Wir sind f r euch da! wirgegenCorona Please be there for them. StayHome and FlattenTheCurve. FightCOVID19 StrongerTogether",
        "COVID19 coronavirus We are seeking fast FDA approval for our panel: Our objective was to bring the test as soon as possible to the US hospitals, to the US potential patients, because, we are in an emergency. Thierry Bernard, Interim CEO at QIAGEN",
    ]

Also, here is the code for training:

    topic_model = BERTopic.load("/content/drive/MyDrive/2021-11-30-ReducedTopicModel")
    topics, probs = topic_model.fit(temp_list)

Lastly, I am using BERTopic version 0.9.4.

MaartenGr commented 2 years ago

Ah, it seems that you are trying to update a fitted model with new documents. This is not supported in BERTopic, but if you are interested in finding the topics and probabilities for temp_list, you can use .transform(temp_list) instead. .fit trains a BERTopic model entirely from scratch, regardless of whether it was pre-trained.
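For illustration, here is a minimal sketch of the .transform route. The helper names assign_topics and load_and_assign are made up for this example and are not part of BERTopic. It assumes the saved model loads cleanly with the installed BERTopic version.

```python
from typing import Any, List, Sequence, Tuple


def assign_topics(topic_model: Any, docs: Sequence[str]) -> Tuple[List[int], Any]:
    """Predict topics for new documents without retraining the model.

    `topic_model` is expected to be a fitted BERTopic instance (or any
    object exposing a compatible .transform method).
    """
    # .transform() assigns the documents to the topics the model already has.
    # .fit() would instead train a new model from scratch on `docs`.
    topics, probs = topic_model.transform(list(docs))
    return list(topics), probs


def load_and_assign(model_path: str, docs: Sequence[str]) -> Tuple[List[int], Any]:
    """Hypothetical convenience wrapper: load a saved model, then predict."""
    # Imported lazily so that assign_topics() has no hard dependency on bertopic.
    from bertopic import BERTopic

    topic_model = BERTopic.load(model_path)
    return assign_topics(topic_model, docs)
```

With the model path and list from this thread, that would be called as topics, probs = load_and_assign("/content/drive/MyDrive/2021-11-30-ReducedTopicModel", temp_list). The existing topics stay as they are. The new documents are only mapped onto them.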