Closed srujana-tak closed 5 years ago
Hi Srujana,
Would you be able to clarify what your question is? Are you anchoring words to your topic model? And which words do you want to see results for?
I am using anchor words like this: topic_model.fit(X, words=words, anchors=[['dog', 'cat', 'animal'], ['home', 'interior', 'furniture'], ['beauty', 'cosmetic']], anchor_strength=3)
If I fit this model with the same parameters on new data (X) that doesn't contain the word 'cosmetic', I get the error 'Anchor word not in word column labels provided to CorEx:'. I want the same topics for the new dataset, but it is hard to change the anchor words every time I fit the model.
Is it that anchor words must be present in the data we provide?
Yes, right now the code is structured such that the anchor words must be present in the data that is provided. This is to help alert the user that the words they are trying to anchor cannot be anchored.
@gregversteeg, do you think this should raise a warning instead? The error that's thrown is here, in preprocessing the anchors.
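A possible workaround on the user side is to filter the anchor lists against the vocabulary before calling fit. This is only a minimal sketch: filter_anchors is a hypothetical helper, not part of CorEx, and the words list here is made-up example data standing in for the new dataset's vocabulary.

```python
def filter_anchors(anchors, words):
    """Hypothetical helper (not part of CorEx): keep only anchor words
    that appear in the vocabulary, and drop groups that end up empty."""
    vocab = set(words)
    kept = []
    for group in anchors:
        present = [w for w in group if w in vocab]
        if present:
            kept.append(present)
    return kept


anchors = [['dog', 'cat', 'animal'],
           ['home', 'interior', 'furniture'],
           ['beauty', 'cosmetic']]

# Example vocabulary for a new dataset that lacks the word 'cosmetic'.
words = ['dog', 'cat', 'animal', 'home', 'interior', 'furniture', 'beauty']

usable = filter_anchors(anchors, words)

# The filtered list would then be passed in place of the original anchors:
# topic_model.fit(X, words=words, anchors=usable, anchor_strength=3)
```

One caveat: if every word in a group is missing, the group is dropped entirely, so the anchored topics that follow it shift up by one position.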
Yes, a warning would do the job and still produce the results with words that can be anchored. Thanks!!
Thanks for your patience @srujana-tak. I've made the update so that CorEx throws a warning instead of an error if the anchor is not in the vocabulary. Let me know if you have any further issues.
I've also updated CorEx on pip, so if you installed it via pip then you should be able to update it using pip install corextopic --upgrade
Thank you!
How to skip if anchor words not in topic and still produce results for those words available