bmabey / pyLDAvis

Python library for interactive topic model visualization. Port of the R LDAvis package.
BSD 3-Clause "New" or "Revised" License

Support for gensim's AuthorTopic model #112

Open tmthyjames opened 6 years ago

tmthyjames commented 6 years ago

Currently, when I try to run pyLDAvis on gensim's AuthorTopic model, I get the following error:

---------------------------------------------------------------------
TypeError                           Traceback (most recent call last)
<ipython-input-74-ac4a18d60a1e> in <module>()
----> 1 data = pyLDAvis.gensim.prepare(model, corpus, dictionary)
      2 data

/home/ubuntu/anaconda3/lib/python3.6/site-packages/pyLDAvis/gensim.py in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs)
    109     See `pyLDAvis.prepare` for **kwargs.
    110     """
--> 111     opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs)
    112     return vis_prepare(**opts)

/home/ubuntu/anaconda3/lib/python3.6/site-packages/pyLDAvis/gensim.py in _extract_data(topic_model, corpus, dictionary, doc_topic_dists)
     40           gamma = topic_model.inference(corpus)
     41       else:
---> 42           gamma, _ = topic_model.inference(corpus)
     43       doc_topic_dists = gamma / gamma.sum(axis=1)[:, None]
     44 

TypeError: inference() missing 3 required positional arguments: 'author2doc', 'doc2author', and 'rhot'
tmthyjames commented 6 years ago

In pyLDAvis.gensim, replacing line 42

gamma, _ = topic_model.inference(corpus)

with

doc2author = atmodel.construct_doc2author(topic_model.corpus, topic_model.author2doc)
gamma, _ = topic_model.inference(topic_model.corpus, topic_model.author2doc, doc2author, 0)

seems to work.

Integrating the new Author/Document tags into the data structure is the next step.

Will submit a PR when finished.
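For readers following the traceback: the line after the failing call, `doc_topic_dists = gamma / gamma.sum(axis=1)[:, None]`, rescales each row of `gamma` so that it sums to 1. Below is a minimal plain-Python sketch of that normalization. The helper name `normalize_rows` and the sample values are hypothetical and are not part of pyLDAvis or gensim.

```python
def normalize_rows(gamma):
    """Scale every row so it sums to 1.

    Plain-Python stand-in for the NumPy expression
    gamma / gamma.sum(axis=1)[:, None] seen in the traceback.
    """
    return [[value / sum(row) for value in row] for row in gamma]


# Hypothetical unnormalized topic weights, one row per entry returned by inference().
gamma = [[2.0, 6.0], [1.0, 1.0], [3.0, 1.0]]
doc_topic_dists = normalize_rows(gamma)

for row in doc_topic_dists:
    print(row)
```

Each printed row is a probability distribution over topics, which is the form the visualization expects for its topic distributions.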

clopezno commented 5 years ago

Thank you for your code. I tested the suggested change, and it works once the following import line is added:

 import gensim.models.atmodel as atmodel
 doc2author = atmodel.construct_doc2author(topic_model.corpus, topic_model.author2doc)
 gamma, _ = topic_model.inference(topic_model.corpus, topic_model.author2doc, doc2author, 0)
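To illustrate what the `doc2author` argument holds: it is the inverse of `author2doc`, mapping each document index to the authors of that document. The sketch below builds such a mapping in plain Python. The helper `build_doc2author` and the sample authors are hypothetical and are not gensim's implementation; in real code, keep using `atmodel.construct_doc2author` as shown above.

```python
def build_doc2author(num_docs, author2doc):
    """Invert an author -> document-ids mapping into document-id -> authors."""
    doc2author = {doc_id: [] for doc_id in range(num_docs)}
    for author, doc_ids in author2doc.items():
        for doc_id in doc_ids:
            doc2author[doc_id].append(author)
    return doc2author


# Hypothetical corpus of three documents written by two authors.
author2doc = {"alice": [0, 2], "bob": [1, 2]}
doc2author = build_doc2author(3, author2doc)
print(doc2author)
```

Document 2 ends up with both authors, which is the case an author-topic model has to handle and a plain LDA model does not.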
menyalas commented 3 years ago

I am facing the same issue and would like to try the solution described by @clopezno and @tmthyjames, but I am not sure how to modify the pyLDAvis.gensim code. Specifically, it appears to be the _extract_data function that I need to alter to work with my author-topic model. Here is my error:

TypeError                           Traceback (most recent call last)
in
      1 # Visualize the topics
      2 pyLDAvis.enable_notebook()
----> 3 vis = pyLDAvis.gensim.prepare(ATModel_24, corpus, id2word)
      4 vis

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pyLDAvis/gensim.py in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs)
    116     See `pyLDAvis.prepare` for **kwargs.
    117     """
--> 118     opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs)
    119     return vis_prepare(**opts)

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pyLDAvis/gensim.py in _extract_data(topic_model, corpus, dictionary, doc_topic_dists)
     46           gamma = topic_model.inference(corpus)
     47       else:
---> 48           gamma, _ = topic_model.inference(corpus)
     49       doc_topic_dists = gamma / gamma.sum(axis=1)[:, None]
     50       else:

TypeError: inference() missing 3 required positional arguments: 'author2doc', 'doc2author', and 'rhot'

I've tried redefining the function with the modified code but that doesn't seem to work.
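One possible reason a redefinition has no effect (an assumption, since the attempted code is not shown): a function defined in a notebook cell lives in the notebook's namespace, while `prepare` looks up `_extract_data` in its own module's globals. The sketch below demonstrates this with a hypothetical stand-in module named `fake_lib`; it does not import or modify pyLDAvis.

```python
import types

# Hypothetical stand-in for a library module such as pyLDAvis.gensim.
fake_lib = types.ModuleType("fake_lib")
exec(
    "def _extract_data(model):\n"
    "    return 'original'\n"
    "\n"
    "def prepare(model):\n"
    "    # Resolves _extract_data in this module's globals at call time.\n"
    "    return _extract_data(model)\n",
    fake_lib.__dict__,
)


# Defining a same-named function in your own namespace leaves the module untouched.
def _extract_data(model):
    return "patched"


before = fake_lib.prepare(None)

# Rebinding the attribute on the module itself changes what prepare() calls.
fake_lib._extract_data = _extract_data
after = fake_lib.prepare(None)

print(before, after)
```

By the same logic, a replacement would have to be assigned onto the library module itself (for example as an attribute of `pyLDAvis.gensim`) to be picked up by `prepare`. Treat that as a stopgap until a proper fix lands in the library.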
jonaschn commented 3 years ago

@menyalas Add the lines mentioned in my PR #161, or wait until the PR is merged.

msusol commented 3 years ago

Waiting on conflict resolution for PR #161 before I can merge it.