ContinuumIO / topik

A Topic Modeling toolbox
BSD 3-Clause "New" or "Revised" License
92 stars 24 forks source link

pyLDAvis ValidationError: Not all rows (distributions) in doc_topic_dists sum to 1 #80

Open imranshaikmuma opened 7 years ago

imranshaikmuma commented 7 years ago

i am getting the below error when trying to visualize HDP model trained on gensim

**_--------------------------------------------------------------------------- ValidationError Traceback (most recent call last)

in () ----> 1 vis_data_hdp = gensimvis.prepare(hdpmodel, corpus, dictionary) 2 #pyLDAvis.display(vis_data_hdp) C:\Anaconda2\lib\site-packages\pyLDAvis\gensim.pyc in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs) 110 """ 111 opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs) --> 112 return vis_prepare(**opts) C:\Anaconda2\lib\site-packages\pyLDAvis\_prepare.pyc in prepare(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency, R, lambda_step, mds, n_jobs, plot_opts, sort_topics) 372 doc_lengths = _series_with_name(doc_lengths, 'doc_length') 373 vocab = _series_with_name(vocab, 'vocab') --> 374 _input_validate(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency) 375 R = min(R, len(vocab)) 376 C:\Anaconda2\lib\site-packages\pyLDAvis\_prepare.pyc in _input_validate(*args) 63 res = _input_check(*args) 64 if res: ---> 65 raise ValidationError('\n' + '\n'.join([' * ' + s for s in res])) 66 67 ValidationError: * Not all rows (distributions) in doc_topic_dists sum to 1._** To train hdp model i have used the following syntax: hdpmodel = models.hdpmodel.HdpModel(corpus, dictionary) corpus looks like this: [[(0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 2), (12, 1), (13, 2), (14, 1), (15, 1), (16, 1), (17, 1), (18, 1), (19, 1), (20, 1), (21, 1), (22, 1), (23, 1), (24, 1), (25, 1), (26, 1), (27, 1), (28, 1), (29, 1), (30, 1), (31, 1), (32, 1), (33, 4), (34, 1), (35, 1), (36, 1), (37, 1), (38, 1), (39, 2), (40, 1), (41, 2), (42, 1), (43, 2), (44, 1), (45, 1), (46, 1), (47, 3), (48, 1), (49, 1), (50, 2), (51, 1), (52, 1), (53, 1), (54, 1), (55, 1), (56, 1), (57, 1), (58, 1), (59, 1), (60, 1), (61, 1), (62, 1), (63, 1), (64, 1), (65, 1)] dictionary looks like this: [u'', u'dacteur', u'reallocations', u'advcompliance', u'resolveboth............
ned2 commented 6 years ago

I can confirm this also.

a087861 commented 6 years ago

double confirmed