lluisgomez / TextTopicNet

Self-supervised learning of visual features through embedding images into text topic spaces

LDA model #4

Closed ybh16 closed 2 years ago

ybh16 commented 6 years ago

When I trained the LDA model myself, I found that it was larger than the one provided on the website. When I used my model in multi_modal_retrieval.py, I got "IndexError: index 116494 is out of bounds for axis 1 with size 100000". Why?

yash0307 commented 6 years ago

Can you provide more specific details, such as which line of code raises this error?

My guess is that you are using a different dictionary to obtain the BoW representations of the text documents. The usual idea is to ignore very infrequent words (which is what we did when generating the dictionary).
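To illustrate the idea of ignoring infrequent words, here is a minimal pure-Python sketch of building a pruned vocabulary and mapping a document to BoW pairs. It is not the code from generate_train_dict.py; the helper names (build_vocab, to_bow) and the thresholds (min_doc_freq, max_size) are assumptions chosen for illustration only.

```python
from collections import Counter


def build_vocab(tokenized_docs, min_doc_freq=2, max_size=100000):
    """Map each kept word to an integer id (hypothetical helper)."""
    # Count the number of documents each word appears in.
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
    # Drop words seen in too few documents, most frequent first,
    # then cap the vocabulary so ids always stay below max_size.
    kept = [w for w, df in doc_freq.most_common() if df >= min_doc_freq]
    kept = kept[:max_size]
    return {word: idx for idx, word in enumerate(kept)}


def to_bow(tokens, vocab):
    """Return sorted (token_id, count) pairs; unknown words are skipped."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    return sorted(counts.items())


docs = [["cat", "sat", "mat"], ["cat", "ran"], ["dog", "sat", "cat"]]
vocab = build_vocab(docs, min_doc_freq=2, max_size=100000)
bow = to_bow(["cat", "cat", "zebra", "sat"], vocab)
```

Because the vocabulary is capped at max_size, no token id can reach the size of the model's word axis, provided the model was trained with that same vocabulary.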

irazakharchenko commented 6 years ago

Hi! I have the same error. It occurs on this line: lda_vector = ldamodel.get_document_topics(bow_vector, minimum_probability=None)
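One way to confirm a dictionary mismatch before the failing call is to look for token ids that the model cannot index. The sketch below is pure Python and uses made-up BoW data; the helper name out_of_range_ids is hypothetical, and the value 100000 is taken from the error message above. With gensim, the model's vocabulary size should be available as ldamodel.num_terms.

```python
def out_of_range_ids(bow_vector, num_terms):
    """Return token ids in a BoW vector that are >= the model's vocabulary size."""
    return [token_id for token_id, _count in bow_vector if token_id >= num_terms]


# Assumed example data: (token_id, count) pairs from a larger dictionary.
bow_vector = [(12, 2), (4071, 1), (116494, 1)]
num_terms = 100000  # size of axis 1 reported in the IndexError

bad_ids = out_of_range_ids(bow_vector, num_terms)
if bad_ids:
    print("Token ids outside the model vocabulary:", bad_ids)
```

A non-empty result suggests that the BoW vector was built with a different dictionary than the model. Dropping the offending ids would only hide the crash, since the remaining ids would likely still refer to different words than the model expects.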

yash0307 commented 6 years ago

@irazakharchenko, can you confirm that the dictionary of words provided in the repository was used to generate the bag-of-words vector (bow_vector)?

irazakharchenko commented 6 years ago

@yash0307, no, I created it myself using generate_train_dict.py.