gregversteeg / corex_topic

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Apache License 2.0

Anchoring error: 'numpy.ndarray' object has no attribute 'A1' #25

Closed: toth12 closed this issue 5 years ago

toth12 commented 5 years ago

topic_model = ct.Corex(n_hidden=33, max_iter=200, verbose=False, seed=2)
topic_model.fit(doc_word, words=features_list, anchors=[['pope', 'king']], anchor_strength=3)

Output: *** AttributeError: 'numpy.ndarray' object has no attribute 'A1'

The very same model and data work fine without anchoring.

ryanjgallagher commented 5 years ago

@gregversteeg Can you take a look at this? It gets to this line of code because anchors are used, but I don't see how they could be causing the issue.

gregversteeg commented 5 years ago

That's very subtle! The problem is that X is being passed as a numpy array. However, X has to be either a scipy sparse matrix OR a numpy matrix. A numpy matrix is different from a numpy array. Most notably, a numpy matrix has an A1 attribute that flattens it into a 1-D array. If you cast your input with "X = np.matrix(X)" it should work.

It'd be good to have a way around this inconvenience, but I can't think of one that isn't complicated.
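
For anyone landing here later, the difference can be reproduced with NumPy alone. This is a minimal sketch rather than the CorEx code path itself: the small `doc_word` array below is made-up data, and the column sum only stands in for whatever operation inside the library ends up calling `.A1`.

```python
import numpy as np

# Hypothetical toy document-word count matrix (3 documents, 4 words).
doc_word = np.array([[1, 0, 2, 0],
                     [0, 1, 0, 1],
                     [3, 0, 0, 1]])

# A plain ndarray has no A1 attribute, which is what triggers the AttributeError.
print(hasattr(doc_word, "A1"))

# Casting to np.matrix, as suggested above, provides A1.
X = np.matrix(doc_word)
print(hasattr(X, "A1"))

# Reductions on a matrix stay 2-D; A1 flattens the result to a 1-D ndarray.
col_sums = X.sum(axis=0)   # 2-D matrix with one row
flat = col_sums.A1         # 1-D ndarray of per-word totals
print(col_sums.shape, flat.shape)
```

The cast has to happen before the data is passed to `fit`, so in the original report that would mean calling `np.matrix(doc_word)` first.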

toth12 commented 5 years ago

@gregversteeg @ryanjgallagher thanks, it did resolve the problem!