snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0

Categorical_Classes Notebook Error #1110

Closed YipingNUS closed 5 years ago

YipingNUS commented 5 years ago

Hi, I'm trying the tutorial notebook for categorical classification and got stuck training the generative model.

from snorkel.learning import GenerativeModel

gen_model = GenerativeModel()
gen_model.train(L_train, cardinality=3)
train_marginals = gen_model.marginals(L_train.todense())

It seems that gen_model.train() requires a dense matrix. If I pass in a sparse matrix, as produced by the code in the notebook, it complains:

~/miniconda2/envs/snorkel/lib/python3.6/site-packages/scipy/sparse/base.py in __getattr__(self, attr)
    686             return self.getnnz()
    687         else:
--> 688             raise AttributeError(attr + " not found")
    689 
    690     def transpose(self, axes=None, copy=False):

AttributeError: _unpack_index not found

However, the line train_marginals = gen_model.marginals(L_train.todense()) works with neither a sparse nor a dense matrix. When given the dense matrix, it throws the following error instead:

~/miniconda2/envs/snorkel/lib/python3.6/site-packages/snorkel/learning/gen_learning.py in marginals(self, L, candidate_ranges, batch_size)
    446                 marginals = np.zeros(cardinality, dtype=np.float64)
    447                 # NB: class priors not currently available for categoricals
--> 448                 l_i = L[i].tocoo()
    449                 for l_index1 in range(l_i.nnz):
    450                     data_j, j = l_i.data[l_index1], l_i.col[l_index1]

AttributeError: 'matrix' object has no attribute 'tocoo'
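
For context, here is a minimal sketch of why the second traceback appears. It uses only NumPy and SciPy, not Snorkel. The toy label matrix and the as_csr helper are hypothetical and only for illustration. Calling .todense() on a SciPy sparse matrix returns a numpy.matrix, which has no .tocoo() method, so a line like L[i].tocoo() can only work on sparse input.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy label matrix: 3 candidates x 2 labeling functions, 0 = abstain.
L_sparse = sparse.csr_matrix(np.array([[1, 0], [0, 3], [2, 2]]))

# .todense() returns a numpy.matrix, which lacks the sparse conversion methods.
L_dense = L_sparse.todense()
print(type(L_dense).__name__)          # matrix
print(hasattr(L_dense, "tocoo"))       # False
print(hasattr(L_sparse[0], "tocoo"))   # True


def as_csr(L):
    """Hypothetical helper: return L as a CSR matrix, whether it arrives sparse or dense."""
    if sparse.issparse(L):
        return L.tocsr()
    return sparse.csr_matrix(np.asarray(L))


# Row access followed by .tocoo() works again once the input is sparse.
row = as_csr(L_dense)[2].tocoo()
print(row.nnz)                         # 2
```

This only illustrates the type mismatch behind the error message; it does not address the separate _unpack_index error raised during training.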

bhancock8 commented 5 years ago

This looks related to #1106 and got fixed in #1111. Sorry for the hassle; try again and it should work now!