tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning
Apache License 2.0

Get probabilities for ALL the classes #95

Closed: michael4john closed this issue 8 years ago

michael4john commented 8 years ago

I'm looking at text_classification.py.

classifier.predict(X_test) returns the class number with the highest probability. How can I get the probabilities for all the classes for each input?

Thanks in advance!

terrytangyuan commented 8 years ago

You can use predict_proba instead of predict to get class probabilities.
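As a rough illustration of how the two calls relate, here is a minimal plain-Python sketch. It does not use skflow; the functions `softmax`, `predict_proba`, and `predict` below are hypothetical stand-ins, and the scores are made up. The only point is that the "predict" result is the index of the largest entry in each row of probabilities.

```python
import math


def softmax(logits):
    """Turn one row of raw scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(logit_rows):
    """Stand-in: one probability per class for every input row."""
    return [softmax(row) for row in logit_rows]


def predict(logit_rows):
    """Stand-in: index of the most probable class for every input row."""
    return [max(range(len(p)), key=p.__getitem__)
            for p in predict_proba(logit_rows)]


# Made-up scores for 2 inputs and 3 classes (assumption, not real model output).
logit_rows = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]]
probabilities = predict_proba(logit_rows)
labels = predict(logit_rows)
```

Each row of `probabilities` has one entry per class, so its length follows the number of classes the model was configured with.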

Thanks.

michael4john commented 8 years ago

Thanks for the prompt response, @terrytangyuan. predict_proba does work, but it returns 15 probabilities, which I think has something to do with the n_classes argument of TensorFlowEstimator. I changed it to 5 because I only have 5 categories to predict, but then the following error is raised:

Traceback (most recent call last):
  File "search_classification.py", line 88, in <module>
    classifier.fit(X_train, y_train, logdir='/tmp/tf_examples/word_rnn')
  File "/home/fz1662/skflow/estimators/base.py", line 214, in fit
    feed_params_fn=self._data_feeder.get_feed_params)
  File "/home/fz1662/skflow/trainer.py", line 143, in train
    feed_dict = feed_dict_fn()
  File "/home/fz1662/skflow/io/data_feeder.py", line 212, in _feed_dict_fn
    out.itemset((i, self.y[sample]), 1.0)
IndexError: index 5 is out of bounds for axis 1 with size 5

ilblackdragon commented 8 years ago

@michael4john If you only have 5 classes, this error indicates that your labels are 1-based. Make sure your class labels are all indices from 0 to n_classes - 1.
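To see why 1-based labels fail, here is a small sketch of one-hot encoding in plain Python. It is not the skflow data feeder; `one_hot` is a hypothetical helper that mimics the step shown in the traceback, where the label is used as a column index into a row of width n_classes.

```python
def one_hot(labels, n_classes):
    """Build one row per label with a 1.0 in the column named by the label."""
    rows = [[0.0] * n_classes for _ in labels]
    for i, label in enumerate(labels):
        # The label is used directly as a column index, so it must be
        # in the range 0 .. n_classes - 1.
        if not 0 <= label < n_classes:
            raise IndexError(
                "index %d is out of bounds for size %d" % (label, n_classes))
        rows[i][label] = 1.0
    return rows


zero_based = [0, 1, 2, 3, 4]  # valid labels for n_classes=5
one_based = [1, 2, 3, 4, 5]   # label 5 has no column when n_classes=5

encoded = one_hot(zero_based, 5)

try:
    one_hot(one_based, 5)
    error = None
except IndexError as exc:
    error = str(exc)
```

With five classes the valid columns are 0 through 4, so a label of 5 has nowhere to go, which matches the "index 5 is out of bounds" message above.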

Note that you can always use some of the skflow.preprocessing.CategoricalProcessor functionality to remap the output classes to the range 0 to 4.
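If you would rather remap the labels by hand, a minimal sketch could look like the following. This is plain Python and does not show the CategoricalProcessor API; `remap_labels` and the sample labels are hypothetical. It assigns each distinct label an index from 0 to n_classes - 1 in sorted order and keeps the mapping so predictions can be translated back.

```python
def remap_labels(labels):
    """Map arbitrary class labels onto 0 .. n_classes - 1 (sorted order)."""
    classes = sorted(set(labels))
    to_index = {label: index for index, label in enumerate(classes)}
    remapped = [to_index[label] for label in labels]
    return remapped, to_index


# Made-up 1-based labels for 5 categories (assumption for illustration).
y_train = [1, 5, 3, 2, 4, 1]
y_zero_based, to_index = remap_labels(y_train)

# Invert the mapping to turn a predicted index back into the original label.
to_label = {index: label for label, index in to_index.items()}
n_classes = len(to_index)
```

The resulting `n_classes` is the value to pass to the estimator, and `to_label` recovers the original category for any predicted index.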

michael4john commented 8 years ago

@ilblackdragon I reindexed my dataset from 0 to 4. And it works! Thank you so much!