dennybritz / cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow
Apache License 2.0

how to test the trained module? #8

Closed haridatascientist closed 8 years ago

haridatascientist commented 8 years ago

I trained on the movie review training set using this code and got the trained files under "runs/1458022294/summaries/train". How can I test the model? Is there a Python API for testing it?

areebak commented 8 years ago

@JamesRedfield / @AAMIBhavya / @dennybritz: do you have sample code for loading a single sentence of your own to detect its sentiment? I'm running this for a class and it would be a huge help to see it. Thanks.

aigujin commented 8 years ago

Hi, thanks for the code. Just one question: how can I modify eval.py (if it's possible at all) to get a precision/recall table?

dennybritz commented 8 years ago

@aigujin Easiest way is probably to use http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html with the predictions.
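
For example, a minimal sketch, assuming y_test holds the one-hot labels loaded in eval.py and all_predictions the predicted class indices produced by the session run:

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Convert one-hot label rows to class indices before scoring
y_true = np.argmax(y_test, axis=1)
precision, recall, f1, support = precision_recall_fscore_support(y_true, all_predictions)
for i, (p, r, f, s) in enumerate(zip(precision, recall, f1, support)):
    print("class %d: precision=%.3f recall=%.3f f1=%.3f support=%d" % (i, p, r, f, s))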

aigujin commented 8 years ago

@dennybritz Hi, thanks for the suggestion; it worked. Now, I would like to clarify a couple of issues:

  1. If I have out-of-sample sentences whose classes I want to predict, I ignore the variable y_test, right? (I don't know the classes and want the model to classify them.)
  2. In the same setup (out-of-sample), do I need to build a new vocabulary, or can I use the one from the model?
dennybritz commented 8 years ago

@aigujin

  1. Yes
  2. You must use the same vocabulary you used during training because of the learned word mappings; don't build a new one. If your examples are very different from the sentences used in training, it may be better to re-train. (A sketch of reusing the training vocabulary follows below.)
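
A minimal sketch of reusing the training-time mapping. Here `vocabulary` is assumed to be the word-to-index dict built during training, unknown words are mapped to the padding token, and all names are illustrative:

def sentences_to_indices(sentences, vocabulary, max_length, pad_token="<PAD/>"):
    # Map each word to its training-time index; unseen words fall back to padding
    pad_idx = vocabulary[pad_token]
    indexed = []
    for sent in sentences:
        ids = [vocabulary.get(word, pad_idx) for word in sent[:max_length]]
        ids += [pad_idx] * (max_length - len(ids))
        indexed.append(ids)
    return indexed
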
fmaglia commented 8 years ago

Hi @dennybritz, I have a question for you! I modified the code to train and evaluate on the same data in every run, because I would like to analyze how the classification accuracy on the dev set changes. I also modified eval.py, but its result differs from the dev accuracy reported by the train script: eval.py reports better accuracy than train.py. I didn't modify the TensorFlow code, only the operations in data_helpers.py and the data initialization in train.py and eval.py. Do you have any idea what is happening?

dennybritz commented 8 years ago

@fmaglia The eval.py file loads the training data by default. It's expected to get better accuracy on the training data than on dev/test data (obviously, since you're training on it). If you have a test set you should load that in eval.py.
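
For instance, a sketch of pointing eval.py at held-out data, assuming a variant of load_data_and_labels that accepts file paths and numpy imported as np; the file names are placeholders:

# In eval.py, load held-out test files instead of the training data
x_raw, y_test = data_helpers.load_data_and_labels("./data/test.pos", "./data/test.neg")
y_test = np.argmax(y_test, axis=1)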

fmaglia commented 8 years ago

I split the initial dataset into a training set (67% of the dataset) and a dev/test set (33% of the dataset). In train.py I used the training set for training and the test set for evaluation; in eval.py I used only the test set. The datasets used are the same, but the results obtained on the dev/test set are not!

dennybritz commented 8 years ago

That's very strange, it should be the same if you're using the same test set.

fmaglia commented 8 years ago

I found the error now. Thanks so much!

aigujin commented 8 years ago

@fmaglia Hi, how did you solve the error from this comment? It seems I am getting the same one. Thanks

fmaglia commented 8 years ago

@aigujin In that case I was trying to evaluate a test set containing words that weren't all in the vocabulary. I solved it by changing data_helpers.py: I merged the test data into the training set and then split the dataset into a training part and a testing part. But I ran into a problem: the script loads all positive sentences first, then all negative ones, so the last sentences (the ones used for the test set) were all negative!

I solved it with this code (in data_helpers.py):

# Generate labels
positive_labels = [[0, 1] for _ in positive_examples]
negative_labels = [[1, 0] for _ in negative_examples]
labels = []
tot_labels = len(positive_examples) + len(negative_examples)
for j in range(tot_labels):
    if j % 2 == 0:
        labels.append([0, 1])
    else:
        labels.append([1, 0])

# Reorder the dataset so positive and negative sentences alternate
x_text = []
totale = len(positive_examples) + len(negative_examples)
for i in range(totale):
    if i % 2 == 0:
        x_text.append(positive_examples.pop(0))
    else:
        x_text.append(negative_examples.pop(0))

P.S. my dataset is balanced.
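
An alternative sketch that avoids the manual interleaving: shuffle once with a fixed seed before splitting, so both classes stay mixed in the train and test parts (variable names as above):

import numpy as np

# x_text and labels as built in data_helpers.py
shuffle_indices = np.random.RandomState(10).permutation(len(x_text))
x_shuffled = [x_text[i] for i in shuffle_indices]
y_shuffled = [labels[i] for i in shuffle_indices]
# Any tail slice taken for the test set now contains a mix of both classes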

rajmiglani commented 8 years ago

Hey, I want to classify Q & A data with 45 classes. Can I use this model? I have already pre-processed the data into one-hot encodings (vectors of size 45), as was done in this post. What else do I have to change in the code? Training ran without any errors; my vocabulary size is only 550 words. I have also changed the batch-size and epoch parameters, since my dataset is pretty small: only 356 questions across 45 classes.

Also, how do I evaluate a single question's class? I tried running eval.py after changing the data file (in data_helpers.py) to a file with only one question and printing all_predictions, but I am getting errors similar to @fmaglia's. Can someone be more specific about how to proceed? Also, what train/dev split should I keep? At present it is 346/10 (I know it's a small dataset, but I want to check whether it works). Kindly reply as soon as possible. @dennybritz @JamesRedfield

Two errors:

1. An InvalidArgumentError raised from the sess.run call in eval.py:

---> 54 batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})

(The full formatted traceback is in my follow-up comment below.)

2. The same error as @fmaglia.

dennybritz commented 8 years ago

I want to classify Q & A with 45 classes. Can I use this model?

Yes, it should be the same.

I can't tell much from the error you posted. It may help to post properly formatted code/errors and a diff with your changes. Change the num_classes argument when instantiating the TextCNN during training (see train.py), and make sure your data preprocessing is right. Make sure training works first before worrying about eval.

rajmiglani commented 8 years ago

Hey, training is working fine but the error occurs in eval.py. I am posting my data_helpers.py function (I changed only the function that processes the data) along with part of my dataset below; kindly help me resolve the errors as soon as possible. The only change on my side was pointing data_helpers.py at a file with just one question. The vocabulary size returned while running eval.py is just the number of words in the new test file (7 in my case, but these words were present while training the model on my own dataset, whose vocabulary size is 508). Am I doing something wrong here? I have not changed eval.py at all, only data_helpers.py to load my own data as mentioned. I don't understand Denny's comment: "You must have the same vocabulary during training and test. In eval, you need to build the full vocabulary from the training data, and then map your test sentences into that vocabulary."

@fmaglia @dennybritz @JamesRedfield

errors:

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-25-e4c837cef39c> in <module>()
     56         for x_test_batch in batches:
     57             print()
---> 58             batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
     59             all_predictions = np.concatenate([all_predictions, batch_predictions])
     60 

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict)
    313         `Tensor` that doesn't exist.
    314     """
--> 315     return self._run(None, fetches, feed_dict)
    316 
    317   def partial_run(self, handle, fetches, feed_dict=None):

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict)
    509     # Run request and get response.
    510     results = self._do_run(handle, target_list, unique_fetches,
--> 511                            feed_dict_string)
    512 
    513     # User may have fetched the same tensor multiple times, but we

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict)
    562     if handle is None:
    563       return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
--> 564                            target_list)
    565     else:
    566       return self._do_call(_prun_fn, self._session, handle, feed_dict,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
    584         # pylint: disable=protected-access
    585         raise errors._make_specific_exception(node_def, op, error_message,
--> 586                                               e.code)
    587         # pylint: enable=protected-access
    588       six.reraise(e_type, e_value, e_traceback)

InvalidArgumentError: computed output size would be negative
     [[Node: conv-maxpool-3/pool = MaxPool[ksize=[1, 21, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-3/relu)]]
Caused by op u'conv-maxpool-3/pool', defined at:
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/__main__.py", line 3, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python2.7/dist-packages/traitlets/config/application.py", line 589, in launch_instance
    app.start()
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelapp.py", line 405, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/ioloop.py", line 162, in start
    super(ZMQIOLoop, self).start()
  File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 883, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 260, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 212, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 370, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/ipkernel.py", line 175, in do_execute
    shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2723, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2825, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-25-e4c837cef39c>", line 37, in <module>
    saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1285, in import_meta_graph
    return _import_meta_graph_def(_read_meta_graph_file(meta_graph_or_file))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1220, in _import_meta_graph_def
    importer.import_graph_def(meta_graph_def.graph_def, name="")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 238, in import_graph_def
    compute_shapes=False, compute_device=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2040, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1087, in __init__
    self._traceback = _extract_stack()

One more thing @dennybritz: if I try to test it on another question that has no words from the trained vocabulary, will it give me errors while running, or will it just reduce the accuracy? (Right now I am just testing with the same words.)

My data_helpers.py function:

def load_data_and_labels():
    """
    Loads data from a file in which blocks of sentences are separated by
    "----" lines, splits the sentences into words, and generates one-hot
    labels with 45 classes, e.g. [1, 0, 0, ..., 0].
    Returns the split sentences, the label matrix, and the example count.
    """
    # Load data (data_helpers.py already imports numpy as np)
    x_texti = list(open("./data/file.txt", "r").readlines())
    x_texti = [s.strip() for s in x_texti]
    length = len(x_texti)

    # lab holds the running one-hot row for the current class
    class_no = 0
    lab = np.array([np.zeros(45)])
    lab[0][class_no] = 1
    label = lab.copy()  # copy so later mutations of lab don't alias the first row
    count = 1

    for i in range(1, length):
        if x_texti[i] != "----":
            # Same class: append another copy of the current one-hot row
            label = np.concatenate((label, lab), 0)
            count = count + 1
        else:
            # "----" marks the start of a new class: advance the one-hot index
            lab[0][class_no] = 0
            class_no = class_no + 1
            lab[0][class_no] = 1

    # Clean and split sentences by words, skipping the "----" separators
    x_text = [clean_str(sent) for sent in x_texti if sent != "----"]
    x_text = [s.split(" ") for s in x_text]

    return [x_text, label, count]

Data-set:

Does the Natural Language Classifier train using all of the data, or does it partition it in some way?
What kind of preprocessing does the classifier perform on its input?
Does the classifier train using all of the data, or does it partition it in some way?
does the classifier train on all the data or does it hold some out?
does nlclassifier perform random sampling to the training datasets?
----
Where can I find documentation on the Natural Language Classifier API?
Where can I find the API documentation?
What is the API for the Classifier?
Where can I find documentation on the NL classifier API?
Where can I find REST API documentation for the NL classifier?
Where can I view documentation for the classifier API?
How do I access the classifier API?

Here "----" implies new class.

Thanks in advance.

dennybritz commented 8 years ago

One more thing @dennybritz: if I try to test it on another question that has no words from the trained vocabulary, will it give me errors while running, or will it just reduce the accuracy? (Right now I am just testing with the same words.)

With the current code I believe it would throw an error. Here's how you usually deal with this:

  1. Decide on a vocabulary size, e.g. 10k words.
  2. During data loading (in data_helpers.py), only keep the 10k most common words. Replace all other words with a special OOV (out-of-vocabulary) token.
  3. During training you'll learn a representation for your 10k words and for the OOV "word".
  4. During testing, do the same thing and replace all unknown words with the OOV token. That will decrease your accuracy somewhat, but it won't throw an error (see the sketch below).
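
A minimal sketch of steps 2 and 4; the names and the 10k cutoff are illustrative:

from collections import Counter

def build_limited_vocab(tokenized_sentences, vocab_size=10000, oov_token="<OOV>"):
    # Keep only the most frequent words; everything else becomes the OOV token
    counts = Counter(word for sent in tokenized_sentences for word in sent)
    vocab = set(word for word, _ in counts.most_common(vocab_size - 1))
    vocab.add(oov_token)
    return vocab

def replace_oov(sentence, vocab, oov_token="<OOV>"):
    # Apply the exact same replacement at training time and at test time
    return [word if word in vocab else oov_token for word in sentence]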

If that's too much manual work, check out the VocabularyProcessor that comes with TensorFlow/skflow; it handles that for you. You can find an example here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/text_classification.py
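
A sketch of that approach, assuming tf.contrib.learn is available (as in TensorFlow versions of that era) and x_text / test_text are lists of raw sentence strings:

import numpy as np
from tensorflow.contrib import learn

# Fit the vocabulary on the training text once...
max_document_length = max(len(x.split(" ")) for x in x_text)
vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
x_train = np.array(list(vocab_processor.fit_transform(x_text)))

# ...and reuse the same processor to index the test text consistently
x_test = np.array(list(vocab_processor.transform(test_text)))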

rajmiglani commented 8 years ago

Hey @dennybritz, thanks for taking so many pains. I finally have a working model. One last query: how can we use a word2vec model with this?

Thanks & Regards, Raj Miglani.

zhang-jian commented 8 years ago

To be more precise, since the function build_vocab() learns the vocabulary mapping based on the input data, both training/evaluation/test data should be indexed using the same vocabulary mapping when replacing the OOV. Otherwise, the index number of the same word could be different between training/testing.

Also, you need to make sure the max sentence length used for padding is the actual max sentence length across all of the training/evaluation/test data sets.
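
A sketch of that second point; the variable names are illustrative and the sentences are assumed to be already tokenized:

# Compute one global max length over every split, then pad all splits to it
max_len = max(len(sent) for sent in train_sents + dev_sents + test_sents)

def pad_sentences_to(sentences, length, pad_token="<PAD/>"):
    return [sent + [pad_token] * (length - len(sent)) for sent in sentences]

train_padded = pad_sentences_to(train_sents, max_len)
test_padded = pad_sentences_to(test_sents, max_len)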

dennybritz commented 8 years ago

@rajmiglani See https://github.com/dennybritz/cnn-text-classification-tf/issues/10 and like zhang-jian said, make sure your vocabulary matches up.

areebak commented 8 years ago

PLEASE take me off this chain. Thanks.


dennybritz commented 8 years ago

@areebak There's an unsubscribe button on the right side. I can't do anything about it.

rajmiglani commented 8 years ago

Hey, sorry for the trouble guys, but as my previous comment mentions, I have solved the issues.

UTH commented 8 years ago

@zishell Could you share the full script to predict one sentence? I've tried to use eval.py, but failed to get correct predictions. Many thanks. :)

kaushikpasi commented 8 years ago

@dennybritz you should have a blog post or a video on your YouTube channel explaining the process of using eval.py, or maybe the entire pipeline; it would be very helpful for many of us, especially newbies (including me). Thanks :+1: :smiley:

dennybritz commented 8 years ago

Yeah, I should probably write more extensive documentation and a word2vec example. I didn't expect this many people to use the code. I'll try to put something together over the weekend.

dennybritz commented 8 years ago

Okay, I refactored the code to make eval.py easier to use. You can now just load any string text data and don't need to deal with the vocabulary generation yourself. The vocabulary is saved during training and then automatically loaded during test. Note that you need to retrain for it to work.

See https://github.com/dennybritz/cnn-text-classification-tf/commit/9ba22d2563701c3fdc8f2707372035f32e40c98f and the new eval.py for an example.
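
In outline, the new flow looks like this; paths and variable names are illustrative:

import os
import numpy as np
from tensorflow.contrib import learn

# train.py: fit the vocabulary on the training text and persist it
vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
x = np.array(list(vocab_processor.fit_transform(x_text)))
vocab_processor.save(os.path.join(out_dir, "vocab"))

# eval.py: restore the saved vocabulary and index the raw test strings with it
vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
x_test = np.array(list(vocab_processor.transform(x_raw)))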

Also, I'm gonna close this for now.

trsonderm commented 7 years ago

If I run the standard model on the Rotten Tomatoes 5331/5331 data, is there a way to run eval.py on only the dev instances from training? Or do I need to look at the summaries for that?

Chen65010445 commented 5 years ago

I changed to my own test data like this: x_raw = ["./data/metatest.txt"]. But I found that the output CSV only evaluated my training data. How should I change eval.py so it evaluates my own data?