dennybritz / cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow
Apache License 2.0

how to test the trained module? #8

Closed haridatascientist closed 8 years ago

haridatascientist commented 8 years ago

I trained on the movie review training set using this code and got trained files in the path "runs/1458022294/summaries/train". How can I test the model? Is there any Python API to test it?

dennybritz commented 8 years ago

Summaries are what you can visualize in Tensorboard. To load the trained model you need to load the checkpoint files. Check out the official documentation on that: https://www.tensorflow.org/versions/r0.7/how_tos/variables/index.html#variables-creation-initialization-saving-and-loading

zhuantouer commented 8 years ago

@dennybritz Hi there. You use tf.train.Saver() to save all variables in train.py, but your W is defined in text_cnn.py. Does Tensorflow save these variables too? Also, vocabulary and vocabulary_inv are not defined using tf.Variable(). Does Tensorflow save these? Thanks in advance~

dennybritz commented 8 years ago

@zishell Check out the Tensorflow documentation for storing and saving variables: https://www.tensorflow.org/versions/r0.7/how_tos/variables/index.html#variables-creation-initialization-saving-and-loading

In short, all the model parameters (W, etc.) are saved by Tensorflow because they are part of the implicit graph. However, the vocabulary isn't saved; you need to load it yourself. If it's not clear, I recommend reading the Tensorflow documentation, which explains how variables and graphs work.
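Since the checkpoint holds only the Tensorflow variables, one way to keep the word-to-index mapping available for evaluation is to write it to disk next to the checkpoint. A minimal sketch, assuming the vocabulary is a plain dict from word to integer index and Python 3; the file name `vocab.json` and both helper names are hypothetical and not part of this repository:

```python
import json
import os
import tempfile


def save_vocabulary(vocabulary, directory):
    """Write the word -> index mapping as JSON and return the file path."""
    path = os.path.join(directory, "vocab.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocabulary, f, ensure_ascii=False)
    return path


def load_vocabulary(path):
    """Read the word -> index mapping back from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Toy vocabulary; in practice this is the mapping built during training.
vocabulary = {"<PAD/>": 0, "the": 1, "silly": 2, "tedious": 3}

with tempfile.TemporaryDirectory() as checkpoint_dir:
    vocab_path = save_vocabulary(vocabulary, checkpoint_dir)
    restored = load_vocabulary(vocab_path)

print(restored == vocabulary)
```

Loading the saved mapping at evaluation time guarantees that every word gets the same index it had during training.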

zhuantouer commented 8 years ago

@dennybritz Thanks, I now understand save and restore. This is my script to classify one sentence:

if __name__ == '__main__':
    with tf.Session() as sess:

        # process the raw sentence 
        new_review = "simplistic , silly and tedious . "
        new_review = new_review.strip()
        new_review = data_helpers.clean_str(new_review)
        new_review = new_review.split(" ")

        sentences, dump= data_helpers.load_data_and_labels()
        sequence_length = max(len(x) for x in sentences)
        sentences_padded = data_helpers.pad_sentences(sentences)
        vocabulary, vocabulary_inv = data_helpers.build_vocab(sentences_padded)

        num_padding = sequence_length - len(new_review)
        new_sentence = new_review + ["<PAD/>"] * num_padding

        #convert x 
        x = np.array([vocabulary[word] for word in new_sentence])
        input_x = (x)

        sequence_length = x.shape[0]
        vocab_size = len(vocabulary)
        embedding_size = FLAGS.embedding_dim
        filter_sizes = map(int, FLAGS.filter_sizes.split(","))
        num_filters = FLAGS.num_filters
        l2_reg_lambda = FLAGS.l2_reg_lambda

        w_embedding = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="embedding/W")
        embedded_chars = tf.nn.embedding_lookup(w_embedding, input_x)
        embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)
        # Restore variables from disk.
        saver = tf.train.Saver()
        saver.restore(sess, "runs/1459166181/checkpoints/model-20000")
        print("Model restored.")

        # Create a convolution + maxpool layer for each filter size
        pooled_outputs = []
        for i, filter_size in enumerate(filter_sizes):
            with tf.name_scope("conv-maxpool-%s" % filter_size):
                # Convolution Layer
                filter_shape = [filter_size, embedding_size, 1, num_filters]
                W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
                b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
                # Restore variables from disk.
                saver = tf.train.Saver()
                saver.restore(sess, "runs/1459166181/checkpoints/model-20000")
                print("Model restored.")

                conv = tf.nn.conv2d(
                    embedded_chars_expanded,
                    W,
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="conv")
                # Apply nonlinearity
                h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
                # Maxpooling over the outputs
                pooled = tf.nn.max_pool(
                    h,
                    ksize=[1, sequence_length - filter_size + 1, 1, 1],
                    strides=[1, 1, 1, 1],
                    padding='VALID',
                    name="pool")
                pooled_outputs.append(pooled)

        # Combine all the pooled features
        num_filters_total = num_filters * len(filter_sizes)
        h_pool = tf.concat(3, pooled_outputs)
        h_pool_flat = tf.reshape(h_pool, [-1, num_filters_total])

        # Add dropout
        h_drop = tf.nn.dropout(h_pool_flat, 0.5)

        # Final (unnormalized) scores and predictions
        W = tf.Variable(tf.truncated_normal([num_filters_total, num_classes], stddev=0.1), name="output/W")
        b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="output/b")
        # Restore variables from disk.
        saver = tf.train.Saver()
        saver.restore(sess, "runs/1459166181/checkpoints/model-20000")
        print("Model restored.")

        scores = tf.nn.xw_plus_b(h_drop, W, b, name="scores")
        predictions = tf.argmax(scores, 1, name="predictions")
        print predictions

I have two questions:

  1. How do I convert x to input_x? When I debugged train.py, I found that x_batch is a tuple, so I converted x to (x). Is that right?
w_embedding = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="embedding/W")
embedded_chars = tf.nn.embedding_lookup(w_embedding, input_x)
embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)

The dimension of embedded_chars is [56,128], unlike yours, which is [?,56,128], and the dimension of embedded_chars_expanded is [56,128,1], unlike yours, which is [?,56,128,1].

How can I convert it? Thanks.
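On the first question: in Python, `(x)` is just `x`; the parentheses neither create a tuple nor add a dimension. The model expects a batch of sentences, so a single sentence has to be wrapped as a batch of size one. A small sketch with plain lists (the `shape` helper is made up for illustration; with NumPy the equivalent would be `np.array([x])`):

```python
def shape(nested):
    """Return the shape of a regular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


# One sentence as word indices, padded to the training sequence length (56).
x = [1207, 2, 304, 4, 660] + [0] * 51

same_object = (x)   # parentheses alone change nothing
batch_of_one = [x]  # wrapping in a list adds the leading batch dimension

print(same_object is x)     # True
print(shape(x))             # (56,)
print(shape(batch_of_one))  # (1, 56)
```

With a `[1, 56]` input, the embedding lookup yields `[1, 56, 128]` and `tf.expand_dims(..., -1)` yields `[1, 56, 128, 1]`, which is the rank-4 tensor that `conv2d` requires.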

xumx commented 8 years ago

@dennybritz I've read the Tensorflow documentation, but still couldn't quite get it right.

Since many people have asked similar questions in the tutorial's comments section, is it possible for you to update the sample code with an eval example?

dennybritz commented 8 years ago

Yeah, maybe that's a good idea. I'll try to get it done this weekend.

AAMIBhavya commented 8 years ago

@dennybritz It would be very helpful if you could share the work you are doing now as soon as possible. Thanks in advance.

@zishell I tried your code for testing that particular sentence, but I got an error like

    NameError: name 'FLAGS' is not defined

How can I resolve this?

zhuantouer commented 8 years ago

@AAMIBhavya I did not paste the head of my script. You should add the flags yourself by copying them from train.py.

dennybritz commented 8 years ago

Okay, I added a sample eval.py. Also added a note to the README. Let me know if this works.

EDIT: Also, make sure you pull the latest code changes, I added an additional flag to the data helpers that's needed for the eval.

AAMIBhavya commented 8 years ago

@dennybritz Thanks. When I run eval.py, I get output like this:

    BATCH_SIZE=64
    CHECKPOINT_DIR=/home/product/Downloads/cnn-text-classification-tf-master/runs/1459745478/checkpoints/
    LOG_DEVICE_PLACEMENT=False

    Loading data...
    Vocabulary size: 18765
    Test set size 10662

    Evaluating...

    Total number of test examples: 10662
    Accuracy: 0

    product@product:~/Downloads/cnn-text-classification-tf-master$ python eval.py

    Parameters:
    ALLOW_SOFT_PLACEMENT=True
    BATCH_SIZE=64
    CHECKPOINT_DIR=/home/product/Downloads/cnn-text-classification-tf-master/runs/1459745478/checkpoints/
    LOG_DEVICE_PLACEMENT=False

    Loading data...
    [[ 1 571 7 ..., 0 0 0]
     [ 1 3805 2181 ..., 0 0 0]
     [ 718 13 44 ..., 0 0 0]
     ...,
     [ 12 9 1474 ..., 0 0 0]
     [ 1 166 439 ..., 0 0 0]
     [3308 7 63 ..., 0 0 0]]
    [1 1 1 ..., 0 0 0]
    Vocabulary size: 18765
    Test set size 10662

    Evaluating...

    Total number of test examples: 10662
    Accuracy: 0

Why did I get an accuracy of 0? (I also printed the values of x_test and y_test.) And one more thing: how can I test a sentence that is not in the training sample, since it has no value for y_test?

AAMIBhavya commented 8 years ago

@zishell Thanks. I corrected that part. But after restoring the trained model I got this error:

    Model restored.
    Model restored.
    Traceback (most recent call last):
      File "test.py", line 90, in <module>
        name="conv")
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 211, in conv2d
        use_cudnn_on_gpu=use_cudnn_on_gpu, name=name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 655, in apply_op
        op_def=op_def)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2042, in create_op
        set_shapes_for_outputs(ret)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1528, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/common_shapes.py", line 176, in conv2d_shape
        input_shape = op.inputs[0].get_shape().with_rank(4)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 616, in with_rank
        raise ValueError("Shape %s must have rank %d" % (self, rank))
    ValueError: Shape (56, 128, 1) must have rank 4

What does this ValueError correspond to? How can I resolve it?

dennybritz commented 8 years ago

@AAMIBhavya You may need to pull the updated code and retrain the model. The eval script I wrote only works with Tensorflow 0.7, so make sure you have the latest version.

For testing a single sentence, it should be easy to modify the eval script for that. Just change the data loading section to load your own sentence and then remove the accuracy calculation and print out the prediction instead.
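The data-loading change described above might look like the following sketch. It assumes a word-to-index dict built from the training data, `<PAD/>` as the padding token (as in the script earlier in this thread), and a simple lowercase whitespace tokenizer standing in for `data_helpers.clean_str`; `sentence_to_ids` is a hypothetical helper, not a function from this repository:

```python
def sentence_to_ids(sentence, vocabulary, sequence_length, pad_token="<PAD/>"):
    """Tokenize, pad to the training length, and map words to indices."""
    tokens = sentence.strip().lower().split()
    if len(tokens) > sequence_length:
        raise ValueError("sentence is longer than the training sequence length")
    tokens = tokens + [pad_token] * (sequence_length - len(tokens))
    # A KeyError here means the word was never seen during training.
    return [vocabulary[token] for token in tokens]


# Toy vocabulary; the real one must be the one used for training.
vocabulary = {"<PAD/>": 0, "simplistic": 1, ",": 2, "silly": 3,
              "and": 4, "tedious": 5, ".": 6}

# A batch containing a single sentence, padded to a toy length of 10.
x_test = [sentence_to_ids("simplistic , silly and tedious .", vocabulary, 10)]
print(x_test)  # [[1, 2, 3, 4, 5, 6, 0, 0, 0, 0]]
```

`x_test` can then be fed as `input_x`, with the predictions printed in place of the accuracy calculation.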

AAMIBhavya commented 8 years ago

@dennybritz Thanks. Actually the version of Tensorflow I am using is 0.7.1

kinarashah commented 8 years ago

Hello, I tried running the eval.py giving the path to checkpoint folder, and getting the following error. I am new at neural networks, could you please look into the error?

Traceback (most recent call last):
  File "eval.py", line 68, in <module>
    batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 315, in run
    return self._run(None, fetches, feed_dict)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 511, in _run
    feed_dict_string)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 564, in _do_run
    target_list)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 586, in _do_call
    e.code)
tensorflow.python.framework.errors.InvalidArgumentError: computed output size would be negative
     [[Node: conv-maxpool-5/pool = MaxPool[ksize=[1, 55, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-5/relu)]]
Caused by op u'conv-maxpool-5/pool', defined at:
  File "eval.py", line 50, in <module>
    saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1285, in import_meta_graph
    return _import_meta_graph_def(_read_meta_graph_file(meta_graph_or_file))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1220, in _import_meta_graph_def
    importer.import_graph_def(meta_graph_def.graph_def, name="")
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 238, in import_graph_def
    compute_shapes=False, compute_device=False)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2040, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1087, in __init__
    self._traceback = _extract_stack()

JamesRedfield commented 8 years ago

Hi guys! Has anyone understood how to use eval.py to test a single sentence (present in the training sample)? I just changed the data loading section to load one sentence, but I am confused when I try to print out the prediction.

Loading data... Vocabulary size: 18762 Test set size 10662

Evaluating...

[[1207 2 304 4 660 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

It corresponds to "simplistic , silly and tedious ." Now I don't know how to use y_test for prediction. Please help me.

dennybritz commented 8 years ago

@AAMIBhavya It seems strange that you would get an accuracy of 0. I just tested this locally and it's working for me if I train and then point to the checkpoint dir. I'm using the latest version of Tensorflow and Python 3.

@kinarashah I haven't seen this before, but could you try train + eval with Python 3 and the latest version of Tensorflow?

@JamesRedfield You don't need y_test for predictions. These are the labels, and they are only used to calculate the accuracy. To get the predictions you can just print out the all_predictions variable.

JamesRedfield commented 8 years ago

@dennybritz Thanks a lot! I just tried to print all_predictions, always with the same input sentence (the exact sentence picked from the sample data rt-polarity.neg). The response is never the same.

1) Sometimes it returns 1 and sometimes 0. It depends on different representations of the same input. Is that right?

2) Why does the representation of the same sentence change every time? The result of the "build_vocab(sentences)" function changes at every execution. Why?

3) Does it depend on the mapping of words with the same most_common count value?

For example, the first word 'simplistic' is mapped to 1192, then 1237, then 1191. How is it possible that this small difference produces a totally different response at the end (positive or negative sentence)?

Here are two outputs from the identical input sentence ("simplistic , silly and tedious ."):


$ python3.5 eval.py

Loading data... Vocabulary size: 18762 Test set size 10662

sentence - Mapped sentence to vector based on a vocabulary. [[1192 2 303 4 666 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

Evaluating... [ 1.]


$ python3.5 eval.py

Loading data... Vocabulary size: 18762 Test set size 10662

sentence - Mapped sentence to vector based on a vocabulary. [[1237 2 308 4 661 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

Evaluating... [ 0.]


Thank you for your time Denny!

dennybritz commented 8 years ago

@JamesRedfield Thanks, I'll look into it today and get back to you. The predictions should be equal, so something may be wrong there.

dennybritz commented 8 years ago

@JamesRedfield Thanks for finding this. It seems the most_common method returns words that occur the same number of times in arbitrary order. So words may receive a different index during training and eval, resulting in wrong predictions.

I fixed this in https://github.com/dennybritz/cnn-text-classification-tf/commit/67272960d2e430f63cfe9886f3b849a0b8202c2b

Let me know if it works! (you need to retrain)
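To illustrate the problem and one way around it: words with equal counts have no inherent order, and on older Python versions, where dictionary order could vary between runs, an index assignment that depends on tie order can change from one execution to the next. The sketch below breaks ties alphabetically so the mapping is reproducible. It is an illustration under that assumption and not necessarily what the linked commit does; `build_vocab_deterministic` is a made-up name:

```python
from collections import Counter


def build_vocab_deterministic(sentences):
    """Map words to indices by descending count, ties broken alphabetically."""
    counts = Counter(word for sentence in sentences for word in sentence)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    vocabulary_inv = [word for word, _ in ordered]
    vocabulary = {word: index for index, word in enumerate(vocabulary_inv)}
    return vocabulary, vocabulary_inv


sentences = [["silly", "and", "tedious"], ["tedious", "and", "simplistic"]]
vocabulary, vocabulary_inv = build_vocab_deterministic(sentences)
print(vocabulary_inv)  # ['and', 'tedious', 'silly', 'simplistic']
```

Because the sort key is fully determined by the word and its count, the same training data always yields the same indices, regardless of the order in which sentences are read.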

JamesRedfield commented 8 years ago

@dennybritz Thanks for your quick answer! I used your code after retraining. The output makes more sense now, but I tested your algorithm with all the negative sentences (from the sample data) as input:

I expected almost all predictions to be [0.], but the response is mostly zeros with some ones. Does this depend on the size of the dataset? Is that right?

Output:

Loading data... Vocabulary size: 18765 Input all negative Sentences : 5331

Evaluating... Positives: 176 Negatives: 5155

So 3.3% of the responses are wrong.

I hope my explanation is clear. Thanks!

AAMIBhavya commented 8 years ago

@dennybritz When we use new test data, there may be new terms which are not in the vocabulary. How can this be handled? Can those new terms also be included in the vocabulary?

JamesRedfield commented 8 years ago

Hi @AAMIBhavya. In my case, the solution is to ignore the new terms because they are not mapped. If you want to include the new terms, you'll need to retrain on a data set that contains sentences with those terms.

AAMIBhavya commented 8 years ago

@JamesRedfield Thanks..

dennybritz commented 8 years ago

@AAMIBhavya Yes, the most common way is to limit the vocabulary to a certain number of most common words (e.g. 10k or 20k) and replace all other words with a special UNK word/token, both during training and test. All words not in the vocab are then treated the same.

A more sophisticated approach is to use character-level models to "generalize" to new words, e.g. http://arxiv.org/abs/1602.00367
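A rough sketch of the first approach (a limited vocabulary plus an unknown-word token), assuming tokenized sentences. The token name `<UNK/>` and both helper names are hypothetical; nothing in this thread says the repository defines them:

```python
from collections import Counter

UNK = "<UNK/>"  # hypothetical token name


def build_limited_vocab(sentences, max_words):
    """Keep the max_words most frequent words; index 0 is reserved for UNK."""
    counts = Counter(word for sentence in sentences for word in sentence)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    kept = [word for word, _ in ordered[:max_words]]
    return {word: index for index, word in enumerate([UNK] + kept)}


def encode(sentence, vocabulary):
    """Map each word to its index, falling back to the UNK index."""
    return [vocabulary.get(word, vocabulary[UNK]) for word in sentence]


train = [["a", "silly", "movie"], ["a", "tedious", "movie"], ["a", "gem"]]
vocabulary = build_limited_vocab(train, max_words=2)

print(vocabulary)                                       # {'<UNK/>': 0, 'a': 1, 'movie': 2}
print(encode(["a", "brilliant", "movie"], vocabulary))  # [1, 0, 2]
```

Rare training words such as "silly" are also encoded as the UNK index, so the model sees that token during training and learns an embedding for it.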

fmaglia commented 8 years ago

Hi. I also tried the eval script but I have the same problem as @kinarashah:

    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x2bcda30 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-5/pool = MaxPool[ksize=[1, 32, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-5/relu)]]
    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x2bcda30 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-4/pool = MaxPool[ksize=[1, 33, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-4/relu)]]
    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x2bcda30 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-3/pool = MaxPool[ksize=[1, 34, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-3/relu)]]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 571, in _do_call
        return fn(*args)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 555, in _run_fn
        return tf_session.TF_Run(session, feed_dict, fetch_list, target_list)
    tensorflow.python.pywrap_tensorflow.StatusNotOK: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-5/pool = MaxPool[ksize=[1, 32, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-5/relu)]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "eval.py", line 69, in <module>
        batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 315, in run
        return self._run(None, fetches, feed_dict)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 511, in _run
        feed_dict_string)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 564, in _do_run
        target_list)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 586, in _do_call
        e.code)
    tensorflow.python.framework.errors.InvalidArgumentError: computed output size would be negative
         [[Node: conv-maxpool-5/pool = MaxPool[ksize=[1, 32, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-5/relu)]]
    Caused by op 'conv-maxpool-5/pool', defined at:
      File "eval.py", line 51, in <module>
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1285, in import_meta_graph
        return _import_meta_graph_def(_read_meta_graph_file(meta_graph_or_file))
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1220, in _import_meta_graph_def
        importer.import_graph_def(meta_graph_def.graph_def, name="")
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/importer.py", line 238, in import_graph_def
        compute_shapes=False, compute_device=False)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 2040, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1087, in __init__
        self._traceback = _extract_stack()

I tried with both Python 2.7 and Python 3.4 and have the same problem. How can I fix it?

dennybritz commented 8 years ago

@fmaglia I'm not sure what could be wrong there. I haven't seen this error before.

Are you using the latest version of Tensorflow? What are you running on (Linux/Windows/Mac)? Are you using your own data, or only the code here?

fmaglia commented 8 years ago

I'm using the latest version of Tensorflow. I'm running on Linux Ubuntu 14.04 LTS. I'm using my own data (tweets from the SemEval competition). Thanks for the quick reply.

dataset.zip

dennybritz commented 8 years ago

Does the code work for you with the default data here? Are you sure you have the same vocabulary for both training and test data, and that the test data doesn't contain any out of vocabulary words?

fmaglia commented 8 years ago

I didn't try the code with the default data. I have different vocabularies for training and test data because the datasets are different. One is for training the classifier, the other is for testing the neural network. Why should the vocabulary be the same?

dennybritz commented 8 years ago

The vocabulary must be the same or you can't index the embeddings. That's standard for almost all models (other than those that explicitly generalize to new terms, like character models). It has been discussed above, see https://github.com/dennybritz/cnn-text-classification-tf/issues/8#issuecomment-207511797

Limit the vocabulary to the N most common terms and use them for your train/test.

nplevitt commented 8 years ago

@AAMIBhavya I was having issues with the eval script saying that accuracy was 0 as well, but I went in and changed correct_predictions and len(y_test) to floats instead of ints and that solved the problem for me.

dennybritz commented 8 years ago

@NickLevitt That makes sense, thanks for figuring that out. I changed it to float in the code now. Strange that it worked for me without though.

AAMIBhavya commented 8 years ago

@dennybritz Thanks for your quick replies. I corrected all the errors and also tested with my own test corpus; it gave good results. Good work. @NickLevitt Thanks. I was having that accuracy 0 issue too, but I discovered that it was because of Python 2. When I run the code with Python 3, I get the correct value for accuracy. The accuracy 0 problem only occurs with Python 2 (since a type cast is needed there). So try Python 3.
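The difference comes from the division operator: in Python 2, `/` between two integers floors the result, so any accuracy below 100% comes out as 0. Python 3's `//` behaves the same way and can be used to reproduce the effect. A minimal sketch; the count of correct predictions is made up for illustration:

```python
correct_predictions = 10381  # made-up count for illustration
total = 10662                # test set size reported earlier in the thread

floored = correct_predictions // total         # what Python 2's int / int gives
accuracy = correct_predictions / float(total)  # true division in both versions

print(floored)             # 0
print(round(accuracy, 4))  # 0.9736
```

Casting one operand to `float`, as suggested above, gives the same result under Python 2 and Python 3.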

nishantvishwamitra commented 8 years ago

@dennybritz I'm also facing the same issue as @fmaglia. What I did to evaluate was to keep just the first sentence in both the pos and neg files and run eval.py on this data.

The code works perfectly fine for the complete dataset and gives accuracy of 0.973551 as mentioned by you. Please let me know if you have any pointers for me. Thanks.

dennybritz commented 8 years ago

@nishantvishwamitra You must have the same vocabulary during training and test. In eval, you need to build the full vocabulary from the training data, and then map your test sentences in that vocabulary.
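A possible explanation for the `computed output size would be negative` error, offered as an assumption rather than a confirmed diagnosis: the max-pool window is fixed at training time to `sequence_length - filter_size + 1`, so test sentences padded to a shorter length leave the pooling op with an input smaller than its window. The arithmetic can be checked without Tensorflow; `valid_output_size` and `pooled_height` are illustrative helpers built on the standard size formula for VALID padding with stride 1:

```python
def valid_output_size(input_size, window):
    """Output length of a VALID-padded, stride-1 convolution or pooling op."""
    return input_size - window + 1


def pooled_height(test_length, train_length, filter_size):
    """Height after conv + max-pool when the pool window was fixed in training."""
    pool_window = valid_output_size(train_length, filter_size)  # stored in the graph
    conv_height = valid_output_size(test_length, filter_size)
    return valid_output_size(conv_height, pool_window)


# ksize=[1, 32, 1, 1] for filter size 5 implies a training length of 36.
print(pooled_height(test_length=36, train_length=36, filter_size=5))  # 1
print(pooled_height(test_length=20, train_length=36, filter_size=5))  # -15
```

Under this reading, padding every test sentence to the training sequence length, rather than to the longest test sentence, keeps the pooled height at 1.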

fmaglia commented 8 years ago

I tried to solve my problem by using the vocabulary of the training set for the test set as well, and by removing words from the test set that aren't in the training set, but when I launch the eval script I get:

    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x18a1fb0 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-3/pool = MaxPool[ksize=[1, 34, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-3/relu)]]
    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x18a1fb0 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-5/pool = MaxPool[ksize=[1, 32, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-5/relu)]]
    W tensorflow/core/kernels/pooling_ops_common.cc:64] Invalid argument: computed output size would be negative
    W tensorflow/core/common_runtime/executor.cc:1102] 0x18a1fb0 Compute status: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-4/pool = MaxPool[ksize=[1, 33, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-4/relu)]]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 571, in _do_call
        return fn(*args)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 555, in _run_fn
        return tf_session.TF_Run(session, feed_dict, fetch_list, target_list)
    tensorflow.python.pywrap_tensorflow.StatusNotOK: Invalid argument: computed output size would be negative
         [[Node: conv-maxpool-3/pool = MaxPool[ksize=[1, 34, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-3/relu)]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "eval.py", line 70, in <module>
        batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 315, in run
        return self._run(None, fetches, feed_dict)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 511, in _run
        feed_dict_string)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 564, in _do_run
        target_list)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 586, in _do_call
        e.code)
    tensorflow.python.framework.errors.InvalidArgumentError: computed output size would be negative
         [[Node: conv-maxpool-3/pool = MaxPool[ksize=[1, 34, 1, 1], padding="VALID", strides=[1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/cpu:0"](conv-maxpool-3/relu)]]
    Caused by op 'conv-maxpool-3/pool', defined at:
      File "eval.py", line 51, in <module>
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1285, in import_meta_graph
        return _import_meta_graph_def(_read_meta_graph_file(meta_graph_or_file))
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1220, in _import_meta_graph_def
        importer.import_graph_def(meta_graph_def.graph_def, name="")
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/importer.py", line 238, in import_graph_def
        compute_shapes=False, compute_device=False)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 2040, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1087, in __init__
        self._traceback = _extract_stack()

Do you have any idea what is causing the error?

JamesRedfield commented 8 years ago

Hi @dennybritz ! The embedding implementation doesn’t currently have GPU support and throws an error if placed on the GPU.

Does this mean that I can't use the GPU at all? I tried your algorithm on an AMI and I'm trying to figure that out, but the result is:

    W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:211] Ran out of memory trying to allocate 1.62MiB. See logs for memory state
    W ./tensorflow/core/kernels/assign_op.h:77] Resource exhausted: OOM when allocating tensor with shape[4,128,1,128]
    W tensorflow/core/common_runtime/executor.cc:1102] 0x274a8c0 Compute status: Resource exhausted: OOM when allocating tensor with shape[4,128,1,128]

thanks

dennybritz commented 8 years ago

@JamesRedfield You can use the GPU, just not for the embeddings. Tensorflow is missing support for the necessary operations. Maybe they fixed it in the latest version, so you could try to upgrade.

The rest should work fine on the GPU, are you sure it's set up correctly? Do the other Tensorflow examples run on it? Running TF on AWS GPU instances is a bit tricky, but a lot has been written about it online.

JamesRedfield commented 8 years ago

I believe everything is correct. I had some problems using tensorflow 0.8 with CUDA 7.5 and cudnn 5.0. In my case, it was very tricky to get a simple example like this to use the GPU:

    import tensorflow as tf
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    print(sess.run(c))

In the end, after spending 3-4 days reading more forums, I settled on one configuration: tensorflow 0.7, CUDA 7.0, cuDNN 4.0. I run on python3.4. This is my case.

Now I ran your code after rebooting the server, but I see something that is not right. Your experience, in this case, is very valuable.

This is the process:

    USER   PID  %CPU %MEM VSZ      RSS    TTY   STAT START TIME COMMAND
    ubuntu 2177 143  2.4  39720844 383504 pts/0 Sl+  14:14 8:26 python3.4 train.py

Nvidia information:

    NVIDIA-SMI 346.46     Driver Version: 346.46
    | Processes:                                                       GPU Memory |
    |  GPU   PID   Type  Process name                                  Usage      |
    |=============================================================================|
    |  0     2177  C     python3.4                                     3819MiB    |

But the training time is exactly the same. I'm using the GPU device: with tf.device('/gpu:0'), tf.name_scope("embedding"):

Do you know where I went wrong? Any ideas?

I almost forgot: thanks for your support.

dennybritz commented 8 years ago

As I mentioned above, you can't use the GPU for the embeddings, as in with tf.device('/gpu:0'), tf.name_scope("embedding"): you need to use the CPU there (at least that was the case with TF a while ago).

You can run the rest of the network on the GPU.

JamesRedfield commented 8 years ago

Oh sorry, I understand now! I thought tensorflow 0.7 was recent enough to have resolved this problem, but when you said "Maybe they fixed it in the latest version", you were referring to 0.8. Thanks a lot!

dennybritz commented 8 years ago

@JamesRedfield I'm not sure, according to https://github.com/tensorflow/tensorflow/issues/305 it seems like they added support for it. So I guess it should work...

Your error "Ran out of memory trying to allocate 1.62MiB. See logs for memory state" doesn't really make sense to me. Is there no memory left on the GPU? It seems like other people have had similar issues: https://github.com/tensorflow/tensorflow/issues/398

This seems like more of a Tensorflow issue, not an issue with this code. So maybe ask in the Tensorflow group and print out the detailed debug logs..

JamesRedfield commented 8 years ago

I'll look around before changing my "safe" setting. :) Thanks for your incredible support @dennybritz

JamesRedfield commented 8 years ago

Hi @dennybritz.
If I have, for example, a positive rt-polarity data file with 5000 lines and a negative rt-polarity file with 2000 lines, and I duplicate the negative lines to reach 5000 lines (like the positive input), how does that change the results or the quality of the algorithm? Is it still reliable?

dennybritz commented 8 years ago

@JamesRedfield You are iterating through the training files multiple times anyway, so just duplicating the data won't do anything (assuming you train long enough).

JamesRedfield commented 8 years ago

Hello @dennybritz. I have another question and I hope not to be tedious. :) The algorithm does not depend on a specific language, does it? I mean, you do not use pre-trained word2vec vectors for word embeddings; instead, you learn embeddings from scratch. You apply convolutions directly to one-hot vectors, so the algorithm is not specific to any natural language. Is that right? I trained on more data in a different language (10,000 lines: 5000 pos, 5000 neg). In the end the results are not good. I mean, if I submit a new set of negative sentences, similar to the negative training sentences, the response is statistically in the middle, so half are classified as negative and half as positive. Do you know why? What do you think?

dennybritz commented 8 years ago

Yes, it's not language specific and should work as long as the language is not too different (like Chinese, where you probably need to train on characters instead of words). Make sure that the vocabulary is big enough and that your test set doesn't use a completely different vocabulary.

Btw, the convolutions are applied to the embedded vectors, not the one-hot vectors.
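To make the distinction concrete: an embedding lookup selects one row of a learned matrix per word index. This gives the same result as multiplying a one-hot vector by that matrix, without ever materializing the one-hot vector, and the convolution then runs over the selected rows. A toy sketch in plain Python with made-up numbers and illustrative function names:

```python
# Toy embedding matrix: 4 words in the vocabulary, 3 dimensions each.
embedding = [
    [0.0, 0.0, 0.0],  # index 0
    [0.1, 0.2, 0.3],  # index 1
    [0.4, 0.5, 0.6],  # index 2
    [0.7, 0.8, 0.9],  # index 3
]


def embedding_lookup(matrix, indices):
    """Pick one row of the matrix per word index."""
    return [matrix[i] for i in indices]


def one_hot_matmul(matrix, indices):
    """Same result via explicit one-hot vectors multiplied by the matrix."""
    rows = []
    for i in indices:
        one_hot = [1.0 if j == i else 0.0 for j in range(len(matrix))]
        rows.append([sum(h * matrix[j][k] for j, h in enumerate(one_hot))
                     for k in range(len(matrix[0]))])
    return rows


sentence = [2, 1, 0]
print(embedding_lookup(embedding, sentence) == one_hot_matmul(embedding, sentence))  # True
```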

JamesRedfield commented 8 years ago

Hi @dennybritz

The vocabulary size is 13K. The number of sentences is 9000 (4500 are positive, 1200 are negative [4500 with duplication]).

Test set: 60 new negative sentences with 1100 words, of which 290 are not mapped. So the average share of unmapped words in the new sentences is 30%.

But in some cases, the number of unmapped words in a new NEGATIVE sentence is zero, and the response is still POSITIVE. Does that make sense to you?

The result is that half of the sentences are classified as positive and half as negative. I tried with a different test set size (e.g. 300 new negative sentences) but the result is exactly the same.

What do you think? Is the vocabulary big enough?

P.S. When I encounter a new unmapped word during eval.py processing, I substitute the PAD token for it. Thanks.

dennybritz commented 8 years ago

@JamesRedfield This seems reasonable to me, not sure what could be wrong there.

SeanMaday commented 8 years ago

Can someone share the syntax they used to test their own content (line 30 in eval.py)? I am curious about testing one sentence and also testing a batch of sentences.