rkcosmos / deepcut

A Thai word tokenization library using Deep Neural Network
MIT License

ValueError: Tensor is not an element of this graph. #32

Closed · goiPP closed 6 years ago

goiPP commented 6 years ago

I used deepcut.tokenize as the analyzer in a CountVectorizer and it raised the error below.
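Roughly this setup (reconstructed from the traceback; stop_list is defined elsewhere in my code):

```python
import deepcut
from sklearn.feature_extraction.text import CountVectorizer

stop_list = set()  # actually defined elsewhere in my code

def cutWord(original_text):
    # Same function as model/CountVectorizer.py line 14 in the traceback
    return [word for word in deepcut.tokenize(original_text)
            if word not in stop_list]

# fit_transform calls cutWord, which calls deepcut's Keras model;
# that predict call is where the ValueError below is raised.
vectorizer = CountVectorizer(analyzer=cutWord)
X = vectorizer.fit_transform(["ข้อความภาษาไทยตัวอย่าง"])
```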

File "/model/CountVectorizer.py", line 14, in cutWord
    return [word for word in deepcut.tokenize(original_text) if word not in stop_list]
  File "/usr/local/lib/python3.6/dist-packages/deepcut/deepcut.py", line 60, in tokenize
    y_predict = model.predict([x_char, x_type])
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1832, in predict
    self._make_predict_function()
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1029, in _make_predict_function
    **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2502, in function
    return Function(inputs, outputs, updates=updates, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2445, in __init__
    with tf.control_dependencies(self.outputs):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 4863, in control_dependencies
    return get_default_graph().control_dependencies(control_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 4481, in control_dependencies
    c = self.as_graph_element(c)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3478, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3557, in _as_graph_element_locked
    raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("dense_14/Sigmoid:0", shape=(?, 1), dtype=float32) is not an element of this graph.

So I don't know whether this is a problem with deepcut's use of Keras or not. Reference issue: https://github.com/keras-team/keras/issues/2397
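For reference, the workaround discussed in that Keras issue looks roughly like this (TF1-era Keras; the graph capture and the cut_words wrapper here are illustrative, not deepcut's own API):

```python
import tensorflow as tf
import deepcut

# Load deepcut's Keras model once, in the main thread, and remember
# which TensorFlow graph it was built in.
deepcut.tokenize("อุ่นเครื่อง")  # warm-up call forces model loading
graph = tf.get_default_graph()

def cut_words(text, stop_list=()):
    # Run later predict calls (e.g., from a server worker thread)
    # inside that same graph, which avoids the "not an element of
    # this graph" ValueError.
    with graph.as_default():
        return [w for w in deepcut.tokenize(text) if w not in stop_list]
```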

titipata commented 6 years ago

Oh, you cannot use deepcut.tokenize directly in a CountVectorizer; scikit-learn requires a different kind of tokenizer. There are two ways to turn your text into bag-of-words format:

1) Use DeepcutTokenizer, see the example in https://github.com/rkcosmos/deepcut/blob/master/deepcut/deepcut.py#L116-L120 (you can also use unigrams/bigrams in this case).
2) Tokenize the text yourself and transform it into a sparse matrix; see the sketch after this list. I wrote a blog post about it at https://tupleblog.github.io/deepcut-classify-news/
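A minimal sketch of option 2 (texts and stop_list stand in for your own data):

```python
import deepcut
from sklearn.feature_extraction.text import CountVectorizer

texts = ["ตัวอย่างข้อความภาษาไทย", "อีกหนึ่งข้อความ"]  # your documents
stop_list = set()                                  # your stop words

# Tokenize every document up front, outside of scikit-learn.
tokenized = [[w for w in deepcut.tokenize(t) if w not in stop_list]
             for t in texts]

# An identity analyzer makes CountVectorizer count the tokens we
# already produced instead of tokenizing raw strings itself.
vectorizer = CountVectorizer(analyzer=lambda doc: doc)
X = vectorizer.fit_transform(tokenized)  # sparse document-term matrix
```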

Let me know if you have further questions.

goiPP commented 6 years ago

Sorry, it turns out it was an issue with my server instead. Thanks for the blog post suggestion.