zackhy / TextClassification

Text classification using different neural networks (CNN, LSTM, Bi-LSTM, C-LSTM).
MIT License
195 stars 60 forks

Using Bi-LSTM with clstm model #12

Open saja1994 opened 5 years ago

saja1994 commented 5 years ago

Thank you for your effort. I want to use a Bi-LSTM with the clstm model, but when I do, the following error is raised:

`Traceback (most recent call last):
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1292, in _do_call
    return fn(*args)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1277, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1367, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
  [[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 219, in <module>
    run_step(train_input, is_training=True)
  File "train.py", line 198, in run_step
    vars = sess.run(fetches, feed_dict)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 887, in run
    run_metadata_ptr)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1110, in _run
    feed_dict_tensor, options, run_metadata)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1286, in _do_run
    run_metadata)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1308, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
  [[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

Caused by op 'bidirectional_rnn/bw/ReverseSequence', defined at:
  File "train.py", line 138, in <module>
    classifier = clstm_clf(FLAGS)
  File "C:\Users\Saja\Desktop\TextClassification-master\TextClassification-master\clstm_classifier.py", line 133, in __init__
    sequence_length=self.sequence_length)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 466, in bidirectional_dynamic_rnn
    inputs_reverse = nest.map_structure(_map_reverse, inputs)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in map_structure
    structure[0], [func(x) for x in entries])
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in <listcomp>
    structure[0], [func(x) for x in entries])
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 464, in _map_reverse
    batch_axis=batch_axis)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 453, in _reverse
    seq_axis=seq_axis, batch_axis=batch_axis)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2645, in reverse_sequence
    name=name)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7984, in reverse_sequence
    seq_dim=seq_dim, batch_dim=batch_dim, name=name)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 3272, in create_op
    op_def=op_def)
  File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): seq_lens(24) > input.dims(1)
  [[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]`

I took the Bi-LSTM implementation from your rnn_classifier model:

`
    fw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
    bw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)

    # Add dropout to LSTM cell

    fw_cell = tf.contrib.rnn.DropoutWrapper(fw_cell, output_keep_prob=self.keep_prob)
    bw_cell = tf.contrib.rnn.DropoutWrapper(bw_cell, output_keep_prob=self.keep_prob)
    # Stacked LSTMs
    fw_cell = tf.contrib.rnn.MultiRNNCell([fw_cell]*self.num_layers, state_is_tuple=True)
    bw_cell = tf.contrib.rnn.MultiRNNCell([bw_cell]*self.num_layers, state_is_tuple=True)

    self._initial_state_fw = fw_cell.zero_state(self.batch_size, dtype=tf.float32)
    self._initial_state_bw = bw_cell.zero_state(self.batch_size, dtype=tf.float32)
    with tf.name_scope('dynamic_rnn'):

        outputs, state, _ = tf.nn.static_bidirectional_rnn(
            fw_cell, 
            bw_cell,
            tf.unstack(tf.transpose(rnn_inputs, perm=[1, 0, 2])),
            initial_state_fw=self._initial_state_fw,
            initial_state_bw=self._initial_state_bw,
            sequence_length=self.sequence_length,
            #dtype=tf.float32,
            scope='BiLSTM'
            )
        #outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])
        self.outputs = outputs

    out, state = tf.nn.bidirectional_dynamic_rnn(fw_cell,
                                                   bw_cell,
                                                   inputs=rnn_inputs,
                                                   initial_state_fw=self._initial_state_fw,
                                                   initial_state_bw=self._initial_state_bw,
                                                   sequence_length=self.sequence_length)

    state_fw = state[0]
    state_bw = state[1]
    output = tf.concat([state_fw[self.num_layers - 1].h, state_bw[self.num_layers - 1].h], 1)

    self.final_state=output

    # Softmax output layer
    with tf.name_scope('softmax'):

        softmax_w = tf.get_variable('softmax_w', shape=[2 * self.hidden_size, self.num_classes], dtype=tf.float32)
        softmax_b = tf.get_variable('softmax_b', shape=[self.num_classes], dtype=tf.float32)

        # L2 regularization for output layer
        self.l2_loss += tf.nn.l2_loss(softmax_w)
        self.l2_loss += tf.nn.l2_loss(softmax_b)

        # logits
        self.logits = tf.matmul(self.final_state, softmax_w) + softmax_b
        predictions = tf.nn.softmax(self.logits)
        self.predictions = tf.argmax(predictions, 1, name='predictions')`
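
One possible reading of `seq_lens(...) > input.dims(1)`, offered as a sketch rather than a confirmed diagnosis: at least one value fed as `sequence_length` is larger than the size of axis 1 (the time axis) of the tensor given to the backward RNN. In a C-LSTM that tensor comes out of the convolution layers (the op's input in the traceback is `concat`), and a stride-1 convolution without padding shortens a sequence of length `L` to `L - k + 1` for filter size `k`. The plain-Python example below is not code from this repository; `conv_time_steps`, `clip_lengths`, and the numbers are hypothetical and assume stride 1 with 'VALID' padding.

```python
def conv_time_steps(max_length, filter_size):
    """Time steps left after a stride-1, unpadded ('VALID') convolution
    along the sequence axis. Both arguments are hypothetical values."""
    if not 1 <= filter_size <= max_length:
        raise ValueError("filter_size must be between 1 and max_length")
    return max_length - filter_size + 1


def clip_lengths(seq_lens, time_steps):
    """Cap every sequence length at the number of time steps the RNN
    actually receives, so no length exceeds the input's time axis."""
    return [min(int(n), time_steps) for n in seq_lens]


# Assumed setup: sentences padded to 50 tokens, convolution filter size 3.
time_steps = conv_time_steps(max_length=50, filter_size=3)
print(time_steps)                              # 48
print(clip_lengths([50, 12, 48], time_steps))  # [48, 12, 48]
```

If this reading is right, feeding the clipped lengths as `sequence_length` would satisfy the check. Whether clstm_classifier.py uses these exact convolution settings is an assumption to verify against the code.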