Closed: hsekol-hub closed this issue 4 years ago.
@lokesharma-dev
Could you share the TF version you are using? Please share a Colab link or simple standalone code with proper indentation, along with any supporting files, so we can reproduce the issue in our environment. It helps us localize the issue faster. Thanks!
@ravikyram TensorFlow 2.2.0. Colab link: https://colab.research.google.com/drive/1sdv-sGE80HwdPKkl_p3gakhkxgKO-zRv?usp=sharing
The required files are attached. In case anything is not accessible, kindly notify me. Thanks for your time.
I have tried this in Colab with TF 2.2 and nightly (2.3.0-dev20200614) and was able to reproduce the issue. Please find the gist here. Thanks!
System information
Hello, I am trying to build a model that relies on the concept of virtual adversarial training (VAT) for text classification. The model uses the functional API of Keras, and I have been able to build the model graph. However, something goes wrong when I try to fit my training dataset on the model. Please suggest where I might be going wrong in the process.

Virtual adversarial training involves calculating the KL divergence between two probability distributions. The shape of the training data is displayed as well, to give a better picture of the problem. The code is inspired by the links below, but I have changed the domain to text and extended the approach.

More information:
Code: https://gist.github.com/divamgupta/c778c17459c1f162e789560d5e0b2f0b
Theory: https://divamgupta.com/unsupervised-learning/semi-supervised-learning/2019/05/31/introduction-to-virtual-adversarial-training.html
Google Colab link for the code below: https://colab.research.google.com/drive/1sdv-sGE80HwdPKkl_p3gakhkxgKO-zRv?usp=sharing
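For reference, the quantity that `compute_kld` below is meant to compute is the per-example KL divergence between the two softmax distributions $p$ and $q$:

$$D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \left( \log p_i - \log q_i \right)$$

which is zero when the two distributions coincide and positive otherwise.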
CODE
```python
import numpy as np
import random
import time

# ------------------- TensorFlow -------------------
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Embedding, Dense, LSTM, Bidirectional

MAX_VOCAB_SIZE = len(word_index) + 1  # maximum no. of unique words (word_index comes from the tokenizer)
MAX_DOC_LENGTH = 500                  # maximum no. of words in each sentence
EMBEDDING_DIM = 300                   # embedding dimension from the GloVe directory
def compute_kld(p_logit, q_logit):
    p = tf.nn.softmax(p_logit)
    q = tf.nn.softmax(q_logit)
    kl_score = tf.reduce_sum(
        p * (tf.math.log(p + 1e-16) - tf.math.log(q + 1e-16)), axis=1)
    return kl_score  # lower KL means the distributions are closer
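# Optional eager sanity check (not part of the original repro; expected values
# are assumptions): identical logits should give ~0, different logits > 0.
# print(compute_kld(tf.constant([[1., 2., 3.]]), tf.constant([[1., 2., 3.]])))  # ~[0.]
# print(compute_kld(tf.constant([[1., 2., 3.]]), tf.constant([[3., 2., 1.]])))  # positive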
inputs = Input(shape=(MAX_DOC_LENGTH,), name="Seq_Input")  # text sequence is the first input
# inputs = tf.Variable(tf.zeros(shape=(MAX_DOC_LENGTH,)))  # disabled: rebinding `inputs` to a Variable would break keras.Model(inputs, outputs) below
def createEmbd(inputs):
    # Creates embeddings for a sequence of words
    return layers.Embedding(input_dim=MAX_VOCAB_SIZE,
                            output_dim=EMBEDDING_DIM,
                            input_length=MAX_DOC_LENGTH,
                            trainable=True,
                            mask_zero=True,
                            name="Keras_Embedding")(inputs)
input_emb = createEmbd(inputs)
noise_emb = tf.random.uniform(shape=tf.shape(input_emb))  # idea is to add noise to these embeddings
noise_emb = input_emb + noise_emb  # equivalently: tf.math.add(input_emb, noise_emb)
input_h1 = layers.LSTM(units=128, name="Input_h1")(input_emb)
noise_h1 = layers.LSTM(units=128, name="Noise_h1")(noise_emb)

p_logit = layers.Dense(units=16, activation='relu', name="p_logit")(input_h1)
p_logit_r = layers.Dense(units=16, activation='relu', name="p_logit_r")(noise_h1)
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(noise_emb)
    kl_score = compute_kld(p_logit, p_logit_r)
    kl_score = tf.convert_to_tensor(kl_score, dtype=tf.float32)
grads = tape.gradient(kl_score, noise_emb)  # differentiate kl_score with respect to noise_emb

# p_logit = tf.stop_gradient(p_logit)
# p_logit_r = tf.stop_gradient(p_logit_r)

# For some reason the first execution returned "None" for the gradients, so a
# tensor of the matching shape is substituted to be able to build the model
if grads is None:
    grads = tf.random.uniform(shape=tf.shape(noise_emb))
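# Note on the None gradients (my reading of the cause, not confirmed):
# noise_emb, p_logit and p_logit_r are symbolic functional-API tensors, and
# the ops connecting them are graph-construction steps, not eager computations
# the tape can record. tf.GradientTape only differentiates operations it
# actually records on concrete tensors, so mixing it with functional-API model
# building returns None here; see the train_step sketch at the end.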
vadv_emb = tf.math.add(input_emb, grads)
vadv_h1 = layers.LSTM(units=128, name="vadv_h1")(vadv_emb)
q_logit = layers.Dense(units=16, activation='relu', name="q_logit")(vadv_h1)

vat_loss = compute_kld(p_logit, q_logit)  # this VAT loss (a scalar) needs to be added to the final cost function
# logits = layers.average([p_logit, p_logit_r, q_logit])
outputs = layers.Dense(units=1, activation='softmax', name="output")(p_logit)  # NB: softmax over a single unit always outputs 1; sigmoid is the usual pairing with binary crossentropy
model = keras.Model(inputs, outputs)
model.add_loss(vat_loss)

# Not sure if this graph has any problem
keras.utils.plot_model(model, show_shapes=True, show_layer_names=True)
model.summary()

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy', 'precision'])
# Shuffle data randomly before splitting
indices = np.arange(sequences.shape[0])
random.Random(1).shuffle(indices)
data = sequences[indices]
labels = y[indices]

num_test_samples = int(0.2 * data.shape[0])
x_train = data[:-num_test_samples]
y_train = labels[:-num_test_samples]
x_test = data[-num_test_samples:]
y_test = labels[-num_test_samples:]
print(x_train.shape, y_train.shape)        # (400, 500) (400,)
print(x_train[0].shape, y_train[0].shape)  # (500,) ()
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset.take(1)
# Output: <TakeDataset shapes: ((500,), ()), types: (tf.int32, tf.int64)>
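# Note (likely relevant to the warning below): from_tensor_slices yields an
# *unbatched* dataset whose elements have shape (500,), and model.fit() does
# not accept a batch_size argument for tf.data inputs. Batching explicitly
# would restore the expected (None, 500) input shape:
# train_dataset = train_dataset.batch(40)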
model.fit(train_dataset, epochs=2, batch_size=40)
model.fit(x_train, y_train, epochs=2, validation_split=0.2, shuffle=True, batch_size=32)
```
Any other info/logs

Error:
```
Epoch 1/2
WARNING:tensorflow:Model was constructed with shape (None, 500) for input Tensor("Seq_Input:0", shape=(None, 500), dtype=float32), but it was called on an input with incompatible shape (500, 1).
ValueError                                Traceback (most recent call last)
```
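For anyone landing here: below is a minimal sketch, not a confirmed fix, of how the VAT gradient could instead be computed per batch inside a custom `train_step` (available since TF 2.2), where `GradientTape` operates on concrete batch tensors rather than symbolic functional-API ones. The `VATModel` class, layer sizes, and the perturbation scale `0.1` are illustrative assumptions; `compute_kld` is the function defined above.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class VATModel(keras.Model):
    """Sketch: binary text classifier with a per-batch VAT penalty."""

    def __init__(self, vocab_size, emb_dim, **kwargs):
        super().__init__(**kwargs)
        self.emb = layers.Embedding(vocab_size, emb_dim)
        self.lstm = layers.LSTM(128)
        self.logit = layers.Dense(16)                     # pre-softmax logits for the KL term
        self.out = layers.Dense(1, activation="sigmoid")  # sigmoid for binary crossentropy

    def logits_from_emb(self, emb):
        return self.logit(self.lstm(emb))

    def call(self, x, training=False):
        return self.out(self.logits_from_emb(self.emb(x)))

    def train_step(self, data):
        x, y = data
        emb = self.emb(x)
        noise = tf.random.uniform(tf.shape(emb))
        # Tape 1: direction in embedding space that most increases the KL.
        with tf.GradientTape() as vat_tape:
            vat_tape.watch(noise)
            kl = tf.reduce_mean(compute_kld(
                tf.stop_gradient(self.logits_from_emb(emb)),
                self.logits_from_emb(emb + noise)))
        d = vat_tape.gradient(kl, noise)         # concrete tensors, so no longer None
        d = tf.math.l2_normalize(d, axis=[1, 2])
        # Tape 2: ordinary training step with the VAT penalty added on top.
        with tf.GradientTape() as tape:
            p_logit = self.logits_from_emb(self.emb(x))
            q_logit = self.logits_from_emb(self.emb(x) + 0.1 * d)  # 0.1 = assumed epsilon
            vat_loss = tf.reduce_mean(compute_kld(tf.stop_gradient(p_logit), q_logit))
            y_pred = self.out(p_logit)
            loss = self.compiled_loss(y, y_pred) + vat_loss
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}
```

Compiled and fitted the usual way, with the dataset batched explicitly:

```python
model = VATModel(MAX_VOCAB_SIZE, EMBEDDING_DIM)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_dataset.batch(40), epochs=2)
```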