keras-team / keras-contrib

Keras community contributions
MIT License
1.58k stars 650 forks

questions on padding and masking for the value=1000 #553

Open KNiu00 opened 3 years ago

KNiu00 commented 3 years ago

The code is as follows. It is from https://www.tensorflow.org/guide/keras/masking_and_padding

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

raw_inputs = [
    [711, 632, 71],
    [73, 8, 3215, 55, 927],
    [83, 91, 1, 645, 1253, 927],
]

# By default, this will pad using 0s; it is configurable via the
# "value" parameter.
# Note that you could use "pre" padding (at the beginning) or
# "post" padding (at the end).
# We recommend using "post" padding when working with RNN layers
# (in order to be able to use the
# CuDNN implementation of the layers).
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(
    raw_inputs, padding="post", value=1000
)
print(padded_inputs)

The output is

[[ 711  632   71 1000 1000 1000]
 [  73    8 3215   55  927 1000]
 [  83   91    1  645 1253  927]]

Then, I want to create an embedding layer.

embedding = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
masked_output = embedding(padded_inputs)

print(masked_output._keras_mask)

When value=0, I get the correct output:

tf.Tensor(
[[ True  True  True False False False]
 [ True  True  True  True  True False]
 [ True  True  True  True  True  True]], shape=(3, 6), dtype=bool)

The problem is with value=1000. The output is:

tf.Tensor(
[[ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]], shape=(3, 6), dtype=bool)

which is not what I want.

So may I know how to make the embedding layer treat the padding value 1000 in the padded inputs as masked, please?

Many thanks. Kai

Aygle commented 2 years ago

padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(
    raw_inputs, padding="post", value=0
)
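
For context on why the value=1000 run shows an all-True mask: with mask_zero=True, the Embedding layer only treats index 0 as padding, so any other padding value is seen as a regular token. If padding with 0 is not an option, one possible workaround is to build the boolean mask by hand from the padded array. The sketch below uses only NumPy and the padded output shown in the question; the names PAD_VALUE, mask and lengths are illustrative, and it assumes 1000 never occurs as a real token ID.

```python
import numpy as np

PAD_VALUE = 1000  # padding value from the question (assumed never to be a real token ID)

# Padded batch copied from the output shown in the question.
padded_inputs = np.array([
    [711, 632, 71, 1000, 1000, 1000],
    [73, 8, 3215, 55, 927, 1000],
    [83, 91, 1, 645, 1253, 927],
])

# True where there is a real token, False where the position is padding.
mask = padded_inputs != PAD_VALUE
print(mask)

# Number of real (unpadded) tokens in each sequence.
lengths = mask.sum(axis=1)
print(lengths)
```

A mask built this way has the same shape and meaning as the one produced for value=0. It could then be handed explicitly to a mask-aware layer through the mask argument of its call (for example an LSTM applied to the output of an Embedding created without mask_zero), after converting it to a boolean tensor. That last step is not exercised in the sketch above, so treat it as a suggestion rather than a tested recipe.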