eliasm56 commented 3 years ago

Hello,

I am trying to add the CRF layer to a U-Net-style convolutional neural network for semantic segmentation in order to refine the predicted masks. I have looked through many projects on GitHub for a CRF implementation, and this is one of the few that seems feasible for my use. However, I have run into an error that I cannot get past.

Below is how I prepared and loaded my training images and their corresponding masks that would be fed to the model:

# Create new arrays to store testing images/masks
X_test = np.zeros((len(test_img), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.float32)
Y_test = np.zeros((len(test_masks), IMG_HEIGHT, IMG_WIDTH, 1), dtype=np.float32)

# Resize augmented training images
for n, id_ in tqdm(enumerate(aug_img), total=len(aug_img)):
    # Load images
    img = imread('C:/Users/manos/Desktop/unetResearch/unetproto/dataset/training/images/augmented/' + id_)
    img = resize(img, (IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), mode='constant', preserve_range=True)
    # Save images
    X[n] = img / 255

# Resize augmented training masks
for n, id_ in tqdm(enumerate(aug_masks), total=len(aug_masks)):
    # Load images
    mask = imread('C:/Users/manos/Desktop/unetResearch/unetproto/dataset/training/masks/augmented/' + id_)
    mask = np.expand_dims(resize(mask, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True), axis=-1)
    # Save masks
    Y[n] = mask / 255

And this is the convolutional neural network to which I am trying to add the CRF layer:

# This is the best UNET model as of now

# Encoder
sequence_input = tf.keras.layers.Input(shape=(256, 256, 3), dtype=tf.float32, name='sequence_input')
sequence_mask = tf.keras.layers.Lambda(lambda x: tf.greater(x, 0))(sequence_input)
c1 = tf.keras.layers.Conv2D(16, (3, 3), kernel_initializer='he_normal', padding='same')(sequence_input)
n1 = tf.keras.layers.BatchNormalization()(c1)
a1 = tf.keras.activations.relu(n1, alpha=0.0, max_value=None, threshold=0)
c1 = tf.keras.layers.Dropout(0.1)(a1)
c1 = tf.keras.layers.Conv2D(16, (3, 3), kernel_initializer='he_normal', padding='same')(c1)
n1 = tf.keras.layers.BatchNormalization()(c1)
a1 = tf.keras.activations.relu(n1, alpha=0.0, max_value=None, threshold=0)
p1 = tf.keras.layers.MaxPooling2D((2, 2))(a1)

c2 = tf.keras.layers.Conv2D(32, (3, 3), kernel_initializer='he_normal', padding='same')(p1)
n2 = tf.keras.layers.BatchNormalization()(c2)
a2 = tf.keras.activations.relu(n2, alpha=0.0, max_value=None, threshold=0)
c2 = tf.keras.layers.Dropout(0.1)(a2)
c2 = tf.keras.layers.Conv2D(32, (3, 3), kernel_initializer='he_normal', padding='same')(c2)
n2 = tf.keras.layers.BatchNormalization()(c2)
a2 = tf.keras.activations.relu(n2, alpha=0.0, max_value=None, threshold=0)
p2 = tf.keras.layers.MaxPooling2D((2, 2))(a2)

c3 = tf.keras.layers.Conv2D(64, (3, 3), kernel_initializer='he_normal', padding='same')(p2)
n3 = tf.keras.layers.BatchNormalization()(c3)
a3 = tf.keras.activations.relu(n3, alpha=0.0, max_value=None, threshold=0)
c3 = tf.keras.layers.Dropout(0.2)(a3)
c3 = tf.keras.layers.Conv2D(64, (3, 3), kernel_initializer='he_normal', padding='same')(c3)
n3 = tf.keras.layers.BatchNormalization()(c3)
a3 = tf.keras.activations.relu(n3, alpha=0.0, max_value=None, threshold=0)
p3 = tf.keras.layers.MaxPooling2D((2, 2))(a3)

c4 = tf.keras.layers.Conv2D(128, (3, 3), kernel_initializer='he_normal', padding='same')(p3)
n4 = tf.keras.layers.BatchNormalization()(c4)
a4 = tf.keras.activations.relu(n4, alpha=0.0, max_value=None, threshold=0)
c4 = tf.keras.layers.Dropout(0.2)(a4)
c4 = tf.keras.layers.Conv2D(128, (3, 3), kernel_initializer='he_normal', padding='same')(c4)
n4 = tf.keras.layers.BatchNormalization()(c4)
a4 = tf.keras.activations.relu(n4, alpha=0.0, max_value=None, threshold=0)
p4 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(a4)

c5 = tf.keras.layers.Conv2D(256, (3, 3), kernel_initializer='he_normal', padding='same')(p4)
n5 = tf.keras.layers.BatchNormalization()(c5)
a5 = tf.keras.activations.relu(n5, alpha=0.0, max_value=None, threshold=0)
c5 = tf.keras.layers.Dropout(0.3)(a5)
c5 = tf.keras.layers.Conv2D(256, (3, 3), kernel_initializer='he_normal', padding='same')(c5)
n5 = tf.keras.layers.BatchNormalization()(c5)
a5 = tf.keras.activations.relu(n5, alpha=0.0, max_value=None, threshold=0)

# Decoder
u6 = tf.keras.layers.Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(a5)
u6 = tf.keras.layers.concatenate([u6, a4])
c6 = tf.keras.layers.Conv2D(128, (3, 3), kernel_initializer='he_normal', padding='same')(u6)
n6 = tf.keras.layers.BatchNormalization()(c6)
a6 = tf.keras.activations.relu(n6, alpha=0.0, max_value=None, threshold=0)
c6 = tf.keras.layers.Dropout(0.2)(a6)
c6 = tf.keras.layers.Conv2D(128, (3, 3), kernel_initializer='he_normal', padding='same')(c6)
n6 = tf.keras.layers.BatchNormalization()(c6)
a6 = tf.keras.activations.relu(n6, alpha=0.0, max_value=None, threshold=0)

u7 = tf.keras.layers.Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(a6)
u7 = tf.keras.layers.concatenate([u7, a3])
c7 = tf.keras.layers.Conv2D(64, (3, 3), kernel_initializer='he_normal', padding='same')(u7)
n7 = tf.keras.layers.BatchNormalization()(c7)
a7 = tf.keras.activations.relu(n7, alpha=0.0, max_value=None, threshold=0)
c7 = tf.keras.layers.Dropout(0.2)(a7)
c7 = tf.keras.layers.Conv2D(64, (3, 3), kernel_initializer='he_normal', padding='same')(c7)
n7 = tf.keras.layers.BatchNormalization()(c7)
a7 = tf.keras.activations.relu(n7, alpha=0.0, max_value=None, threshold=0)

u8 = tf.keras.layers.Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(a7)
u8 = tf.keras.layers.concatenate([u8, a2])
c8 = tf.keras.layers.Conv2D(32, (3, 3), kernel_initializer='he_normal', padding='same')(u8)
n8 = tf.keras.layers.BatchNormalization()(c8)
a8 = tf.keras.activations.relu(n8, alpha=0.0, max_value=None, threshold=0)
c8 = tf.keras.layers.Dropout(0.1)(a8)
c8 = tf.keras.layers.Conv2D(32, (3, 3), kernel_initializer='he_normal', padding='same')(c8)
n8 = tf.keras.layers.BatchNormalization()(c8)
a8 = tf.keras.activations.relu(n8, alpha=0.0, max_value=None, threshold=0)

u9 = tf.keras.layers.Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same')(a8)
u9 = tf.keras.layers.concatenate([u9, a1], axis=3)
c9 = tf.keras.layers.Conv2D(16, (3, 3), kernel_initializer='he_normal', padding='same')(u9)
n9 = tf.keras.layers.BatchNormalization()(c9)
a9 = tf.keras.activations.relu(n9, alpha=0.0, max_value=None, threshold=0)
c9 = tf.keras.layers.Dropout(0.1)(a9)
c9 = tf.keras.layers.Conv2D(16, (3, 3), kernel_initializer='he_normal', padding='same')(c9)
n9 = tf.keras.layers.BatchNormalization()(c9)
a9 = tf.keras.activations.relu(n9, alpha=0.0, max_value=None, threshold=0)

crf = CRF(2)
outputs = crf(a9, mask=sequence_mask)

But I receive an error: "ValueError: Input mask to CRF must have dim 2 if not None"

Can someone please help me figure out a solution?

luozhouyang commented 3 years ago

Your sequence_mask has shape (256, 256, 3) per sample (rank 4 once the batch axis is counted), but the mask passed to CRF must have 2 dims. Likewise, the inputs to CRF must have rank 3, but your inputs (a9) have rank 4.
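To make the shape requirement concrete, here is a minimal NumPy sketch. It is not part of keras-crf, and the function name `flatten_for_crf` is hypothetical. It assumes one chooses to flatten the H x W pixel grid into a single sequence axis purely to satisfy the rank requirements; whether a linear-chain CRF over flattened pixels actually helps segmentation is a separate question that this sketch does not answer.

```python
import numpy as np


def flatten_for_crf(features, pixel_mask=None):
    """Reshape (batch, H, W, C) features to (batch, H*W, C) and build a (batch, H*W) mask.

    Hypothetical helper: it only illustrates the rank-3 input / rank-2 mask contract.
    """
    batch, height, width, channels = features.shape
    seq = features.reshape(batch, height * width, channels)
    if pixel_mask is None:
        # No padding in an image, so every position is valid.
        mask = np.ones((batch, height * width), dtype=bool)
    else:
        # pixel_mask is assumed to have shape (batch, H, W).
        mask = pixel_mask.reshape(batch, height * width).astype(bool)
    return seq, mask


# Small stand-in for the decoder output a9 (the real one would be (batch, 256, 256, 16)).
features = np.random.rand(2, 4, 4, 16).astype(np.float32)
seq, mask = flatten_for_crf(features)
print(seq.shape, mask.shape)
```

The same reshape could be expressed inside a Keras model with a reshape layer; the point here is only that the CRF sees one sequence axis and one feature axis, and a mask with no feature axis at all.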

ktakanopy commented 3 years ago

@luozhouyang can you provide a working example?

UrszulaCzerwinska commented 3 years ago

@luozhouyang I am working on sequence classification (same structure as NER). My sequence mask has 3 dims, exactly like my input (#batches, #words, #dims). How can I reduce it to 2 dims to fit your CRF? The input to the CRF has 3 dims (#batches, #words, #classes).
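One way to collapse a (#batches, #words, #dims) mask to (#batches, #words) is to mark a timestep as valid when any of its features differs from the padding value, which mirrors the rule the Keras Masking layer applies per timestep. A minimal NumPy sketch, assuming padded timesteps are all zeros; `timestep_mask` is a hypothetical helper, not a keras-crf function:

```python
import numpy as np


def timestep_mask(inputs, pad_value=0.0):
    """Return a (batch, steps) boolean mask.

    A timestep is True (valid) if at least one feature differs from pad_value,
    and False (padding) if every feature equals pad_value.
    """
    return np.any(inputs != pad_value, axis=-1)


# Two sequences of 10 timesteps with 100 features each; trailing timesteps are padding.
x = np.zeros((2, 10, 100), dtype=np.float32)
x[0, :6] = 1.0   # first sequence has 6 real timesteps
x[1, :3] = 0.5   # second sequence has 3 real timesteps

mask = timestep_mask(x)
print(mask.shape)
print(mask.sum(axis=1))
```

A real timestep whose features happen to be all zeros would be masked too, so this only works when the padding value cannot occur as genuine data.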

UrszulaCzerwinska commented 3 years ago

Maybe the model summary helps:

Model: "model_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_10 (InputLayer)        [(None, 10, 100)]         0         
_________________________________________________________________
time_distributed_8 (TimeDist (None, 10, 516)           2300680   
_________________________________________________________________
bidirectional_9 (Bidirection (None, 10, 516)           1599600   
_________________________________________________________________
time_distributed_9 (TimeDist (None, 10, 6)             3102      
_________________________________________________________________
masking_3 (Masking)          (None, 10, 6)             0         
=================================================================
Total params: 3,903,382
Trainable params: 3,903,382
Non-trainable params: 0

luozhouyang commented 3 years ago

Hey guys, sorry for taking so long to reply. I refactored the code, and the error mentioned above is fixed. You can find an example in the README.

UrszulaCzerwinska commented 3 years ago

Hi, I just tested the updated version, and now I am getting:

 ValueError: Shape must be rank 2 but is rank 3 for '{{node cond/Slice}} = Slice[Index=DT_INT32, T=DT_INT32](cond/add_1/Cast, cond/Slice/begin, cond/Slice/size)' with input shapes: [?,10,6], [2], [2].

luozhouyang commented 3 years ago

Can you paste the code that builds the model?

UrszulaCzerwinska commented 3 years ago

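# Imports assumed for this snippet; the original post omitted them.
import tensorflow as tf
from tensorflow.keras.layers import (Bidirectional, Dense, Embedding, GRU,
                                     Input, LSTM, TimeDistributed)
from tensorflow.keras.models import Model
from keras_crf import CRFModel  # import path as shown in the keras-crf README

# HierarchicalAttentionNetwork is a custom layer defined elsewhere in the notebook,
# as are word_index, embedding_matrix, embedding_dim, maxlen, max_sentences and n_tags.

# Sentence encoder: frozen embeddings -> BiGRU -> attention
embedding_layer = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                            input_length=maxlen, trainable=False)

sentence_input = Input(shape=(maxlen,), dtype='int32')
embedded_sequences = embedding_layer(sentence_input)
lstm_word = Bidirectional(GRU(258, return_sequences=True))(embedded_sequences)
attn_word = HierarchicalAttentionNetwork(100)(lstm_word)
sentenceEncoder = Model(sentence_input, attn_word)

# Review encoder: apply the sentence encoder to every sentence, then a BiLSTM
review_input = Input(shape=(max_sentences, maxlen), dtype='int32')

review_encoder = TimeDistributed(sentenceEncoder)(review_input)
lstm_sentence = Bidirectional(LSTM(258, return_sequences=True))(review_encoder)

final_dense = TimeDistributed(Dense(n_tags, activation="relu"))(lstm_sentence)

base = Model(review_input, final_dense)
model = CRFModel(base, n_tags)

# no need to specify a loss for CRFModel, model will compute crf loss by itself
model.compile(
    optimizer=tf.keras.optimizers.Adam(3e-4),
    metrics=['acc'],
    )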

UrszulaCzerwinska commented 3 years ago

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_8 (InputLayer)         [(None, 10, 100)]         0         
_________________________________________________________________
time_distributed_5 (TimeDist (None, 10, 516)           2300680   
_________________________________________________________________
bidirectional_7 (Bidirection (None, 10, 516)           1599600   
_________________________________________________________________
time_distributed_6 (TimeDist (None, 10, 6)             3102      
=================================================================
Total params: 3,903,382
Trainable params: 2,211,782
Non-trainable params: 1,691,600
_____________________________________________________________

luozhouyang commented 3 years ago

Can you put the reproducible code in Colab, and paste the share link here?

UrszulaCzerwinska commented 3 years ago

https://colab.research.google.com/drive/1xKH0sRljeoRnyf16_lODQGHsIALHfER0

Please download the data from the GitHub link mentioned in the notebook.

luozhouyang commented 3 years ago

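The labels do not need to be one-hot encoded. Pass them as integer tag ids of shape (batch, sequence_length); the error above reports a rank-3 tensor of shape [?,10,6] where rank 2 was expected.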
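If the labels already exist only in one-hot form, they can be converted back to integer tag ids with an argmax over the last axis. A minimal NumPy sketch; the variable names are illustrative and the tag sequence is just sample data with 6 classes:

```python
import numpy as np

n_tags = 6

# Integer tag ids, shape (batch, steps) = (1, 10).
tag_ids = np.array([[1, 1, 3, 3, 3, 4, 0, 0, 0, 0]], dtype=np.int32)

# One-hot version, shape (1, 10, 6): this is the rank-3 form to avoid feeding as labels.
one_hot = np.eye(n_tags, dtype=np.float32)[tag_ids]

# Undo the one-hot encoding to get back the rank-2 integer labels.
recovered = np.argmax(one_hot, axis=-1).astype(np.int32)

print(one_hot.shape, recovered.shape)
```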

UrszulaCzerwinska commented 3 years ago

OK, I will try without one-hot encoding.

On Fri, 7 May 2021 at 10:18, luozhouyang @.***> wrote:

Labels do not need to be one-hot encoded.

luozhouyang commented 3 years ago

Just change these lines:

# labels = [to_categorical(list(lab), num_classes=6) for lab in new_labs]
labels = new_labs
print(labels[0])
print(labels[0:3])
labels[0]
labels = np.array(labels)

And you can get training results like this:

UrszulaCzerwinska commented 3 years ago

I adapted the code as you suggested.

I added zero_mask=True to mask the padding.

Why are there predictions for the "padded" values in Y?

For instance:

y_test[0] = array([1, 1, 3, 3, 3, 4, 0, 0, 0, 0], dtype=int32)
y_pred[0] = array([1, 1, 1, 2, 3, 4, 3, 3, 3, 3])

Is that expected behavior?

Are these predictions included when the accuracy is computed? When I leave out the predictions at positions labelled 0, the score is better.
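For comparison, accuracy restricted to non-padded positions can be computed by hand. This is a minimal NumPy sketch, assuming label 0 is used only for padding; `masked_accuracy` is a hypothetical helper and says nothing about how keras-crf computes its own 'acc' metric:

```python
import numpy as np


def masked_accuracy(y_true, y_pred, pad_id=0):
    """Accuracy over positions whose true label is not pad_id."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    keep = y_true != pad_id          # True at real (non-padded) positions
    if not keep.any():
        return 0.0                   # nothing to score
    return float((y_true[keep] == y_pred[keep]).mean())


# The example pair from this comment.
y_test0 = np.array([1, 1, 3, 3, 3, 4, 0, 0, 0, 0], dtype=np.int32)
y_pred0 = np.array([1, 1, 1, 2, 3, 4, 3, 3, 3, 3])

masked = masked_accuracy(y_test0, y_pred0)
unmasked = float((y_test0 == y_pred0).mean())
print(masked, unmasked)
```

On this pair, 4 of the 6 real positions are correct, while the unmasked score counts the four padded positions as errors and drops to 4 of 10.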

https://colab.research.google.com/drive/1xKH0sRljeoRnyf16_lODQGHsIALHfER0?usp=sharing

luozhouyang / keras-crf

How to fix "ValueError: Input mask to CRF must have dim 2 if not None"? #1
