grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

Question on getting different output values when running self.text_field_embedder(tokens) multiple times #157

Closed · MichaelCaohn closed this 2 years ago

MichaelCaohn commented 2 years ago

Hi,

I have observed strange behavior from "encoded_text = self.text_field_embedder(tokens)" in the forward function of seq2labels_model.py (https://github.com/grammarly/gector/blob/master/gector/seq2labels_model.py#L132).

During training, I repeated the call "encoded_text = self.text_field_embedder(tokens)" three times in a row and printed the value of encoded_text after each call, along the lines of the sketch below.
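A minimal sketch of the idea, assuming it runs inside Seq2Labels.forward; the loop and print format are illustrative rather than the exact code:

```python
# Hypothetical repro inside Seq2Labels.forward(), in training mode.
# `tokens` is the same input batch for all three calls, and no
# optimizer step happens in between.
for i in range(3):
    encoded_text = self.text_field_embedder(tokens)
    print(f"call {i}:", encoded_text[0, 0, :5])  # first few embedding values
```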

I expected three identical outputs, since the input is the same and the model weights have not been updated yet (optimizer.step() has not been called).

However, the three printed values of encoded_text are all different.

This happens during training regardless of whether the encoder (BERT) is frozen or not.

However, during the testing phase I do get three identical outputs.

Can anyone explain why this happens? Is this the correct behavior, or did something go wrong?

Thanks a lot:)

skurzhanskyi commented 2 years ago

I think the reason behind this is the dropout layers in the model. During the evaluation stage they are turned off.
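For illustration, a minimal standalone PyTorch sketch (not GECToR code) of why repeated forward passes differ in train mode but not in eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the embedder: the real BERT encoder also
# contains nn.Dropout layers internally.
model = nn.Sequential(nn.Linear(16, 16), nn.Dropout(p=0.5))
x = torch.randn(4, 16)

model.train()  # training mode: dropout samples a fresh random mask per call
print(torch.equal(model(x), model(x)))  # False (masks differ)

model.eval()   # evaluation mode: dropout is a no-op, forward is deterministic
print(torch.equal(model(x), model(x)))  # True
```

Note that freezing the encoder (setting requires_grad=False on its parameters) only stops gradient updates; it does not turn dropout off. Only switching the module to eval mode does, which matches the behavior observed above.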

MichaelCaohn commented 2 years ago

Understood, thanks a lot:)