Hi there,
Issue
I'm converting the model for deployment. The problem is that during evaluation, the dropout layer is still applied — for example in MultiheadAttention, inherited from mmcv.cnn.bricks.transformer. In my view, dropout should behave as Identity during inference.
Observation
I added the following check in MultiheadAttention.forward():

```python
if not self.training:
    print(">> model in eval() mode <<")
    print("self.dropout_layer: {}".format(self.dropout_layer))
```

The output is

```
>> model in eval() mode <<
self.dropout_layer: Dropout(p=0.1, inplace=False)
```
which matches the configs.
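For what it's worth, a minimal standalone sketch (plain torch.nn.Dropout, not the mmcv wrapper) suggests the printed repr alone may not tell the whole story: the repr always shows the configured p, even when the module is in eval() mode, so comparing the layer's input and output might be a more direct check of whether dropout is actually being applied:

```python
import torch
import torch.nn as nn

# Plain nn.Dropout, mirroring the p=0.1 seen in the config above.
drop = nn.Dropout(p=0.1)
x = torch.ones(4, 4)

# The repr is identical in train and eval mode; it reflects the
# configured probability, not whether dropout is currently active.
drop.eval()
print(drop)  # Dropout(p=0.1, inplace=False)

# Direct behavioral check: in eval() mode the output equals the input.
out_eval = drop(x)

# In train() mode, elements are randomly zeroed and the rest are
# rescaled by 1/(1-p).
drop.train()
out_train = drop(x)
```

This is only a check against the standalone module; whether the mmcv code path differs is exactly the question.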
I found the discussion here, but I don't think it helps: https://discuss.pytorch.org/t/transformer-dropout-at-inference-time/97006
Question
Do you happen to know why the dropout layer is applied in eval() mode? I believe this happens in BEVFormer and other models as well.
Best, Lewis