Open albertsokol opened 2 years ago
Hi, is there any progress on this issue? As far as I'm aware, this behavior is not documented in the paper; it would be nice to understand why this choice was made in the publicly released checkpoint.
Loading the model with `has_relative_attention_bias` and `has_spatial_attention_bias` explicitly set to `true` leads to the following warning:

```
Some weights were not initialized from the model checkpoint at [local path] and are newly initialized: ['layoutlmv2.encoder.rel_pos_bias.weight', 'layoutlmv2.encoder.rel_pos_y_bias.weight', 'layoutlmv2.encoder.rel_pos_x_bias.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```

This suggests that pre-training was indeed performed without these biases.
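For reference, a minimal sketch of how the flags can be overridden (assuming the `transformers` LayoutLMv2 classes, which LayoutXLM shares; the checkpoint path itself is elided here as in the warning above):

```python
from transformers import LayoutLMv2Config

# Build a LayoutLMv2/LayoutXLM config with both spatial-attention flags
# enabled; the released layoutxlm-base config.json sets them to false.
# Passing a config like this via
#   LayoutLMv2Model.from_pretrained(checkpoint_path, config=config)
# reproduces the "newly initialized" warning quoted above, because the
# checkpoint contains no rel_pos_bias / rel_pos_x_bias / rel_pos_y_bias
# weights for these layers to load.
config = LayoutLMv2Config(
    has_relative_attention_bias=True,
    has_spatial_attention_bias=True,
)

print(config.has_relative_attention_bias, config.has_spatial_attention_bias)
```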
Hi all, many thanks for your great work on the LayoutXLM paper.
I understand that spatial-aware self-attention is used in this architecture: the ablation study you performed and presented in the LayoutLMv2 paper demonstrated a good improvement in model accuracy when spatial-aware self-attention was used. However, the model that is publicly available in the Hugging Face repo does not use spatial-aware self-attention: the `has_relative_attention_bias` and `has_spatial_attention_bias` flags are set to `false`. See here for the `config.json` file; consequently, these lines in the code are not reached.

Why was spatial-aware self-attention not used in the training of the LayoutXLM base model here? Did you find that the performance was worse when using it in the multilingual setting? Keen to learn more about this. Thank you.