clovaai / spade

Apache License 2.0

Updating weight name mismatch #13

Closed: thomassajot closed this issue 2 years ago

thomassajot commented 2 years ago

Thank you for sharing the code!

I am experiencing some issues with the model weight update:

The weight of the pretraiend model bert-base-multilingual-cased
The # of weights 1.78e+08
pretrained bert-base-multilingual-cased is used
!!!!embeddings.position_ids model param. is not presented in child model!!!!
embeddings.word_embeddings.weight updated
!!!!embeddings.position_embeddings.weight model param. is not presented in child model!!!!
embeddings.token_type_embeddings.weight updated
embeddings.LayerNorm.weight updated
embeddings.LayerNorm.bias updated
encoder.layer.0.attention.self.query.weight updated
encoder.layer.0.attention.self.query.bias updated
encoder.layer.0.attention.self.key.weight updated
encoder.layer.0.attention.self.key.bias updated
encoder.layer.0.attention.self.value.weight updated
encoder.layer.0.attention.self.value.bias updated
encoder.layer.0.attention.output.dense.weight updated
encoder.layer.0.attention.output.dense.bias updated
encoder.layer.0.attention.output.LayerNorm.weight updated
encoder.layer.0.attention.output.LayerNorm.bias updated
encoder.layer.0.intermediate.dense.weight updated
encoder.layer.0.intermediate.dense.bias updated
encoder.layer.0.output.dense.weight updated
encoder.layer.0.output.dense.bias updated
encoder.layer.0.output.LayerNorm.weight updated
encoder.layer.0.output.LayerNorm.bias updated
encoder.layer.1.attention.self.query.weight updated
encoder.layer.1.attention.self.query.bias updated
encoder.layer.1.attention.self.key.weight updated
encoder.layer.1.attention.self.key.bias updated
encoder.layer.1.attention.self.value.weight updated
encoder.layer.1.attention.self.value.bias updated
encoder.layer.1.attention.output.dense.weight updated
encoder.layer.1.attention.output.dense.bias updated
encoder.layer.1.attention.output.LayerNorm.weight updated
encoder.layer.1.attention.output.LayerNorm.bias updated
encoder.layer.1.intermediate.dense.weight updated
encoder.layer.1.intermediate.dense.bias updated
encoder.layer.1.output.dense.weight updated
encoder.layer.1.output.dense.bias updated
encoder.layer.1.output.LayerNorm.weight updated
encoder.layer.1.output.LayerNorm.bias updated
encoder.layer.2.attention.self.query.weight updated
encoder.layer.2.attention.self.query.bias updated
encoder.layer.2.attention.self.key.weight updated
encoder.layer.2.attention.self.key.bias updated
encoder.layer.2.attention.self.value.weight updated
encoder.layer.2.attention.self.value.bias updated
encoder.layer.2.attention.output.dense.weight updated
encoder.layer.2.attention.output.dense.bias updated
encoder.layer.2.attention.output.LayerNorm.weight updated
encoder.layer.2.attention.output.LayerNorm.bias updated
encoder.layer.2.intermediate.dense.weight updated
encoder.layer.2.intermediate.dense.bias updated
encoder.layer.2.output.dense.weight updated
encoder.layer.2.output.dense.bias updated
encoder.layer.2.output.LayerNorm.weight updated
encoder.layer.2.output.LayerNorm.bias updated
encoder.layer.3.attention.self.query.weight updated
encoder.layer.3.attention.self.query.bias updated
encoder.layer.3.attention.self.key.weight updated
encoder.layer.3.attention.self.key.bias updated
encoder.layer.3.attention.self.value.weight updated
encoder.layer.3.attention.self.value.bias updated
encoder.layer.3.attention.output.dense.weight updated
encoder.layer.3.attention.output.dense.bias updated
encoder.layer.3.attention.output.LayerNorm.weight updated
encoder.layer.3.attention.output.LayerNorm.bias updated
encoder.layer.3.intermediate.dense.weight updated
encoder.layer.3.intermediate.dense.bias updated
encoder.layer.3.output.dense.weight updated
encoder.layer.3.output.dense.bias updated
encoder.layer.3.output.LayerNorm.weight updated
encoder.layer.3.output.LayerNorm.bias updated
encoder.layer.4.attention.self.query.weight updated
encoder.layer.4.attention.self.query.bias updated
encoder.layer.4.attention.self.key.weight updated
encoder.layer.4.attention.self.key.bias updated
encoder.layer.4.attention.self.value.weight updated
encoder.layer.4.attention.self.value.bias updated
encoder.layer.4.attention.output.dense.weight updated
encoder.layer.4.attention.output.dense.bias updated
encoder.layer.4.attention.output.LayerNorm.weight updated
encoder.layer.4.attention.output.LayerNorm.bias updated
encoder.layer.4.intermediate.dense.weight updated
encoder.layer.4.intermediate.dense.bias updated
encoder.layer.4.output.dense.weight updated
encoder.layer.4.output.dense.bias updated
encoder.layer.4.output.LayerNorm.weight updated
encoder.layer.4.output.LayerNorm.bias updated
!!!!encoder.layer.5.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.5.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.5.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.6.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.6.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.7.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.7.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.8.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.8.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.9.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.9.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.10.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.10.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.query.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.query.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.key.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.key.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.value.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.self.value.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.attention.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.intermediate.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.intermediate.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.output.dense.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.output.dense.bias model param. is not presented in child model!!!!
!!!!encoder.layer.11.output.LayerNorm.weight model param. is not presented in child model!!!!
!!!!encoder.layer.11.output.LayerNorm.bias model param. is not presented in child model!!!!
!!!!pooler.dense.weight model param. is not presented in child model!!!!
!!!!pooler.dense.bias model param. is not presented in child model!!!!
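
For context, messages of this form are what you would expect from a routine that copies each pretrained parameter whose name also exists in the (smaller) child model and flags the rest. A minimal sketch of that pattern, not the repository's actual function:

```python
import torch

def transfer_matching_weights(child: torch.nn.Module, pretrained_state: dict) -> None:
    """Copy pretrained tensors into the child model wherever name and shape match."""
    child_state = child.state_dict()
    for name, tensor in pretrained_state.items():
        if name in child_state and child_state[name].shape == tensor.shape:
            child_state[name] = tensor
            print(f"{name} updated")
        else:
            print(f"!!!!{name} model param. is not presented in child model!!!!")
    child.load_state_dict(child_state)
```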

Would you have any advice on how to resolve this issue?

I am using the following conda env:

munch                     2.5.0                      py_0    conda-forge
nltk                      3.6.7              pyhd8ed1ab_0    conda-forge
numpy                     1.21.5           py37hf2998dd_0    conda-forge
opencv-python-headless    4.5.2.54                 pypi_0    pypi
pytest                    6.2.5            py37h89c1867_1    conda-forge
python                    3.7.12          hf930737_100_cpython    conda-forge
pytorch                   1.8.1           py3.7_cuda11.1_cudnn8.0.5_0    pytorch
pytorch-lightning         1.3.8              pyhd8ed1ab_0    conda-forge
tensorboard               2.4.1              pyhd8ed1ab_1    conda-forge
tensorboard-plugin-wit    1.8.1              pyhd8ed1ab_0    conda-forge
tokenizers                0.10.3           py37hcb7a40c_1    conda-forge
torchmetrics              0.3.2              pyhd8ed1ab_0    conda-forge
torchvision               0.9.1                py37_cu111    pytorch
transformers              4.5.1              pyhd8ed1ab_1    conda-forge
whwang299 commented 2 years ago

Hi @thomassajot, this is normal. embeddings.position_ids and embeddings.position_embeddings provide information about word order. spade does not use them; instead it uses the xy-coordinates of each word on the image (it is serializer-free). However, if you want to use them, you can simply set

https://github.com/clovaai/spade/blob/a85574ceaa00f1878a23754f283aa66bc2daf082/configs/cord.1.5layers.train.yaml#L18

to

- seqPos
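
To illustrate the distinction: a sequential position embedding encodes 1-D reading order, while the serializer-free approach feeds each token's 2-D box coordinates instead. A conceptual sketch (not spade's actual modules; the layer names and sizes here are assumptions):

```python
import torch
import torch.nn as nn

hidden_size = 768

# 1-D sequential position signal: the role of BERT's embeddings.position_embeddings.
seq_pos_emb = nn.Embedding(512, hidden_size)
position_ids = torch.arange(4).unsqueeze(0)            # [1, seq_len], reading order 0..3
seq_signal = seq_pos_emb(position_ids)                 # [1, seq_len, hidden_size]

# 2-D coordinate signal: project each token's normalized (x, y) box center instead,
# so no serialization into a reading order is needed.
xy_proj = nn.Linear(2, hidden_size)
xy_centers = torch.tensor([[[0.12, 0.30], [0.55, 0.30],
                            [0.12, 0.62], [0.55, 0.62]]])  # [1, seq_len, 2]
spatial_signal = xy_proj(xy_centers)                   # [1, seq_len, hidden_size]
```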

Also, the absence of layers 5 through 11 is due to the use of a smaller 5-layer encoder. You can set

https://github.com/clovaai/spade/blob/a85574ceaa00f1878a23754f283aa66bc2daf082/configs/cord.1.5layers.train.yaml#L29

to

encoder_config_name: bert-base-multilingual-cased-12layers

You may need to make a new config under data/model/backbones/bert-base-multilingual-cased-12layers, based on data/model/backbones/bert-base-multilingual-cased-5layers (simply copy it and change the number of layers in the config file). Happy coding!
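
A hypothetical way to create that 12-layer backbone config (the directory layout follows the paths above, but the config file name and the num_hidden_layers field are assumptions based on the standard BERT config format; check the actual files under data/model/backbones/ first):

```python
import json
import shutil
from pathlib import Path

src = Path("data/model/backbones/bert-base-multilingual-cased-5layers")
dst = Path("data/model/backbones/bert-base-multilingual-cased-12layers")
shutil.copytree(src, dst)                       # copy the 5-layer backbone config directory

cfg_path = dst / "config.json"                  # file name is an assumption
cfg = json.loads(cfg_path.read_text())
cfg["num_hidden_layers"] = 12                   # bump the layer count to match the full encoder
cfg_path.write_text(json.dumps(cfg, indent=2))
```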

Wonseok

thomassajot commented 2 years ago

Great. Thank you for the quick reply.

I will code happily.