liuwei1206 / LEBERT

Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Some undefined variables #11

Closed dlnlpchenliyu closed 3 years ago

dlnlpchenliyu commented 3 years ago

In wcbert_modeling.py:

- Line 99: config.chunk_size_feed_forward is not defined in data/berts/bert/config.json
- Line 119: config.layer_norm_eps is not defined in data/berts/bert/config.json

Looking forward to your reply.

liuwei1206 commented 3 years ago

Hi,

My config.json was built from the original one from BERT, which also does not contain those parameters, so I don't think this has any effect. The original config is shown in the picture below.

[screenshot: the original BERT config.json]

Wei

dlnlpchenliyu commented 3 years ago

> Hi,
>
> My config.json was built from the original one from BERT, which also does not contain those parameters, so I don't think this has any effect. The original config is shown in the picture below.
>
> [screenshot: the original BERT config.json]
>
> Wei

Thank you for your reply. Your model architecture is different from the original BERT, so I think config.chunk_size_feed_forward and config.layer_norm_eps do matter in your code. If you don't define them in your config.json, these variables will be null, and the fuse_layernorm constructed at Line 119 of wcbert_modeling.py will be affected. A sketch of the failure mode I am worried about is shown below. Looking forward to your further reply.
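For illustration, a minimal sketch of that failure mode, assuming (hypothetically) that the fuse LayerNorm is built directly from config.layer_norm_eps; the actual code in wcbert_modeling.py may differ:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a config loaded from a config.json that
# omits layer_norm_eps; not the repo's actual code.
class DummyConfig:
    hidden_size = 768  # layer_norm_eps deliberately absent

config = DummyConfig()
eps = getattr(config, "layer_norm_eps", None)

if eps is None:
    # nn.LayerNorm(..., eps=None) constructs fine but fails inside
    # F.layer_norm at forward time, which is the concern raised above.
    print("layer_norm_eps missing: LayerNorm cannot be used safely")
else:
    fuse_layernorm = nn.LayerNorm(config.hidden_size, eps=eps)
    out = fuse_layernorm(torch.randn(2, 5, config.hidden_size))
```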

liuwei1206 commented 3 years ago

Hi,

Thanks for your interesting question. An undefined variable will not be null; it falls back to a default value, which you can see in the source code of the configuration class (PretrainedConfig in transformers). Taking config.chunk_size_feed_forward as an example, it defaults to 0.
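A quick way to check this (a sketch against the HuggingFace transformers library; the defaults shown are from recent versions and may vary):

```python
from transformers import BertConfig

# Build a config without mentioning either key; the library fills in defaults.
config = BertConfig()

print(config.chunk_size_feed_forward)  # 0      (default from the base PretrainedConfig)
print(config.layer_norm_eps)           # 1e-12  (default from BertConfig)

# The same holds when loading a config.json that omits these keys:
# config = BertConfig.from_json_file("data/berts/bert/config.json")
```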

The "config.chunk_size_feed_forward" is set to save training memory, which will not affect the results. You can refer to the document(https://huggingface.co/transformers/main_classes/configuration.html). And the below picture is the explanation: image

You can also read the source code of the function "apply_chunking_to_forward". The picture below shows the difference between using a chunk size or not:

[screenshot: apply_chunking_to_forward with and without chunking]
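For reference, a minimal sketch of what the function does. This uses the transformers v4 argument order (forward_fn first; in v3 it was apply_chunking_to_forward(chunk_size, chunk_dim, forward_fn, ...)), and depending on your version the function lives in transformers.modeling_utils or transformers.pytorch_utils:

```python
import torch
import torch.nn as nn
from transformers.modeling_utils import apply_chunking_to_forward

intermediate = nn.Linear(768, 3072)
output = nn.Linear(3072, 768)

def feed_forward_chunk(hidden_states):
    return output(torch.relu(intermediate(hidden_states)))

hidden_states = torch.randn(2, 128, 768)  # (batch, seq_len, hidden)

# chunk_size = 0: the whole sequence goes through the feed-forward at once.
full = apply_chunking_to_forward(feed_forward_chunk, 0, 1, hidden_states)

# chunk_size = 32: the sequence dimension (dim 1) is split into chunks of
# 32 tokens, processed one by one, and concatenated. Lower peak memory,
# identical result.
chunked = apply_chunking_to_forward(feed_forward_chunk, 32, 1, hidden_states)

print(torch.allclose(full, chunked, atol=1e-6))  # True
```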

Hope it helps.

Wei