Closed dlnlpchenliyu closed 3 years ago
Hi,
My config.json was built from the original one in BERT, and those parameters were not there either, so I don't think it has any effect. The original one is shown in the picture below.
Wei
Thank you for your reply. Your model architecture is different from the original BERT, so I think config.chunk_size_feed_forward and config.layer_norm_eps matter in your code. If you don't define them in your config.json, these variables will be null, and the fuse_layernorm constructed at Line 119 of wcbert_modeling.py will be affected. Looking forward to your further reply.
Hi,
Thanks for your interesting question. An undefined variable will not be null; it takes a default value, which you can see in the source code of the "Configuration" class. Take "config.chunk_size_feed_forward" as an example: it will be set to 0.
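To illustrate the fallback behaviour, here is a minimal sketch (a hypothetical stand-in, not the real transformers source) of how a config class keeps constructor defaults for keys that are absent from config.json:

```python
import json

class SketchConfig:
    """Hypothetical stand-in for the transformers config class:
    attributes absent from the JSON keep their constructor defaults,
    so they are never null."""
    def __init__(self, chunk_size_feed_forward=0, layer_norm_eps=1e-12, **kwargs):
        self.chunk_size_feed_forward = chunk_size_feed_forward
        self.layer_norm_eps = layer_norm_eps
        # any extra keys from the JSON become attributes as well
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))

# A config.json without those keys still yields usable defaults, not None.
cfg = SketchConfig.from_json('{"hidden_size": 768}')
print(cfg.chunk_size_feed_forward)  # 0
print(cfg.layer_norm_eps)           # 1e-12
```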
"config.chunk_size_feed_forward" is set to save training memory and will not affect the results. You can refer to the documentation (https://huggingface.co/transformers/main_classes/configuration.html). The picture below shows the explanation:
You can also read the source code of the function "apply_chunking_to_forward". Below is the difference between using a chunk_size or not:
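In case the comparison picture does not render, here is a toy sketch of the chunking idea (not the real transformers implementation): the sequence is split into chunks, the feed-forward runs on each chunk, and the pieces are concatenated. With chunk_size == 0 the whole input is processed at once; the output is identical either way, only peak memory differs.

```python
def feed_forward(xs):
    """Toy per-token feed-forward: applies the same function to each token."""
    return [2 * x + 1 for x in xs]

def apply_chunking(forward_fn, chunk_size, xs):
    """Sketch of chunked application along the sequence dimension."""
    if chunk_size == 0:          # no chunking: one full pass
        return forward_fn(xs)
    out = []
    for start in range(0, len(xs), chunk_size):
        out.extend(forward_fn(xs[start:start + chunk_size]))
    return out

tokens = [0.5, 1.0, 1.5, 2.0, 2.5]
# Chunked and unchunked results match; only memory usage differs.
assert apply_chunking(feed_forward, 0, tokens) == apply_chunking(feed_forward, 2, tokens)
```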
Hope it helps. Wei
wcbert_modeling.py Line 99: config.chunk_size_feed_forward is not defined in data/berts/bert/config.json. Line 119: config.layer_norm_eps is not defined in data/berts/bert/config.json.
Looking forward to your reply.