dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.56k stars 538 forks source link

[BUGFIX] Remove mutable_args restriction in get_model API #1207

Closed ma-hei closed 4 years ago

ma-hei commented 4 years ago

Description

This commit removes the mutable_args restriction. After this commit, any parameter can be overriden, as long as the user sets the allow_override parameter to True.

Checklist

Essentials

cc @dmlc/gluon-nlp-team

ma-hei commented 4 years ago

@eric-haibin-lin When creating this PR, I noticed that a fix to remove mutable_args has already been implemented for some of the BERT models. In bug description https://github.com/dmlc/gluon-nlp/issues/1129 you described that this should be fixed for GPT, Language model and transformer as well. I assume this is still requested, so I created this PR. I noticed that the commit checklist requires the change to be tested and I found existing tests for hparam_allow_override for BERT models. I'd like to confirm that this fix is still required first and would like to add the tests as well if thats the case. If anything else is missing, let me know.

chenw23 commented 4 years ago

Since v0.9.x branch is our release branch, would you please rebase your pull request on top of master branch, which is our develop branch? Thanks! Also, the CI is temporarily failing currently, which should be fixed soon. Thanks for your patience.

ma-hei commented 4 years ago

Makes sense! Rebased on master.

ma-hei commented 4 years ago

@eric-haibin-lin There was an issue in my previous fix. I fixed this now. I believe this was actually an existing bug: In _get_gpt2_model (in scripts/text_generation/model/gpt.py) we were previously passing kwargs along with predefined_args to the constructor of GPT2Model. This constructor would then complain about the presence of duplicate arguments (because an argument could exist in both predefined_args and kwargs). If I see this correctly, then it was previously not possible to overwrite the mutuable argument "dropout" (I confirmed this and it seems to be the case). I manually tested the get_model API of all other models and verified that the hparam_allow_override parameter works.

chenw23 commented 4 years ago

Hello, there is a recent CI change on the master branch. #1208 Would you please merge the latest master branch to get this update? Failure to keep up with this update will prevent CI from successful builds. Thanks!

mli commented 4 years ago

Job PR-1207/9 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1207/9/index.html

codecov[bot] commented 4 years ago

Codecov Report

Merging #1207 into master will increase coverage by 0.03%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1207      +/-   ##
==========================================
+ Coverage   87.42%   87.45%   +0.03%     
==========================================
  Files          81       81              
  Lines        7347     7365      +18     
==========================================
+ Hits         6423     6441      +18     
  Misses        924      924              
Impacted Files Coverage Δ
src/gluonnlp/model/bert.py 94.65% <100.00%> (+0.01%) :arrow_up:
src/gluonnlp/model/language_model.py 98.64% <100.00%> (+0.15%) :arrow_up:
src/gluonnlp/model/transformer.py 91.71% <100.00%> (+0.05%) :arrow_up:
mli commented 4 years ago

Job PR-1207/12 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1207/12/index.html

ma-hei commented 4 years ago

@eric-haibin-lin I added tests for all models where the new parameter hparam_allow_override was added. For each model, the test iterates over all parameters of the model and asserts that get_model throws an exception if the parameter is non mutable and hparam_allow_override is set to false. The test also asserts that get_model succeeds if hparam_allow_override is true or the parameter is mutable by default. In language_model.py I had to factor the parameters out of the model constructors as it is already done for bert models and gpt2. This allowed me to access the parameters in the test method. Let me know if you want me to add/change anything.

eric-haibin-lin commented 4 years ago

Thanks for hte contribution!