Closed JIElite closed 6 years ago
For Q1, I was following the weight initialization approach from ikostrikov/pytorch-a3c/model.py, which actually initializes the weights twice. I haven't compared performance between the different initialization approaches. I've refactored to a cleaner solution in https://github.com/wing3s/pysc2-rl-mini/commit/c5551fbe26e82c50cc8ef16a7cd87f3875fa91fa that avoids stacking initializations.
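For context, here is a minimal sketch of the "double initialization" pattern being discussed, using a hypothetical `nn.Linear` stand-in for the actual model.py layers. The point it demonstrates: `weight_norm` is a reparameterization, not a second random initialization, so the Xavier-initialized values are preserved.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in layer; model.py's real layers may differ.
layer = nn.Linear(16, 8)
nn.init.xavier_uniform_(layer.weight)   # first step: Xavier initialization
w_before = layer.weight.detach().clone()

layer = nn.utils.weight_norm(layer)     # second step: weight_norm wrapping

# weight_norm splits the existing weight tensor into a magnitude
# parameter (weight_g) and a direction parameter (weight_v); the
# effective weight is unchanged right after wrapping.
```

So stacking the two is redundant-looking but harmless: the Xavier values survive the `weight_norm` wrap.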
For Q2, judging from the source code, self.train()
only affects layers like Dropout and BatchNorm, as you mentioned. Theoretically we can remove it, but I haven't tested removing these lines to confirm.
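To illustrate the point about train/eval mode (a minimal, hypothetical example, not the repo's actual model): `train()` and `eval()` only change the behavior of mode-dependent layers such as Dropout; a plain `Linear` layer behaves identically in both modes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical model: one Linear layer followed by Dropout.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 4)

model.train()            # training mode: Dropout randomly zeroes activations
out_train = model(x)

model.eval()             # eval mode: Dropout becomes the identity
out_eval = model(x)

# In eval mode the output equals the bare Linear output, since Dropout
# passes its input through unchanged.
```

So if a model contains no Dropout/BatchNorm-style layers, calling `self.train()` is a no-op, which is why removing the line should theoretically be safe.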
Thanks for digging into the code. I'll circle back to these questions once I'm in the final testing phase.
Thank you for your response :)
[Question1] In model.py, lines 107-117: the following layers' weights are initialized with Xavier init, but at line 111 they are also passed through weight_norm. Why do we initialize their weights twice?
[Question2] In the same file, line 119: why do we need this line:
self.train()
? According to the official PyTorch documentation, train() only affects certain modules such as Dropout and BatchNorm. Does this mean we can remove this line?
Thank you, this repo is awesome and has a nice code structure.