Closed JIElite closed 6 years ago
For Q1, I was following the weight initialization approach from ikostrikov/pytorch-a3c/model.py, which actually initializes the weights twice. I haven't compared performance between the different initialization approaches. I've refactored to a cleaner solution in https://github.com/wing3s/pysc2-rl-mini/commit/c5551fbe26e82c50cc8ef16a7cd87f3875fa91fa that avoids stacking initializations.
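For context, here is a minimal sketch of the "double initialization" pattern being discussed, using a hypothetical `nn.Linear` stand-in for the actual model.py layers. The point it demonstrates: `weight_norm` is a reparameterization, not a second random initialization, so the Xavier-initialized values are preserved.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in layer; model.py's real layers may differ.
layer = nn.Linear(16, 8)
nn.init.xavier_uniform_(layer.weight)   # first step: Xavier initialization
w_before = layer.weight.detach().clone()

layer = nn.utils.weight_norm(layer)     # second step: weight_norm wrapping

# weight_norm splits the existing weight tensor into a magnitude
# parameter (weight_g) and a direction parameter (weight_v); the
# effective weight is unchanged right after wrapping.
```

So stacking the two is redundant-looking but harmless: the Xavier values survive the `weight_norm` wrap.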
For Q2, judging from the source code, self.train()
only affects layers like Dropout and BatchNorm, as you mentioned. Theoretically we can remove it, but I haven't tested removing these lines to confirm.
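To illustrate the point about train/eval mode (a minimal, hypothetical example, not the repo's actual model): `train()` and `eval()` only change the behavior of mode-dependent layers such as Dropout; a plain `Linear` layer behaves identically in both modes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical model: one Linear layer followed by Dropout.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 4)

model.train()            # training mode: Dropout randomly zeroes activations
out_train = model(x)

model.eval()             # eval mode: Dropout becomes the identity
out_eval = model(x)

# In eval mode the output equals the bare Linear output, since Dropout
# passes its input through unchanged.
```

So if a model contains no Dropout/BatchNorm-style layers, calling `self.train()` is a no-op, which is why removing the line should theoretically be safe.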
Thanks for digging into the code. I'll circle back to these questions once I'm in the final testing phase.
Thank you for your response :)
[Question1] In model.py, lines 107-117: the following layers' weights are initialized with Xavier init, but at line 111 they are also passed through weight_norm. Why do we initialize their weights twice?
[Question2] In the same file, line 119: why do we need this line:
self.train()
? According to the official PyTorch documentation, train() only affects certain modules such as Dropout and BatchNorm. Does this mean we can remove this line?
Thank you, this repo is awesome and has a nice code structure.