1) In the README example, each feature has a size of 78, whereas the model expects a feature size of 102, since the default value of "aux_dim" is 24. I changed this number, and the loss is now computed against a new, randomly generated target variable of size 78. This addresses issue #83.
2) Many standard deviations in "consts.py" look suspiciously small (~10^-6). As a consequence, the uncommented NormalizedMSELoss produces NaNs or very large values (~10^22). In the original paper, the loss is normalized by dividing the outputs and targets by the standard deviation, whereas our (currently commented-out) code divides by the variance. I added a normalization flag to the loss and fixed its normalization accordingly. The model can now train with the normalized loss, although the loss remains quite large (~10^11). This addresses issue #29.
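A minimal sketch of the corrected README shapes from point 1 (random lists stand in for real tensors, and the model itself is omitted; the 78 + 24 split is taken from the description above):

```python
import random

feat_dim = 78                       # per-sample feature size in the README example
aux_dim = 24                        # model's default value for "aux_dim"
input_dim = feat_dim + aux_dim      # 102: the size the model actually expects

features = [random.gauss(0, 1) for _ in range(input_dim)]
target = [random.gauss(0, 1) for _ in range(feat_dim)]  # random target of size 78
```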
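To illustrate the normalization change in point 2, here is a hedged sketch (the function name, signature, and flag are assumptions for illustration, not the repository's actual NormalizedMSELoss API):

```python
def normalized_mse(preds, targets, stds, normalize=True):
    """MSE where outputs and targets are divided by the per-feature standard
    deviation (as in the paper), rather than by the variance (std ** 2),
    which blows up when std is on the order of 1e-6."""
    total = 0.0
    for p, t, s in zip(preds, targets, stds):
        if normalize:
            p, t = p / s, t / s
        total += (p - t) ** 2
    return total / len(preds)
```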
How Has This Been Tested?
I added a new test for the normalized loss.
[x] Yes
If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?
Checklist: