awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0
4.52k stars 744 forks source link

NBeats network shares wrong parameters between blocks in sharing mode. #629

Closed KsawerySmoczynski closed 4 years ago

KsawerySmoczynski commented 4 years ago

Setting NBeats estimator to generic mode with default parameters except for: num_blocks=[4] and sharing=True Getting model to train produces such error:

AssertionError: Cannot retrieve Parameter 'nbeatstrainingnetwork8_nbeatsgenericblock58_dense5_weight' because desired attribute does not match with stored for attribute 'shape': desired '(7, 0)' vs stored '(28, 0)'.

I can provide whole error if needed but it was quite long and I think described issue clearly below.

My forcast horizon is set to 7 timesteps as you can see and backcast is set to 4 * forecast.

I've tracked the problem and it is simply because of naming convention for layers while copying them. It is produced by lines 445-449 from _network.py and following creation of NBEATSBlocks.

With the current naming convention of layers wrong parameters are trying to be shared with the last block.

With default parameters each block has 1 input layer, 3 hidden layers, theta_backcast layer, backcast layer, theta_forecast layer and forecast layer resulting in this example in 8 layers. Layers are named "dense[0:7]". But the last block has only 6 layers because there are no theta_backcast layer and backcast layer . While copying parameters gluon is looking at the name of the parameter so in the case of 5th layer from last block (theta_forecast layer) it is trying to copy parameter from 5th layer from previous block which is in this case a theta_backcast layer.

Copying parameters by name happens in _BlockScope class create method from block.py file in gluon package.

@AaronSpieler @lostella can you take a look?

AaronSpieler commented 4 years ago

@ksawiprotocol Thanks for your bug report and insight, you caught a major bug!