cpuheater / pytorch_examples

Some example scripts in pytorch

Various fixes (important: making the RNN non-trivial) #3

Closed johanbluecreek closed 6 years ago

johanbluecreek commented 6 years ago

Hi,

I saw your post on HN, and I've been meaning to take some time to learn NNs for a while now. I found your post really useful, but I ran into some issues (one of which is quite serious). Thanks for the post, and I hope you find these fixes useful.

/ Johan

The first issue was that x was defined from data, not data_time_steps. This means that during training you are training the RNN to: "Given sine-function values, return sine-function values". All the RNN has to do is become an identity operation. So this corrects that error.
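A minimal sketch of the change described here, assuming the variable names from the script (data_time_steps, data, x, y) and an illustrative sequence length; x_original/x_proposed are hypothetical names for the two definitions being contrasted:

```python
import numpy as np

seq_length = 20
data_time_steps = np.linspace(2, 10, seq_length + 1)
data = np.sin(data_time_steps)

# Original script: the input is drawn from the sine values themselves
x_original = data[:-1]
# Change described in this PR: the input is drawn from the time steps instead
x_proposed = data_time_steps[:-1]
y = data[1:]
```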

This however makes the convergence awful. Keeping the original lr makes the loss oscillate around a value close to 2.0, so it stops approaching an optimised value. I therefore lowered the learning rate by a factor of ten. It now converges quite well, but slowly, so I also increased the epochs by a factor of ten. Now it converges on reasonable values. (And, to not print too much, I changed the printing to occur ten times less frequently as well.)
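The oscillation-vs-convergence trade-off above is generic to gradient descent, not specific to this script. A toy sketch on f(x) = x**2 (all values here are illustrative, not taken from the tutorial): a too-large step size makes the iterate overshoot and oscillate, while a tenfold-smaller step needs more iterations but converges.

```python
def gd(lr, steps, x0=5.0):
    """Plain gradient descent on f(x) = x**2, whose gradient is 2*x."""
    x = x0
    for _ in range(steps):
        x = x - lr * 2 * x
    return x

# Too-large step: each update multiplies x by (1 - 2*lr) = -1.4, so it diverges
far = gd(lr=1.2, steps=50)
# Tenfold-smaller step: slower per step, but it shrinks toward the minimum at 0
near = gd(lr=0.12, steps=500)
```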

Before, x held the sine-function values, so the plot worked. Since x is now only supposed to take the data_time_steps, I also had to change what the "Actual" plot is (it should be plotting y, which is what is trained against). The truncation of the sequence was also off by one point in the plot, so I fixed that as well.

Lastly, forward() was defined in a confusing way: it still worked, but input was used as a global variable inside the function. The function should use the input argument that is passed to it, not the global x. (This was done correctly for the plotting, but not in the training loop.)
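A sketch of a forward() that uses its argument instead of a global, assuming the tutorial's two-weight vanilla RNN (concatenate input with the hidden state, tanh, then a linear readout); the hidden size and initialisation here are illustrative:

```python
import torch

hidden_size = 6
# Illustrative weights: input+hidden -> hidden, and hidden -> output
W1 = torch.randn(1 + hidden_size, hidden_size) * 0.1
W2 = torch.randn(hidden_size, 1) * 0.1

def forward(input, context_state, W1, W2):
    # Use the `input` argument that was passed in, not a global tensor
    xh = torch.cat((input, context_state), dim=1)
    context_state = torch.tanh(xh @ W1)
    out = context_state @ W2
    return out, context_state

context_state = torch.zeros(1, hidden_size)
out, context_state = forward(torch.tensor([[0.5]]), context_state, W1, W2)
```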

cpuheater commented 6 years ago

Hi,

This is a time series prediction problem: given a data point x at time step t, you want to predict the data point x at time t+1. This is how a time series prediction problem is formulated in the simplest case. It is not an identity function, because you do not get the same value back but the value at time step t+1. What you are proposing is to learn a mapping x => y, which is just a different problem.
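The one-step-ahead formulation described here amounts to shifting the same series by one position: input and target overlap everywhere except at the ends. A sketch, assuming the script's variable names (data_time_steps, data, x, y) and an illustrative sequence length:

```python
import numpy as np

seq_length = 20
data_time_steps = np.linspace(2, 10, seq_length + 1)
data = np.sin(data_time_steps)

# One-step-ahead prediction: the input is x at time t,
# the target is the same series at time t+1 -- not an identity mapping
x = data[:-1]
y = data[1:]
```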

johanbluecreek commented 6 years ago

Ah, I see. I misinterpreted the exercise. Thanks for clarifying. Closing this PR.