Previously, the random-walk noise was drawn for the full position sequence, and the noise was then set to zero for the entries corresponding to the target position and any potential pushforward positions. However, with this approach the same noise seed produced different outputs for different maximum pushforward unroll steps.
The noise std is rescaled by a factor that depends on the sequence length (file `lagrangebench/train/strats.py`, line `velocity_sequence_noise *= noise_std_last_step / (n_velocities**0.5)`). Until now, `n_velocities` in this rescaling was defined as `input_seq_length + pushforward["unrolls"][-1]`, e.g. 6 in the case of 5 past velocities and no pushforward, and larger with pushforward. This lowered the noise magnitude as soon as pushforward was used.
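To make the effect of the old definition concrete, here is a minimal sketch of the arithmetic only. The helper `per_step_noise_std`, the value of `noise_std_last_step`, and the unroll depths are hypothetical and chosen for illustration; they are not taken from the repository.

```python
import math


def per_step_noise_std(noise_std_last_step: float, n_velocities: int) -> float:
    """Std of each random-walk increment for a given n_velocities (hypothetical helper)."""
    return noise_std_last_step / math.sqrt(n_velocities)


noise_std_last_step = 3e-4  # illustrative value, not a recommended setting
input_seq_length = 6        # 6 positions, i.e. 5 past velocities

# Old definition: n_velocities = input_seq_length + max pushforward unroll.
for max_unroll in (0, 1, 3):
    n_old = input_seq_length + max_unroll
    std = per_step_noise_std(noise_std_last_step, n_old)
    print(f"max_unroll={max_unroll}: n_velocities={n_old}, per-step std={std:.3e}")
```

The per-step std shrinks as the maximum unroll grows, which is the lower noise magnitude described above.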
Now, in both cases (with or without pushforward), the noise is drawn only for the 5 past velocities, and the rescaling uses `n_velocities = 5` (the actual number of past velocities, not of past positions).
-> This change to the rescaling factor will probably affect the baseline results from the NeurIPS paper, but it is the better way to move forward.
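A rough, standard-library-only sketch of the new behaviour, assuming the velocity noise is accumulated as a cumulative sum (a random walk) over the past velocities. The function `random_walk_velocity_noise` is hypothetical, handles a single scalar component for brevity, and is not the code in `strats.py`, which operates on JAX arrays.

```python
import math
import random


def random_walk_velocity_noise(seed: int, n_velocities: int,
                               noise_std_last_step: float) -> list[float]:
    """Random-walk noise over the past velocities only (hypothetical sketch).

    The pushforward depth is deliberately not an argument, so the same seed
    gives the same noise regardless of the pushforward configuration.
    """
    rng = random.Random(seed)
    # Scale increments so the accumulated noise at the last step has
    # std equal to noise_std_last_step.
    step_std = noise_std_last_step / math.sqrt(n_velocities)
    noise, total = [], 0.0
    for _ in range(n_velocities):
        total += rng.gauss(0.0, step_std)
        noise.append(total)
    return noise


# 5 past velocities, with or without pushforward.
noise = random_walk_velocity_noise(seed=0, n_velocities=5, noise_std_last_step=3e-4)
print(noise)
```

Because the increments are independent with std `noise_std_last_step / sqrt(n_velocities)`, their sum over `n_velocities` steps has std `noise_std_last_step`, which is why the count of velocities actually perturbed is the appropriate value for the rescaling.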