Open y-yang42 opened 2 years ago
I'm facing the same issue. Have you reached any conclusions about why this is happening?
You can try gradient clipping; it fixed the problem in my case.
exact same issue. resolved by using gradient clipping. thanks @15m43lk4155y
Exact same issue, but gradient clipping does not work for me. I have tried gradient_clip_val = 0.1, 0.5, 0.6, and 1.0.
Do you mean this gradient clipping, @15m43lk4155y?
import pytorch_lightning as pl
import torch

trainer = pl.Trainer(
    gpus=[0] if torch.cuda.is_available() else None,
    max_epochs=max_epochs,
    gradient_clip_val=0.1,  # <----- this arg?
    callbacks=[early_stop_callback, model_checkpt],
    log_every_n_steps=50,
)
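For anyone unsure what that argument does: as far as I know, gradient_clip_val rescales the gradients whenever their overall norm exceeds the given value (norm clipping being the default mode). The sketch below is plain Python, not Lightning's actual implementation, and clip_by_global_norm is a made-up helper name. It also hints at why clipping may not help in every case: once a gradient is already NaN, rescaling cannot repair it.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Hypothetical helper for illustration only; returns (clipped_grads, norm_before).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Nothing to do if the gradients are already small enough (or all zero).
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# A large but finite gradient is shrunk so that its norm equals max_norm.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=0.1)
print(norm_before, clipped)

# A gradient that already contains NaN stays NaN: the norm itself is NaN,
# so every rescaled entry becomes NaN as well.
broken, _ = clip_by_global_norm([float("nan"), 1.0], max_norm=0.1)
print(broken)
```

So clipping can tame exploding gradients, but if the NaN is produced earlier (in the forward pass or the loss), a smaller gradient_clip_val will not change anything.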
Expected behavior
I am trying to run the DeepAR model on my data set.
Actual behavior
I get a ValueError and NaN values in the RNN weights and output. A typical example is a time series that is all zero except a really large spike at a given point as shown below.
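A hypothetical series of that shape can be sketched in a few lines of Python. The snippet also shows one possible mechanism for the NaNs, which is an assumption and not something confirmed in this thread: if a window of the series is all zeros, its standard deviation is zero, and a normalizer that divides by that scale without a small epsilon ends up computing 0/0. The standardize function and the eps parameter are made up for illustration and are not the library's normalizer.

```python
import statistics

# Hypothetical series: all zeros except one very large spike.
series = [0.0] * 50
series[30] = 1e6


def standardize(window, eps=0.0):
    """Subtract the mean and divide by the (population) standard deviation.

    Illustrative only; eps is an assumed stabilizer added to the scale.
    """
    center = statistics.fmean(window)
    scale = statistics.pstdev(window) + eps
    return [(x - center) / scale for x in window]


flat_window = series[:20]     # contains only zeros, so the scale is 0
spike_window = series[20:40]  # contains the spike at index 30

try:
    standardize(flat_window)
except ZeroDivisionError:
    # Plain Python raises here; array libraries such as NumPy or PyTorch
    # return nan for 0/0 instead of raising.
    print("zero scale: 0/0 in the all-zero window")

# With a small eps the all-zero window normalizes to zeros again.
print(standardize(flat_window, eps=1e-8)[:3])

# The window with the spike has a non-zero scale and normalizes without error.
print(max(standardize(spike_window)))
```

Whether a given window is all zeros depends on how long the window is and where it starts, so under this assumption the window length would affect whether the problem shows up.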
It seems to be related to max_encoder_length or max_prediction_length, as changing these values influences whether the error occurs. Using a different target_normalizer also influences whether the error occurs.
Code to reproduce the problem
Traceback: