Open eahogue opened 3 years ago
I ran into a similar problem when using `immediate_compute` and `check_validity`. It seems like I can fix it by simply removing the `restr` argument when I call `log_softmax()`. The documentation for `log_softmax()` says that "All elements not included in restriction are set to negative infinity." I suspect that when you have `immediate_compute` and/or `check_validity` on, the validity check is catching the `-inf` values that were put there deliberately by the `restr` argument and flagging them as problems. If this is actually what's going on, then I think this is a bug.
Like you, even with the `check_validity` mode off, I am still getting NaN errors later on -- I suspect those are coming from another source, which I have yet to pin down.
Immediately when I start training a model I get "NaN or Inf detected" when this line runs:

```python
logloss = log_softmax(f_i, valid_frames)
```

Note this is with `immediate_compute` and `check_validity` turned on. If they aren't, the error seems to happen a little later in the process.
In the most recent run, the values being passed to `log_softmax` are:

```
f_i = expression 1630/2
valid_frames = [204, 28]
```
Can someone help me understand why this input returns inf or NaN? I've looked through the existing issues, and the cause seems to be different each time.
Here is an example of what `log_softmax` returns (not from the same run as above):

```
logloss = expression 3429/2
```
Thanks!