Open MaciejSkrabski opened 2 years ago
I noticed the same issue.
Would a simple fix be to remove the following lines from the parse_delta() function (located in input_process.py)?
""" if dir_ == 'backward': masks = masks[::-1] """
The following line would then stay as it is, apart from dropping the dir_='backward' argument: rec['backward'] = parse_rec(values[::-1], masks[::-1], evals[::-1], eval_masks[::-1])
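To make the proposal concrete, here is a minimal sketch of what parse_delta() could look like without the extra reversal. The recurrence and the array shapes are my assumptions, loosely modeled on the code discussed in this thread rather than copied from input_process.py; the five-step, single-feature mask is purely illustrative.

```python
import numpy as np

def parse_delta(masks):
    """Time gaps since the last observation, computed in the order given.

    Assumption: masks has shape (time_steps, features) with 1 = observed.
    The caller is responsible for reversing the masks for the backward pass.
    """
    masks = np.asarray(masks, dtype=float)
    deltas = [np.ones(masks.shape[1])]
    for h in range(1, len(masks)):
        # The gap resets to 1 at an observed step and keeps growing otherwise.
        deltas.append(1.0 + (1.0 - masks[h]) * deltas[-1])
    return np.array(deltas)

# Hypothetical single-feature mask: observed, missing, missing, observed, observed.
masks = np.array([[1], [0], [0], [1], [1]])

forward_deltas = parse_delta(masks)         # masks in original order
backward_deltas = parse_delta(masks[::-1])  # masks reversed exactly once

print(forward_deltas.ravel())
print(backward_deltas.ravel())
```

With the masks reversed only once, the backward deltas follow the reversed sequence and are no longer a copy of the forward ones.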
I ran into the same problem, and the current behaviour does seem to be wrong.
The deltas are identical in both directions
I do not know if this is intended; I do not see it justified in the paper. It means that the backward deltas do not correspond to the backward masks. I noticed that the masks are reversed twice during backward input processing: once when masks[::-1] is passed to parse_rec, and again inside parse_delta() when dir_ == 'backward'.
As a result, in any bidirectional computation the backward deltas do not align with the reversed data, so the temporal decay is wrong, and so are the gammas and everything derived from them. May I suggest adding tests that use single-dimensional data?
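A test along these lines might look like the sketch below. It does not import the repository code: parse_delta_reported is a hypothetical stand-in that mirrors the double reversal described in this issue, and deltas_in_order assumes the usual "gap since last observation" recurrence, so both would need to be swapped for the real functions.

```python
import numpy as np

def deltas_in_order(masks):
    # Assumed recurrence: the gap resets to 1 after an observed step.
    masks = np.asarray(masks, dtype=float)
    deltas = [np.ones(masks.shape[1])]
    for h in range(1, len(masks)):
        deltas.append(1.0 + (1.0 - masks[h]) * deltas[-1])
    return np.array(deltas)

def parse_delta_reported(masks, dir_):
    # Stand-in for the reported behaviour: reverse again for 'backward'.
    if dir_ == 'backward':
        masks = masks[::-1]
    return deltas_in_order(masks)

# One feature, five time steps: observed, missing, missing, observed, observed.
MASKS = np.array([[1], [0], [0], [1], [1]])

def test_double_reversal_gives_identical_deltas():
    forward = parse_delta_reported(MASKS, 'forward')
    # The caller already reverses the masks, so they end up in forward order.
    backward = parse_delta_reported(MASKS[::-1], 'backward')
    assert np.array_equal(forward, backward)

def test_single_reversal_gives_direction_specific_deltas():
    forward = deltas_in_order(MASKS)
    backward = deltas_in_order(MASKS[::-1])
    assert forward.ravel().tolist() == [1, 2, 3, 1, 1]
    assert backward.ravel().tolist() == [1, 1, 2, 3, 1]

test_double_reversal_gives_identical_deltas()
test_single_reversal_gives_direction_specific_deltas()
```

With a single feature the expected deltas can be checked by hand, which is what makes one-dimensional data convenient for this kind of regression test.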