MDRNN training and GMM losses decrease abruptly to very low values, even with gradient clipping.
Was this observed in the originally tested repo, or is it a result of recent PyTorch versions?
The issue persists with a higher-precision PyTorch configuration as well.
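
For reference, here is a minimal sketch of one possible explanation, not something confirmed for this repo. A GMM negative log-likelihood is unbounded below: if one component's mean sits on the target and its sigma shrinks, the loss can fall sharply and go negative. Gradient clipping limits the size of each update, not the value of the loss. The function `gmm_nll` and all numbers below are hypothetical and written in plain Python, not taken from the repo's loss code.

```python
import math

def gmm_nll(y, weights, means, sigmas):
    """Negative log-likelihood of a scalar y under a 1-D Gaussian mixture."""
    # Per-component log(weight * Normal(y | mean, sigma)).
    log_terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi) - math.log(s)
        - 0.5 * ((y - m) / s) ** 2
        for w, m, s in zip(weights, means, sigmas)
    ]
    # Log-sum-exp over components, shifted by the max for numerical stability.
    top = max(log_terms)
    return -(top + math.log(sum(math.exp(t - top) for t in log_terms)))

# Illustrative values only: the first component's mean equals the target.
y = 0.5
weights = [0.5, 0.5]
means = [0.5, -1.0]

# Shrink the first component's sigma and watch the loss fall.
nll_by_sigma = {
    s: gmm_nll(y, weights, means, [s, 1.0]) for s in (1.0, 0.1, 0.01, 0.001)
}
for s, nll in nll_by_sigma.items():
    print(f"sigma={s:g}  nll={nll:.3f}")
```

If the logged sigmas in the real run collapse at the same step where the loss drops, that would point to this behaviour rather than to a PyTorch version or precision change.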