Closed hadaev8 closed 5 years ago
The "adaptive" loss (adaptive.py) assumes that the "channels" dimension of its input is the same every time it's called, as it has a variable for each channel. If you have a variable number of channels, then you should use the "general" form of the loss (general.py), which requires that you manually set hyperparameters for each channel, or use a single set of manually-tuned hyperparameters for all channels. This advice assumes that your 1172 dimension is the "channels" dimension. If 80 is the channels dimension, then you should just permute your tensors to make it the last dimension and then reshape into a matrix; the loss will then tolerate the number of rows of that matrix varying across calls, since the innermost (channels) dimension stays fixed at 80.
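To illustrate the permute-and-reshape step, here is a minimal sketch (the helper name `to_matrix` is hypothetical, not part of the library). It assumes the loss expects a matrix whose last dimension is the fixed channel count, here 80:

```python
import torch

def to_matrix(x):
    # Hypothetical helper, not part of the library.
    # x: [batch, channels, time], with channels fixed (e.g. 80) and
    # time varying from batch to batch. Move channels to the last
    # dimension, then flatten all other dimensions into rows.
    b, c, t = x.shape
    return x.permute(0, 2, 1).reshape(b * t, c)

# Two batches whose time dimension differs:
a = to_matrix(torch.zeros(32, 80, 1172))  # shape [32 * 1172, 80]
b = to_matrix(torch.zeros(32, 80, 1455))  # shape [32 * 1455, 80]
```

Both matrices end in the same innermost dimension (80), so only the row count varies between calls, which is the shape variation the loss tolerates.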
Thanks
I have variable-length inputs with zero padding in every batch. For example, I need to compute MSE between two [32, 80, 1172] tensors in one batch and [32, 80, 1455] tensors in another. Did I get it right that this loss needs the same shape throughout training?