In `_AttentionGeneration`, the two tensors `attention` and `reduce_x` are multiplied with `torch.bmm`, and this operation fails because the matrix sizes do not match. Furthermore, the code indicates an output size of `(batch_size, reduced_channels, h, w)`, which cannot be produced by the operation given the two tensors' sizes.
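For reference, `torch.bmm` requires both inputs to be 3-D with matching batch and inner dimensions: `(b, n, m) @ (b, m, p) -> (b, n, p)`. A minimal sketch of the mismatch, using hypothetical shapes since the actual tensor sizes are not shown in the module:

```python
import torch

# Hypothetical sizes: batch, reduced channels, and h*w (assuming h = w = 7).
b, c_r, hw = 2, 64, 49

attention = torch.randn(b, hw, hw)   # e.g. a (h*w) x (h*w) attention map
reduce_x = torch.randn(b, c_r, hw)   # channel-reduced features, flattened

# torch.bmm needs inner dims to match: (b, n, m) @ (b, m, p).
# (b, hw, hw) @ (b, c_r, hw) fails because hw != c_r.
try:
    torch.bmm(attention, reduce_x)
except RuntimeError as e:
    print("bmm shape mismatch:", e)

# Multiplying in the other order does satisfy bmm's rule and yields
# (b, c_r, hw), which can then be reshaped to the documented
# (batch_size, reduced_channels, h, w):
out = torch.bmm(reduce_x, attention)  # (b, c_r, hw)
out = out.view(b, c_r, 7, 7)          # assuming h = w = 7
print(out.shape)                      # torch.Size([2, 64, 7, 7])
```

This is only an illustration of `torch.bmm`'s shape rule, not a claim about which operand order the module intended.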