Hi, thanks for this fantastic work. In the paper, you say: "The methods we describe here extend easily to the M ≠ N case because DGCNN, Transformer, and Softmax treat inputs as unordered sets. None requires X and Y to have the same length or a bijective matching."
Is there an implementation of the case where M differs from N? Thanks.
Usually, in the N × M case, you pad both sets to a common length K (with K ≥ max(N, M)) and then use the mask argument of the attention module to cancel the effect of the padded entries.
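A minimal NumPy sketch of that idea (names and shapes here are illustrative, not from the repo): two point sets of different sizes are zero-padded to a common length K, and a boolean key mask drives the padded columns' attention logits to -inf so they receive zero weight after the softmax.

```python
import numpy as np

def masked_softmax(scores, key_mask):
    # key_mask: (K,) boolean, True for real (non-padded) keys.
    # Padded keys get -inf logits, hence exactly zero attention weight.
    scores = np.where(key_mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
K, d = 4, 2                        # common padded length, feature dim
X = np.zeros((K, d)); X[:3] = rng.standard_normal((3, d))  # N = 3 real rows
Y = np.zeros((K, d)); Y[:2] = rng.standard_normal((2, d))  # M = 2 real rows
y_mask = np.array([True, True, False, False])              # real rows of Y

scores = X @ Y.T / np.sqrt(d)          # (K, K) cross-attention logits
attn = masked_softmax(scores, y_mask)  # padded Y columns get zero weight
```

Each row of `attn` sums to 1 and places no mass on the padded entries of Y, so downstream aggregation behaves as if the sets had their true sizes N and M.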