Open korunosk opened 4 years ago

I am aware that Geomloss supports batching; however, my distributions have different numbers of samples, and the only way I could think of to create the batches was to pad the tensors. But the results are obviously incorrect this way.

Any suggestions to overcome this problem?

Self-contained example:
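(The original snippet is not reproduced here; the sketch below is a hypothetical reconstruction consistent with the variable names `s1`, `s2`, `d`, `d_`, `s_`, and `loss` used in the answer that follows. The shapes, random data, and `SamplesLoss` parameters are assumptions.)

```python
# Hypothetical reconstruction of the setup: only the variable names are taken
# from the thread; shapes, data, and loss parameters are made up.
import torch
import torch.nn.functional as F
from geomloss import SamplesLoss

loss = SamplesLoss("sinkhorn", p=2, blur=0.05)  # assumed configuration

d  = torch.randn(100, 3)  # source point cloud
s1 = torch.randn(80, 3)   # 1st target: 80 points
s2 = torch.randn(120, 3)  # 2nd target: 120 points (different size)

# Naive batching: zero-pad the targets to a common length...
padding = max(s1.shape[0], s2.shape[0])
s_ = torch.stack([
    F.pad(s1, (0, 0, 0, padding - s1.shape[0])),  # pad rows at the end
    F.pad(s2, (0, 0, 0, padding - s2.shape[0])),
])
d_ = torch.stack([d, d])  # replicate the source for the batch

# ...but loss(d_, s_) then treats the padded zeros as real points with
# uniform mass, so it disagrees with loss(d, s1) and loss(d, s2).
```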
Hi,
If you want to pad the tensors, you need to give the weights explicitly (otherwise, as you noticed, the padded zeros are taken into account when the weights are generated). For uniform weights, you can first reuse the same function that the package uses internally when computing the loss, and then pad:
```python
import torch
import torch.nn.functional as F

def get_weights(sample):  # uniform distribution over the sample points
    if sample.dim() == 2:  # single measure of shape (N, D)
        N = sample.shape[0]
        return torch.ones(N).type_as(sample) / N
    elif sample.dim() == 3:  # batch of measures of shape (B, N, D)
        B, N, _ = sample.shape
        return torch.ones(B, N).type_as(sample) / N

beta = get_weights(s1)
gamma = get_weights(s2)

padding = max(s1.shape[0], s2.shape[0])  # no need to pad more than the largest target

alpha_ = get_weights(d_)
weights_ = torch.stack([  # padded weights: the padded points get zero mass
    F.pad(beta, (0, padding - s1.shape[0])),
    F.pad(gamma, (0, padding - s2.shape[0])),
])

padded_loss = loss(alpha_, d_, weights_, s_)  # using the tensorized backend

print("Is loss close for the 1st target?", torch.isclose(loss(d, s1), padded_loss[0], atol=1e-3).item())
print("Is loss close for the 2nd target?", torch.isclose(loss(d, s2), padded_loss[1], atol=1e-3).item())
```
But that can quickly become "heavy". I made a pull request #35 to make this more user-friendly, by accepting a list of targets, with or without a list of weights, for the batches.
Best regards,
Tanguy
Hi Tanguy,
I ended up using an approach similar to yours, and the results are as expected. Anyway, thank you for your answer!
Best, Mladen
Hi @korunosk, I am having the same problem on my side. Would you mind sharing your solution? Thanks!