Closed SHIELD-SKY closed 3 years ago
Sorry for the delay in responding. The problem is that you only have one sample per class. In order to compute per-label means and covariances, the data needs to have at least two samples per class (or more, if min_labelcount > 2).
This works for me:
import torch
import numpy as np
from torch.utils.data import TensorDataset
from otdd.pytorch.distance import DatasetDistance
def dataset_from_numpy(X, Y, classes=None):
    targets = torch.LongTensor(list(Y))
    ds = TensorDataset(torch.from_numpy(X).type(torch.FloatTensor), targets)
    ds.targets = targets
    ds.classes = classes if classes is not None else [i for i in range(len(np.unique(Y)))]
    return ds
samples = 100
dim = 6
x1 = np.random.randn(samples, dim)
y1 = np.random.randint(0, 2, size=(samples))
x2 = np.random.randn(samples, dim)
y2 = np.random.randint(0, 2, size=(samples))
ds1 = dataset_from_numpy(x1,y1)
ds2 = dataset_from_numpy(x2,y2)
dist = DatasetDistance(ds1,ds2)
dist.distance()
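To see whether a label array is likely to hit this error, you could count samples per class before building the datasets. This is only a sketch using NumPy: the helper name `classes_below_min_count` and its `min_count` parameter are made up for illustration (they are not part of otdd), and the default of 2 simply mirrors the requirement described above.

```python
import numpy as np

def classes_below_min_count(Y, min_count=2):
    """Return {label: count} for every label with fewer than min_count samples.

    Hypothetical helper, not part of otdd. min_count=2 reflects the
    "at least two samples per class" requirement discussed in this thread.
    """
    labels, counts = np.unique(np.asarray(Y), return_counts=True)
    return {int(l): int(c) for l, c in zip(labels, counts) if c < min_count}

# Labels with one sample per class, as in the failing example.
too_few = classes_below_min_count(np.array([0, 1]))
print(too_few)

# Labels with many samples per class.
ok = classes_below_min_count(np.array([0, 1] * 50))
print(ok)
```

An empty dictionary means every class has at least `min_count` samples. Anything else lists the classes that need more data.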
I see a function called "dataset_from_numpy".
I want to read some data from CSV files and then calculate dataset distances.
`
import torch
import numpy as np
from torch.utils.data import TensorDataset
from otdd.pytorch.distance import DatasetDistance

def dataset_from_numpy(X, Y, classes = None):
    targets = torch.LongTensor(list(Y))
    ds = TensorDataset(torch.from_numpy(X).type(torch.FloatTensor),targets)
    ds.targets = targets
    ds.classes = classes if classes is not None else [i for i in range(len(np.unique(Y)))]
    return ds

x1 = np.array([[1,2,3,5,6,9],[4,5,6,2,4,5]])
y1 = np.array([0,1])
x2 = np.array([[2,2,3,5,3,7],[4,5,6,9,1,4]])
y2 = np.array([1,2])

ds1 = dataset_from_numpy(x1,y1)
ds2 = dataset_from_numpy(x2,y2)
dist = DatasetDistance(ds1,ds2)
dist.distance()
`
and I get the following error:
`
Traceback (most recent call last):
  File "", line 1, in
  File "/home/xxx/otdd/otdd/pytorch/distance.py", line 595, in distance
    _ = self._get_label_distances()
  File "/home/xxx/otdd/otdd/pytorch/distance.py", line 439, in _get_label_distances
    Means, Covs = self._get_label_stats()
  File "/home/xxx/otdd/otdd/pytorch/distance.py", line 385, in _get_label_stats
    **shared_args)
  File "/home/xxx/otdd/otdd/pytorch/moments.py", line 321, in compute_label_stats
    M = torch.stack([μ.to(device) for i,μ in sorted(M.items()) if μ is not None], dim=0)
RuntimeError: stack expects a non-empty TensorList
`
Could you please help me solve it?
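For the CSV part of the question, one possible approach is to load each file into a feature matrix and a label vector with NumPy. The sketch below rests on assumptions that may not match your files: one header row, comma-separated numeric columns, and integer class labels in the last column. The function name `load_features_and_labels` and its parameters are hypothetical, and the inline CSV text only stands in for a real file.

```python
import io
import numpy as np

def load_features_and_labels(csv_source, label_column=-1, skip_header=1):
    """Split a numeric CSV into features X and integer labels Y.

    Hypothetical helper: assumes comma-separated numeric data with the
    class label stored in `label_column` (last column by default).
    `csv_source` may be a file path or a file-like object.
    """
    data = np.genfromtxt(csv_source, delimiter=",", skip_header=skip_header)
    data = np.atleast_2d(data)  # keep a 2-D shape even for a single row
    Y = data[:, label_column].astype(np.int64)
    X = np.delete(data, label_column, axis=1).astype(np.float32)
    return X, Y

# Small in-memory stand-in for a CSV file, with two samples per class.
csv_text = """f1,f2,f3,label
1.0,2.0,3.0,0
4.0,5.0,6.0,1
1.5,2.5,3.5,0
4.5,5.5,6.5,1
"""
X, Y = load_features_and_labels(io.StringIO(csv_text))
print(X.shape, Y)
```

The resulting `X` and `Y` should be usable as the two arguments of the `dataset_from_numpy` function shown in this thread, as long as each class appears more than once.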