Closed weizhiting closed 4 years ago
In your dataloader, are you using a random sampler, or something like MPerClassSampler?
Yes, I use MPerClassSampler. Below is my main code. After getting the embeddings, for each Xte_trans sample I use cosine similarity to find the most similar Xtr_trans sample.
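import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset
from pytorch_metric_learning import losses, miners, samplers, testers, trainers
from pytorch_metric_learning.distances import CosineSimilarity
import pytorch_metric_learning.utils.logging_presets as logging_presets

class Setting:
    """Parameters for training"""
    def __init__(self, nclass):
        self.epoch = 300
        self.lr = 1e-5 * 4
        self.doPCA = True
        self.out_sz = 100
        self.nPCA = 1000
        self.m = 4  # samples per class in each batch
        self.batch_size = self.m * 64
        self.emb_szs = [500, 200]
        self.ps = 0.25
        self.use_bn = True
        self.actn = nn.ReLU()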
class EmbeddingNet(nn.Module):
    def __init__(self, in_sz, out_sz, emb_szs, ps, use_bn=True, actn=nn.ReLU()):
        super(EmbeddingNet, self).__init__()
        self.in_sz = in_sz
        self.out_sz = out_sz
        self.n_embs = len(emb_szs) - 1
        ps = np.repeat(ps, self.n_embs)
        # input layer
        layers = [nn.Linear(self.in_sz, emb_szs[0]), actn]
        # hidden layers
        for i in range(self.n_embs):
            layers += self.bn_drop_lin(n_in=emb_szs[i], n_out=emb_szs[i+1], bn=use_bn, p=ps[i], actn=actn)
        # output layer
        layers.append(nn.Linear(emb_szs[-1], self.out_sz))
        self.fc = nn.Sequential(*layers)

    def bn_drop_lin(self, n_in:int, n_out:int, bn:bool=True, p:float=0., actn:nn.Module=None):
        layers = [nn.BatchNorm1d(n_in)] if bn else []
        if p != 0:
            layers.append(nn.Dropout(p))
        layers.append(nn.Linear(n_in, n_out))
        if actn is not None:
            layers.append(actn)
        return layers

    def forward(self, x):
        output = self.fc(x)
        return output
class BasicDataset(Dataset):
    def __init__(self, data, labels):
        self.data = torch.from_numpy(data).float()
        self.labels = labels

    def __getitem__(self, index):
        return self.data[index], self.labels[index]

    def __len__(self):
        return len(self.data)
def main():
    train_dataset = BasicDataset(Xtr, ytr)
    test_dataset = BasicDataset(Xte, yte)
    model = EmbeddingNet(in_sz=Xtr_pca.shape[1], out_sz=args.out_sz, emb_szs=args.emb_szs,
                         ps=args.ps, use_bn=args.use_bn, actn=args.actn)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    model_optimizer = torch.optim.Adam(model.parameters(), lr=args.lr, weight_decay=0.0001)
    miner = miners.MultiSimilarityMiner(epsilon=0.1)
    loss = losses.TripletMarginLoss(margin=0.1, distance=CosineSimilarity())
    loss = losses.CrossBatchMemory(loss=loss, embedding_size=args.out_sz, miner=miner)
    sampler = samplers.MPerClassSampler(ytr, m=args.m, length_before_new_iter=len(train_dataset))
    models = {"trunk": model}
    optimizers = {"trunk_optimizer": model_optimizer}
    loss_funcs = {"metric_loss": loss}
    mining_funcs = {"tuple_miner": miner}
    record_keeper, _, _ = logging_presets.get_record_keeper(logs, tensorboard)
    hooks = logging_presets.get_hook_container(record_keeper)
    dataset_dict = {"train": train_dataset, "val": test_dataset}
    tester = testers.GlobalEmbeddingSpaceTester(end_of_testing_hook=hooks.end_of_testing_hook,
                                                dataloader_num_workers=32, batch_size=args.batch_size, use_trunk_output=True,
                                                reference_set='compared_to_training_set', normalize_embeddings=False)
    end_of_epoch_hook = hooks.end_of_epoch_hook(tester, dataset_dict, model_folder, test_interval=10,
                                                patience=10)
    trainer = trainers.MetricLossOnly(models, optimizers, args.batch_size, loss_funcs, mining_funcs,
                                      train_dataset, sampler=sampler, dataloader_num_workers=32,
                                      end_of_iteration_hook=hooks.end_of_iteration_hook,
                                      end_of_epoch_hook=end_of_epoch_hook)
    trainer.train(num_epochs=args.epoch)
    Xtr_trans, _ = tester.get_all_embeddings(train_dataset, model)
    Xte_trans, _ = tester.get_all_embeddings(test_dataset, model)
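For reference, the lookup step described above (matching each test embedding to its most similar training embedding by cosine similarity) could be sketched roughly as follows. This is a dependency-free illustration, not the code used in this issue: the function names and toy vectors are made up, and in practice the same thing would be done with torch or numpy on Xtr_trans and Xte_trans.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def nearest_train_labels(test_embs, train_embs, train_labels):
    """For each test embedding, return the label of the most similar training embedding."""
    predictions = []
    for query in test_embs:
        sims = [cosine_similarity(query, ref) for ref in train_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        predictions.append(train_labels[best])
    return predictions

# Toy stand-ins for Xtr_trans / Xte_trans (hypothetical values).
train_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
train_labels = ["a", "b", "c"]
test_embs = [[0.9, 0.1], [0.2, 2.0]]
predicted = nearest_train_labels(test_embs, train_embs, train_labels)
```

Because cosine similarity ignores vector length, this lookup gives the same result whether or not the embeddings are normalized first.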
Have you tried an experiment without CrossBatchMemory and MultiSimilarityMiner? As seen in this other issue, getting CrossBatchMemory to work well requires some tuning.
Also, in my experience, ContrastiveLoss works better than TripletMarginLoss. Just make sure to change the default margins if you're going to use CosineSimilarity().
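To see why the margins matter: with a distance, positive pairs should end up below a small margin and negative pairs above a larger one, but with a similarity the direction flips. The sketch below is a simplified per-pair illustration, not the library's implementation. The function name is hypothetical, and the values pos_margin=1 and neg_margin=0 are only an example of similarity-oriented margins.

```python
def contrastive_pair_loss(similarity, is_positive, pos_margin=1.0, neg_margin=0.0):
    """Hinge-style contrastive loss for one pair, written for a similarity measure.

    Positive pairs are penalized when their similarity is below pos_margin;
    negative pairs are penalized when their similarity is above neg_margin.
    """
    if is_positive:
        return max(0.0, pos_margin - similarity)
    return max(0.0, similarity - neg_margin)

# Similarity-oriented margins: dissimilar positives and similar negatives are penalized.
losses_for_pairs = [
    contrastive_pair_loss(0.75, is_positive=True),    # positive, not similar enough
    contrastive_pair_loss(1.0, is_positive=True),     # positive, perfectly similar
    contrastive_pair_loss(0.5, is_positive=False),    # negative, too similar
    contrastive_pair_loss(-0.25, is_positive=False),  # negative, already dissimilar
]

# Distance-oriented margins (pos_margin=0, neg_margin=1) reused with a similarity:
# a positive pair with similarity 0.25 is never pulled closer.
stale_margin_loss = contrastive_pair_loss(0.25, is_positive=True, pos_margin=0.0, neg_margin=1.0)
```

With the distance-oriented margins, positive pairs with non-negative similarity contribute zero loss, so the model gets no signal to pull them together.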
Yes, I have indeed tried without CrossBatchMemory and MultiSimilarityMiner. I have also tried other losses such as ContrastiveLoss, but the performance changes only slightly.
In tester.get_all_embeddings, are you using the best performing model?
Also, since you're dealing with an unbalanced dataset, it will probably help to use a different accuracy calculator:
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator
accuracy_calculator = AccuracyCalculator(avg_of_avgs=True)
tester = testers.GlobalEmbeddingSpaceTester(accuracy_calculator=accuracy_calculator)
This will compute the average of per-class accuracies, so a class with 5 samples is just as important as a class with 50 samples. The way accuracy is computed is important because the end_of_epoch_hook uses accuracy to determine when training should end.
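The difference between the two averaging schemes can be illustrated with plain classification accuracy. This is only a sketch of the averaging idea, with hypothetical function names and toy labels; the library's calculator applies the same idea to its own retrieval metrics.

```python
from collections import defaultdict

def global_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def average_of_class_accuracies(y_true, y_pred):
    """Compute accuracy separately for each class, then average over classes."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    per_class = [hits[c] / totals[c] for c in totals]
    return sum(per_class) / len(per_class)

# Toy unbalanced example: the large class is always right, the small class never is.
y_true = ["big"] * 8 + ["small"] * 2
y_pred = ["big"] * 10

overall = global_accuracy(y_true, y_pred)                # dominated by the large class
balanced = average_of_class_accuracies(y_true, y_pred)   # each class counts equally
```

Here the global figure is 0.8 while the per-class average is 0.5, which is why a model can look fine overall while failing on the small classes.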
As you suggested, I used the best performing model and AccuracyCalculator(avg_of_avgs=True), but the performance again changes only slightly. So are there simply not enough samples per class for training in my project? How can I determine the actual reason? Thanks for your patience.
There isn't a clear rule about how many samples per class are necessary. Other factors affect the number of necessary training samples, like the similarity between the training and test sets, and the similarity between different classes. I suggest you look at some other aspects of training like:
I keep two samples per class in the test set and all the other samples of each class in the training set. Some classes are indeed very similar in my project, so maybe this is why the performance is low. (1) As I am new to this field, I do not know how to change the architecture of my model. Add more layers? Add more neurons per layer? (2) Every sample has thousands of features, and without reducing the dimension the performance is extremely poor. (3) I have tried training and testing on just the small-sample classes, and the performance is very low. When training and testing on just the large-sample classes, the performance is satisfactory. Maybe I need to do more experiments on my dataset to find out the reasons.
Yeah, if some classes are really similar, then having only a few samples will make it extra difficult. Perhaps you could apply more data augmentations during training.
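For feature-vector data like this, one simple form of augmentation is to add small random noise to each sample. The sketch below is only an illustration under that assumption: Gaussian jitter is one option among many, the function names and noise_std value are hypothetical, and whether it helps depends on the scale and meaning of the features.

```python
import random

def jitter(features, noise_std, rng):
    """Return a copy of a feature vector with Gaussian noise added to each entry."""
    return [x + rng.gauss(0.0, noise_std) for x in features]

def augment_class(samples, copies_per_sample, noise_std=0.05, seed=0):
    """Keep the original samples and append noisy copies of each one."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    augmented = [list(s) for s in samples]
    for s in samples:
        for _ in range(copies_per_sample):
            augmented.append(jitter(s, noise_std, rng))
    return augmented

# A class with only two samples grows to 2 * (1 + 3) = 8 samples.
few_samples = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
augmented = augment_class(few_samples, copies_per_sample=3)
```

The noise level should stay small relative to the spread of each feature, otherwise the noisy copies may drift into neighbouring classes that are already very similar.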
As a final suggestion, you could read up on few-shot learning. Here are a couple of repos that might help:
Good luck!
Thanks for your good advice. One more question: is it possible to implement multi-label metric learning, such as the method in https://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd10mddm.pdf, with this awesome library?
That particular method hasn't been implemented. However, you can train and test on multi-label datasets. By multi-label, I mean that instead of an element's label being a single number, it is instead a list of numbers or a numpy array of numbers. So in other words, a batch of 32 elements with 2 labels each will have labels with shape (32, 2).
To train on multi-labels you need to write a custom trainer. Here's a simple one that applies metric_loss to each label level. You can use this trainer exactly like MetricLossOnly:
from pytorch_metric_learning.trainers import MetricLossOnly

class MultiLabelTrainer(MetricLossOnly):
    def calculate_loss(self, curr_batch):
        data, labels = curr_batch
        embeddings = self.compute_embeddings(data)
        # accumulate the metric loss over every label column
        for i in range(labels.size(1)):
            curr_labels = labels[:, i]
            indices_tuple = self.maybe_mine_embeddings(embeddings, curr_labels)
            self.losses["metric_loss"] += self.maybe_get_metric_loss(embeddings, curr_labels, indices_tuple)
        # average over the number of label columns
        self.losses["metric_loss"] /= labels.size(1)
You also need to pass in label_hierarchy_level to the trainer upon initialization. There are 3 ways to use this argument:
- i. This will select the ith column in the labels array.
- "all". This will use all columns of the labels array.
For example, if you have 3 labels per element and you want to use all of them:
MultiLabelTrainer(label_hierarchy_level="all")
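To make the column selection concrete, here is a rough sketch of what the two modes described above pick out of a labels array. The helper label_columns is hypothetical and is not part of the library; it only mirrors the description (an integer selects one column, "all" uses every column).

```python
def label_columns(labels, level):
    """Return the label columns selected by `level` from a list of per-sample label rows.

    `level` is either an integer column index or the string "all".
    """
    n_cols = len(labels[0])
    wanted = range(n_cols) if level == "all" else [level]
    return [[row[c] for row in labels] for c in wanted]

# Three samples with 2 labels each, i.e. labels of shape (3, 2).
labels = [[0, 10], [0, 11], [1, 10]]

second_level = label_columns(labels, 1)    # only the column at index 1
every_level = label_columns(labels, "all")  # one list per label column
```

Each returned column plays the role of curr_labels in the trainer above, so the "all" mode corresponds to looping over every column.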
Next, to test multi-label datasets, simply use the GlobalEmbeddingSpaceTester, but set label_hierarchy_level to the desired mode.
from pytorch_metric_learning.testers import GlobalEmbeddingSpaceTester
tester = GlobalEmbeddingSpaceTester(label_hierarchy_level="all")
Note that I haven't tested the above code, and this is one of those features that hasn't been tested much, so there may be bugs!
If you're looking for more sophisticated multi-label training methods and would like them to be implemented, feel free to open a separate issue.
Hi! First of all, thanks for this awesome library. I am trying pytorch_metric_learning in my project, but the accuracy is low. In my project, there are thousands of classes, and most of the classes have no more than 5 samples. I noticed that when I keep only the classes that have more than 50 samples, the accuracy is satisfactory. So, in general, how many samples per class are needed for training? Are there any metric learning methods developed for few-shot samples? Thanks.