I want to use my own dataset, but I don't have labels.
Hi @scarletteshu,
Thank you for your interest.
Yes. You need to write your own dataset (e.g. data/cifar.py). Please refer to the following issues: #8, #19, #34. They might be useful. Also, since you don't have labels available, you will have to remove the evaluation code.
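A minimal sketch of such a dataset, assuming a flat folder of images and the dict-style samples that data/cifar.py returns; the class name `UnlabeledImages` and the constant placeholder target are illustrative, not part of the repo:

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class UnlabeledImages(Dataset):
    """Flat folder of images with no ground-truth labels (hypothetical)."""

    def __init__(self, root, transform=None):
        self.root = root
        self.transform = transform
        self.files = sorted(os.listdir(root))
        # The surrounding code reads these attributes; with no labels,
        # constant placeholders are enough (they only matter for evaluation).
        self.targets = [0] * len(self.files)
        self.classes = ['unknown']

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        img = Image.open(os.path.join(self.root, self.files[index])).convert('RGB')
        size = img.size
        if self.transform is not None:
            img = self.transform(img)
        # Sample format modeled on data/cifar.py; adjust the keys if your
        # copy of the repo expects something different.
        return {'image': img,
                'target': self.targets[index],
                'meta': {'im_size': size, 'index': index, 'class_name': 'unknown'}}
```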
Thanks a lot! I'm new to this, so I'll ask again if I run into any more problems. Thanks again~
Dear author, my new questions are as follows.

Question one: if I set `self.targets` and `self.classes` to constant values (e.g. `self.targets = [[0], [0], ...]` and `self.classes = ['01', '02', ...]`), will it influence the training? The running code needs these values, but I don't have ground-truth labels, so I can't just remove them. Can I keep the evaluation part as it is, since evaluation should not change the model state or the final results?

Question two: when I remove the evaluation code in moco.py:
```python
# Mine the topk nearest neighbors (Validation)
# These will be used for validation.
'''
topk = 5
print(colored('Mine the nearest neighbors (Val)(Top-%d)' %(topk), 'blue'))
fill_memory_bank(val_dataloader, model, memory_bank_val)
print('Mine the neighbors')
indices, acc = memory_bank_val.mine_nearest_neighbors(topk)
print('Accuracy of top-%d nearest neighbors on val set is %.2f' %(topk, 100*acc))
np.save(p['topk_neighbors_val_path'], indices)
'''
```
then there will be no topk_neighbors_val file. But in scan.py:
```python
# Evaluate
print('Make prediction on validation set ...')
predictions = get_predictions(p, val_dataloader, model)

print('Evaluate based on SCAN loss ...')
scan_stats = scan_evaluate(predictions)
print(scan_stats)
lowest_loss_head = scan_stats['lowest_loss_head']
lowest_loss = scan_stats['lowest_loss']

if lowest_loss < best_loss:
    print('New lowest loss on validation set: %.4f -> %.4f' %(best_loss, lowest_loss))
    print('Lowest loss head is %d' %(lowest_loss_head))
    best_loss = lowest_loss
    best_loss_head = lowest_loss_head
    torch.save({'model': model.module.state_dict(), 'head': best_loss_head}, p['scan_model'])

else:
    print('No new lowest loss on validation set: %.4f -> %.4f' %(best_loss, lowest_loss))
    print('Lowest loss head is %d' %(best_loss_head))

print('Evaluate with hungarian matching algorithm ...')
clustering_stats = hungarian_evaluate(lowest_loss_head, predictions, compute_confusion_matrix=False)
print(clustering_stats)
```
there is a torch.save() call in the evaluation part. If I remove it in scan.py, will it affect how the model is saved? Also, if I don't remove the evaluation code, scan.py will raise the error "cannot find topk_neighbors_val file".

Expecting your response~ (sorry to have so many questions)
Hi @scarletteshu,
Yes, you will have to modify the code. If you don't have labels, you can't compute the accuracy. You can remove that part. The validation loss is used to select the best model. You can define your own validation set or take the final model.
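A sketch of the "take the final model" option, replacing the evaluation block from scan.py quoted above; saving every epoch and recording head 0 are arbitrary choices here, since without a validation loss there is nothing to select the best head with:

```python
# Replacement for the evaluation/model-selection block inside the
# epoch loop of scan.py: checkpoint the current weights every epoch
# and use whatever the last epoch produced as the final model.
# Head 0 is arbitrary -- without a validation loss there is no
# lowest_loss_head to record.
torch.save({'model': model.module.state_dict(), 'head': 0}, p['scan_model'])
```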
Thanks for your reply. When I trained on CIFAR-10, the losses looked like:

consistency loss 8.5809e-01 entropy 2.3005e+00

But when I trained on my own dataset, the consistency loss was always close to the entropy, and the values in predictions['probabilities'] were close to each other (such as 0.1001, 0.1012, ...). What do you think the problem is? Compared to scan_imagenet_50.yml, I only changed the transforms (to ours) and the learning rate in the config file.
Hi @scarletteshu,
Hard to say what the problem is exactly, especially since I don't know the dataset. However, lowering the weight in the loss will likely help.
If there are still issues let me know. Closing this for now.
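For anyone hitting the same near-uniform predictions: the weight being referred to is presumably the entropy term of the SCAN loss, which sits in the config file. A hedged excerpt of where it would be changed (keys and the 5.0 value are from memory of the ImageNet-style configs; check your own file):

```yaml
# Excerpt from a SCAN config (cf. configs/scan/scan_imagenet_50.yml).
# If predictions collapse toward uniform (all probabilities ~1/K),
# try a lower entropy_weight.
criterion: scan
criterion_kwargs:
   entropy_weight: 5.0   # e.g. reduce this value
```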
It's an image dataset without labels. Should I structure it like the ImageNet-style datasets, i.e. images of different classes in different folders?