rvorias / ind_knn_ad

Vanilla torch and timm industrial knn-based anomaly detection for images.
https://share.streamlit.io/rvorias/ind_knn_ad
MIT License

Defining new threshold for new dataset #12

Closed · Mshz2 closed this 3 years ago

Mshz2 commented 3 years ago

Hi there, thanks for sharing your code publicly. I am trying to follow the custom dataset workflow. My question is about defining my own threshold based on the required precision/recall, as you mentioned here. I am quite a newbie and would like to use your repo for my thesis. May I ask for a quick clue about where in the code, and how, I can set a new threshold?

After training with the SPADE method, I tested the model on an anomalous test image and got the image anomaly score `tensor(8.3064)`, but I obtained a similar value for a good test image.

By the way, as you mentioned, I defined a custom dataset folder, but do we also need to define a new dataloader? The one already used in run.py is for MVTecDataset: `train_ds, test_ds = MVTecDataset(cls).get_dataloaders()`. My dataset folder structure is like:

- custom_dataset
  - train
    - good
  - test
    - good
    - bad
  - ground_truth
    - bad

I really appreciate your help and effort.

rvorias commented 3 years ago

For getting the thresholds out, you'll want to edit evaluate so that it gives back the whole array of scores rather than derived statistics. Then you can select the recall your problem requires and find the best threshold for it.
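
As a rough sketch (not the repo's exact code; the predict call and the loop below are assumptions based on this thread's mention of image_preds = np.stack(image_preds)), the edit could look like:

```python
import numpy as np

def evaluate(self, test_dl):
    """Hypothetical variant: return the raw per-image scores and labels
    instead of only the AUROC statistic."""
    image_preds, image_labels = [], []
    for sample, label in test_dl:
        z_score, _ = self.predict(sample)  # image-level anomaly score (assumed API)
        image_preds.append(z_score.numpy())
        image_labels.append(label.numpy())
    return np.stack(image_preds), np.stack(image_labels)
```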

Since a recent commit, fit and evaluate require dataloaders. However, you can simply go from a dataset to a dataloader with DataLoader(dataset). If you have a custom dataset laid out the way you described, you can just change the cls argument; it should be able to handle it.
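
For example (a minimal sketch; the "custom_dataset" folder name stands in for whatever you pass as cls, and my_train_ds/my_test_ds are placeholder torch Datasets):

```python
from torch.utils.data import DataLoader

# Option 1: point the repo's MVTec-style loader at your folder,
# since your layout matches MVTec's train/good and test/good|bad.
train_dl, test_dl = MVTecDataset("custom_dataset").get_dataloaders()

# Option 2: wrap any torch Dataset yourself.
train_dl = DataLoader(my_train_ds)
test_dl = DataLoader(my_test_ds)
```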

Mshz2 commented 3 years ago

> For getting the thresholds out, you'll want to edit evaluate so that it gives back the whole array of scores rather than derived statistics. Then you can select the recall your problem requires and find the best threshold for it.
>
> Since a recent commit, fit and evaluate require dataloaders. However, you can simply go from a dataset to a dataloader with DataLoader(dataset). If you have a custom dataset laid out the way you described, you can just change the cls argument; it should be able to handle it.

Do you mean that, instead of returning image_rocauc, I should make the evaluate function directly return image_preds (from image_preds = np.stack(image_preds))? If so, I got a numpy array of 107 values (the size of my test set):

    [4.7874627 4.384495  8.509696  4.905506  4.4080386 4.3098564 5.8375764 5.2626934
     5.263598  5.3024836 5.1988873 5.6746025 4.7221503 4.485909  4.759518  4.0496125
     7.345115  7.472094  3.9519598 3.7376938 4.1330442 3.9398603 4.3360925 4.303591
     4.8957515 4.1697164 4.018198  4.0983796 3.688304  3.2342405 5.1472163 4.6503997
     4.656996  6.066843  3.8892963 3.6408505 5.561006  5.4095693 5.216195  5.569724
     4.770916  5.4564643 4.851271  4.322129  4.913537  4.789391  4.8168287 5.407167
     5.705127  6.0584974 6.1904693 4.6677184 4.448121  4.224089  4.9351726 4.4161773
     4.701257  7.9415574 7.4025264 6.6581464 8.2070265 8.050845  7.382006  8.023618
     8.318314  8.278881  9.036888  9.441163  9.099567  5.9304852 4.540201  4.864553
     4.3083167 4.3588    5.021438  4.2260695 4.7118635 4.3773685 4.557448  4.877237
     4.7699475 5.0105505 5.269314  4.680409  6.405204  6.0624905 6.1979065 6.203789
     6.397243  6.1428957 6.2205157 5.8993645 5.1152725 4.6504455 4.8602114 5.4992085
     4.918514  6.23922   6.390639  5.522263  5.8547907 5.9339247 5.653505  5.7642674
     5.302512  8.766179  5.911414 ]

Excuse my asking; I am a newbie to this topic and I really appreciate your help. As you can see in the array above, across all my 107 test images (105 good samples and 2 defect samples), the values are mostly between 3 and 6; five samples score around 7, seven samples score approximately 8, and three of them score above 9. Based on this (you mentioned "you can select a recall your problem requires, ..."), would my best threshold be, for example, 8.8? If I am not mistaken, these 107 values are the anomaly scores of the test samples for the SPADE method I selected with K=50. Do I need to set a threshold or recall value somewhere in the code, or do we just judge the test results and consider an image anomalous if its score is above the threshold we chose?
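
For example, I checked how many images a candidate threshold would flag like this (scores is the array above; 8.8 is just my guess):

```python
import numpy as np

scores = np.array([4.7874627, 4.384495, 8.509696])  # ... the full 107-value array above
threshold = 8.8  # candidate threshold

flagged = scores > threshold
print(f"{flagged.sum()} of {len(scores)} test images flagged as anomalous")
```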

rvorias commented 3 years ago

I'm sorry, I might have put you on the wrong track. evaluate gives back AUROC stats, i.e. the Area Under the ROC Curve. That is a single summary number, so it is not that useful for picking a threshold.

You should look at the plain ROC curve instead: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html#sklearn.metrics.roc_curve

It returns the True Positive Rate, the False Positive Rate, and the thresholds. You can then select one of the rates you like. Imagine it sits at index 5: you should then take thresholds[5] as the threshold that gives that TPR/FPR.
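
A minimal sketch of that selection (y_true and y_score are placeholders for your own test labels and the image_preds array from evaluate):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Placeholder data: 1 = anomalous test image, 0 = good
y_true = np.array([0, 0, 1, 0, 1])
# Placeholder scores: e.g. the image_preds array from evaluate
y_score = np.array([4.8, 5.1, 8.3, 4.4, 9.0])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Pick the first operating point that reaches the recall (TPR) you need
target_tpr = 0.95
idx = np.argmax(tpr >= target_tpr)
print(f"threshold={thresholds[idx]:.3f}  tpr={tpr[idx]:.2f}  fpr={fpr[idx]:.2f}")
```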

Hope it's clear now!