valeoai / ConfidNet

Addressing Failure Prediction by Learning Model Confidence

Code bug? Why do I meet this when I test ConfidNet? #16

Closed Lyttonkeepfoing closed 7 months ago

Lyttonkeepfoing commented 1 year ago

When I set `--mode confidnet` I meet this error (I'm sure no changes were made to the code except config.yaml):

```
File "/ConfidNet/confidnet/learners/default_learner.py", line 179, in evaluate
    metrics.update(pred, target, confidence)
UnboundLocalError: local variable 'pred' referenced before assignment
```

But when I set `--mode trust-score`, `--mode mc-dropout`, or `--mode mcp`, everything is normal and I can get the result.

At the same time, when I set `--mode tcp`, the result is weird: the E-AURC is 0 and the AURC is much lower than for the other methods. Any explanation for this?

Hoping to get your response soon. It's good work!

chcorbi commented 1 year ago

Hi @Lyttonkeepfoing,

If you are looking to train ConfidNet, you should be using selfconfid_learner.py, as shown in the example config file: https://github.com/valeoai/ConfidNet/blob/master/confidnet/confs/selfconfid-classif.yaml

default_learner.py can be used for computing the MCP baseline or for comparing against the 'golden' topline TCP. Since the true class probability (TCP) requires the ground-truth label, it is not available at test time; it should be considered only as a golden topline, which has nearly perfect results. This also explains your observation that it achieves a much lower AURC.
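For reference, the difference between the two criteria can be sketched as follows (a minimal NumPy illustration, not the repository's code; the arrays are made-up examples):

```python
import numpy as np

# Illustrative softmax outputs for 3 samples over 4 classes
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.45, 0.10, 0.05],
    [0.20, 0.20, 0.30, 0.30],
])
labels = np.array([0, 0, 2])  # ground-truth classes

# MCP (Maximum Class Probability): max softmax score.
# Computable at test time, so it is a usable baseline.
mcp = probs.max(axis=1)

# TCP (True Class Probability): softmax score of the *true* class.
# It needs the label, so it is only a "golden" topline, not a
# deployable confidence estimate.
tcp = probs[np.arange(len(labels)), labels]

print(mcp)  # [0.7  0.45 0.3 ]
print(tcp)  # [0.7  0.4  0.3 ]
```

On the second sample MCP is high (0.45 for the wrong class) while TCP is lower (0.40), which is exactly the kind of failure TCP separates perfectly and MCP does not.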

Best, Charles

Lyttonkeepfoing commented 1 year ago

Thanks for responding! But I still have a question: why not upload selfconfid-cifar10-classif.yaml and selfconfid-cifar100-classif.yaml files directly? And what are the parameter settings for CIFAR-100 and the other datasets? The optimizer and learning rate are different from CIFAR-10, and I can't get an accurate result if the parameters are different. I think it's nice work and we want to make it a baseline for us. Your reply is really important to us!

chcorbi commented 1 year ago

The training of ConfidNet is similar regardless of the considered dataset: 500 epochs with the Adam optimizer and a 10e-4 learning rate, with dropout and the same data augmentation used in classification training. The best model can be selected based on the AUPR-Error on the validation set.

Hence, in selfconfid-classif.yaml you should only adapt the data block and the augmentations entry. This is why there is only one file, whereas there are different config files for classification. Note also that a lot of the improvement with ConfidNet comes from the second phase: fine-tuning the whole network.
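To make the adaptation concrete, switching to CIFAR-100 would look something like the excerpt below. This is a hypothetical sketch: the key names and values are illustrative, so check the actual selfconfid-classif.yaml and the CIFAR-100 classification config in the repository for the exact structure.

```yaml
# Hypothetical excerpt -- key names are illustrative, not the real file layout.
data:
  dataset: cifar100        # was: cifar10
  num_classes: 100         # was: 10
augmentations:
  # copy the augmentations entry from the CIFAR-100 classification config,
  # so ConfidNet sees the same inputs as the classifier it evaluates
  hflip: true
  random_crop: 32
```

Everything else (epochs, optimizer, learning rate) stays as in the single provided file.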

Wishing you the best of luck with your project.

Lyttonkeepfoing commented 12 months ago

Thanks so much! I'm sorry, I have one more question. In the metrics code:

```python
if "fpr_at_95tpr" in self.metrics:
    for i, delta in enumerate(np.arange(
        self.proba_pred.min(),
        self.proba_pred.max(),
        (self.proba_pred.max() - self.proba_pred.min()) / 10000,
    )):
        tpr = len(self.proba_pred[(self.accurate == 1) & (self.proba_pred >= delta)]) / len(
            self.proba_pred[(self.accurate == 1)]
        )
        print(tpr, 'tpr')
        if i % 100 == 0:
            print(f"Threshold:\t {delta:.6f}")
            print(f"TPR: \t\t {tpr:.4%}")
            print("------")
        if 0.9505 >= tpr >= 0.9495:
            print(f"Nearest threshold 95% TPR value: {tpr:.6f}")
            print(f"Threshold 95% TPR value: {delta:.6f}")
            fpr = len(
                self.proba_pred[(self.errors == 1) & (self.proba_pred >= delta)]
            ) / len(self.proba_pred[(self.errors == 1)])
            scores["fpr_at_95tpr"] = {"value": fpr, "string": f"{fpr:05.2%}"}
            break
```

Here, sometimes the TPR skips past the 0.9495–0.9505 window, so the loop finishes without ever setting scores["fpr_at_95tpr"], and I then get a KeyError. Do you know what the problem is? When I train on CIFAR-10, everything is normal, but when I train on CIFAR-100 the TPR is really weird, e.g. 0.0025906735751295338.
Lyttonkeepfoing commented 12 months ago

I think the key is this stride: `(self.proba_pred.max() - self.proba_pred.min()) / 100000`. Is there a solution where I don't need to change the 100000?

chcorbi commented 12 months ago

Yes, you can edit the stride there; 100000 was a good trade-off between running time and accuracy when computing the TPR.
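A stride-free alternative is to try every distinct confidence value as a candidate threshold and keep the one whose TPR is closest to 95%, so no step count or tolerance window is needed and the metric can never be silently skipped. This is a sketch, not the repository's code; the variable names mirror the snippet above:

```python
import numpy as np

def fpr_at_95tpr(proba_pred, accurate):
    """FPR at the candidate threshold whose TPR is closest to 95%.

    Every distinct confidence value is evaluated as a threshold, so
    no stride or 0.9495..0.9505 tolerance window is required.
    """
    errors = 1 - accurate
    n_pos = (accurate == 1).sum()  # number of correct predictions
    n_neg = (errors == 1).sum()    # number of errors
    best_fpr, best_gap = 1.0, np.inf
    for delta in np.unique(proba_pred):
        tpr = ((accurate == 1) & (proba_pred >= delta)).sum() / n_pos
        gap = abs(tpr - 0.95)
        if gap < best_gap:  # keep the threshold closest to 95% TPR
            best_gap = gap
            best_fpr = ((errors == 1) & (proba_pred >= delta)).sum() / n_neg
    return best_fpr

# Toy check: 20 correct predictions with confidences 0.5..0.99,
# 2 errors at 0.1 and 0.55. The 95%-TPR threshold keeps 19/20
# correct samples, and one of the two errors sits above it.
proba_pred = np.concatenate([np.linspace(0.5, 0.99, 20), [0.1, 0.55]])
accurate = np.array([1] * 20 + [0, 0])
print(fpr_at_95tpr(proba_pred, accurate))  # → 0.5
```

This scans all thresholds in O(n log n) after the unique/sort, which is cheap next to model evaluation, and it always returns a value, avoiding the KeyError.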