ptrckhmmr / learning-to-defer-with-limited-expert-predictions

Code for "Learning to Defer with Limited Expert Predictions" (AAAI 2023)

Some questions about Figure 2 in the paper "Learning to Defer with Limited Expert Predictions" #1

Closed hechengbo-H closed 1 year ago

hechengbo-H commented 1 year ago

Hello, I came across a few things I didn't quite understand. Could you help me with some answers? Thank you and your team very much.

The question is as follows:

First question: Is the baseline not shown because its accuracy is below 90%?

Second question: Why is "complete expert prediction" always at 100% accuracy?

Third question: Why do some lines in the graph (e.g., Embedding SVM in the upper right corner) exceed 100% accuracy?

[Screenshot: Figure 2 from the paper]

lukasthede commented 1 year ago

Hello hechengbo-H,

Thank you for reaching out. I'm more than happy to address your questions.

[Q1] Yes, we opted to only showcase the higher of the two lower bounds (Human Expert Alone, Classifier Alone) in our figures to ensure clarity and ease of interpretation.

[Q2] In the particular figure you mentioned, we present our outcomes in relation to the accuracy of systems trained with a complete set of expert labels which represents the upper boundary. This choice enables us to make direct comparisons across various learning-to-defer systems. To achieve this, we normalize the accuracies of all approaches by dividing them by the accuracy of the corresponding system trained with a complete set of expert labels.
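The normalization can be illustrated with a toy calculation (the numbers below are made up for illustration, not results from the paper):

```python
# Relative accuracy = accuracy of the system trained with artificial expert
# labels, divided by the accuracy of the same system trained with the
# complete set of expert labels (the upper bound). Values are placeholders.
acc_complete = 0.92      # system trained with all expert labels
acc_artificial = 0.90    # system trained with artificial expert labels

relative_accuracy = acc_artificial / acc_complete
print(f'{relative_accuracy:.1%}')  # 97.8%
```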

[Q3] While the learning-to-defer system's performance trained on a complete set of expert labels in our setting intuitively serves as an upper boundary, it's important to note that this isn't an absolute rule. Theoretically, though unlikely, scenarios exist where a system trained on artificial expert labels slightly outperforms the upper boundary. In our experiments, we observed this in some edge cases with a high number of available expert labels (e.g. l=5000).

I hope this clarifies your queries. Feel free to raise more questions if needed.

Best regards, Lukas Thede

hechengbo-H commented 1 year ago

Hi, I apologize for the inconvenience, but I am genuinely fascinated by your work. I have attempted to run your code following the instructions in your README.md file.

However, I have encountered an issue while running learning-to-defer-with-limited-expert-predictions-main/Embedding-Semi-Supervised/Train_emb_model.py.

[Screenshot: error traceback from running Train_emb_model.py]

The error in this image occurs at the line "print(f'check: {self.train_data.images[10]}')" in the file "learning-to-defer-with-limited-expert-predictions-main/Embedding-Semi-Supervised/feature_extractor/emb_model_lib.py". It raises an AttributeError: 'CIFAR100_Dataset' object has no attribute 'images'.

All the best,

Chengbo He

hechengbo-H commented 1 year ago

[Screenshot: full error traceback]

lukasthede commented 1 year ago

Hi,

thank you for pointing out this bug. For working with CIFAR100, changing self.train_data.images to self.train_data.data should solve the problem.
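A minimal stand-in for the failing line, assuming the dataset follows torchvision's CIFAR100 convention of exposing the raw images via a `.data` attribute; the class body and array shapes here are illustrative, not the repository's actual code:

```python
import numpy as np

# Dummy stand-in for the repository's CIFAR100_Dataset wrapper: torchvision's
# CIFAR100 stores its images in `.data` (a numpy array of shape (N, 32, 32, 3)),
# not `.images`, which is what caused the AttributeError.
class CIFAR100_Dataset:
    def __init__(self):
        self.data = np.zeros((20, 32, 32, 3), dtype=np.uint8)  # placeholder images

ds = CIFAR100_Dataset()
# the failing line, with `.images` replaced by `.data`:
print(f'check: {ds.data[10].shape}')  # (32, 32, 3)
```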

Best, Lukas Thede

hechengbo-H commented 1 year ago

Hi, thank you for your suggestion. I have already applied it. I also wanted to ask: which file was used to generate Figure 2 in the paper? The workload for this is quite large, and I have many things to ask about. I am truly sorry.

Best, Chengbo He

lukasthede commented 1 year ago

Hi, The script for Figure 2 is not included in our repository but can easily be reimplemented using Matplotlib.

For Figure 2, we divide the accuracy of the learning-to-defer framework trained with artificial expert labels by the accuracy of the framework trained with complete expert annotations. We do this for each learning-to-defer framework and approach (i.e. Embedding-SL, Embedding-SSL, and SSL). We then plot these relative accuracies against the number of expert annotations used to train the artificial expert (i.e. l).
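The procedure described above could be sketched roughly as follows with Matplotlib; all accuracy values, label counts, and the script itself are hypothetical placeholders, not the authors' actual figure code or results:

```python
import matplotlib
matplotlib.use('Agg')  # render without a display
import matplotlib.pyplot as plt

num_labels = [40, 100, 500, 1000, 5000]  # l: expert annotations (placeholder values)
acc_complete = 0.92                      # framework trained with complete expert labels

# accuracy of the framework trained with artificial expert labels, per approach
# (all numbers invented for illustration)
acc_artificial = {
    'Embedding-SL':  [0.85, 0.87, 0.90, 0.91, 0.92],
    'Embedding-SSL': [0.86, 0.88, 0.90, 0.91, 0.92],
    'SSL':           [0.84, 0.86, 0.89, 0.90, 0.91],
}

# normalize by the complete-label upper bound, in percent
relative = {name: [a / acc_complete * 100 for a in accs]
            for name, accs in acc_artificial.items()}

for name, values in relative.items():
    plt.plot(num_labels, values, marker='o', label=name)

plt.xscale('log')
plt.xlabel('Number of expert annotations (l)')
plt.ylabel('Relative accuracy (% of complete-label system)')
plt.legend()
plt.savefig('figure2_sketch.png')
```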

Best, Lukas Thede

hechengbo-H commented 1 year ago

Hi, I see. Thank you for your reply.

All the best, Chengbo He