Open kdg1993 opened 1 year ago
In the paper that ranked 2nd on the CheXpert benchmark, I found that various experiments on uncertain labels were conducted (https://arxiv.org/abs/1911.06475).
A brief summary of the experimental setting is as follows, and the results table for the experiment is attached below.
- default setting: U-Ignore, U-Ones, U-Zeros
- additional policy:
  - CT (Conditional Training): takes the hierarchical structure between labels into account
  - LSR (Label Smoothing Regularization): softens the converted uncertain labels into soft targets instead of hard 0/1 (a rough code sketch of these policies is below)
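To make the policies concrete, here is a minimal sketch of how I understand them. The function name and the LSR ranges (roughly U(0.55, 0.85) / U(0, 0.3)) are my reading of the paper, not code from our repo, so please correct me if I got them wrong:

```python
import numpy as np

def apply_uncertainty_policy(labels, policy="U-Ones", rng=None):
    """Map CheXpert uncertain labels (-1) according to a policy.

    labels: float array containing {1, 0, -1}; returns (labels, mask), where
    mask marks which entries should contribute to the loss.
    """
    if rng is None:
        rng = np.random.default_rng()
    labels = labels.astype(np.float32).copy()
    uncertain = labels == -1
    mask = np.ones_like(labels, dtype=bool)

    if policy == "U-Ignore":
        mask[uncertain] = False                 # drop uncertain entries from the loss
    elif policy == "U-Ones":
        labels[uncertain] = 1.0
    elif policy == "U-Zeros":
        labels[uncertain] = 0.0
    elif policy == "U-Ones+LSR":
        # smoothed positive targets for uncertain entries
        labels[uncertain] = rng.uniform(0.55, 0.85, size=uncertain.sum())
    elif policy == "U-Zeros+LSR":
        # smoothed negative targets for uncertain entries
        labels[uncertain] = rng.uniform(0.0, 0.3, size=uncertain.sum())
    else:
        raise ValueError(f"unknown policy: {policy}")
    return labels, mask
```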
Presumably, the fact that LibAUC applies a different policy to only two of the columns means they ran several experiments themselves and chose the most accurate option per column.
Thank you so much for resolving my question about the reason for converting the labels, @jieonh 👍
If I understand what you shared correctly, the table justifies the conversion by validation score. I couldn't agree more that a score-based method is a concrete and well-supported way to choose an experimental setting.
However, in terms of giving the users of our testbed convincing options for their experiments, I still think my suggestion to expand the data-conversion options is worth doing.
So first, I want to ask whether this work is worth doing. Second, if it is, I'd like to know whether anyone is interested in doing it. If it is worth doing but everyone is busy, then I think it's on me to do it 😄 Please let me know your opinion; even a single word of reply is totally fine and appreciated.
When I looked it up a little bit more, it seemed that there is a lot of ongoing research on uncertainty quantification. I guess that's because the importance of data-centric AI is emerging these days, so I agree that further investigating the data itself is worthwhile.
I'm not sure I can fully concentrate on that task for now, but I can assist you or do some research to catch up (if that would be any help!).
+) Does anyone know the exact difference between uncertain labels (-1) and missing values (NaN)? I'm a little bit confused.
In addition, I found a detailed datasheet for CheXpert for those of you who may be interested: https://arxiv.org/pdf/2105.03020.pdf
You can refer to pp. 3-6 for the info we are looking for (the labeling protocol)! In summary, the labeling section of this sheet explains how the labels are assigned based on the keywords found in the report, and how the 'No Finding' label is assigned (which fully addressed my concern about normal data).
+) I think this sheet might also explain the difference between the -1 and NaN labels that @jieonh just asked about. You can refer to Table 3 of the sheet, which describes the label definitions.
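If I read Table 3 correctly, -1 means the labeler found an explicitly uncertain mention in the report, while NaN means the observation was simply not mentioned at all. If anyone wants to see both cases in the data, here is a quick pandas check; the file path and the column slicing are assumptions based on the original CheXpert-v1.0-small CSV layout:

```python
import pandas as pd

# Assumed local path; adjust to wherever the CheXpert CSV lives
df = pd.read_csv("CheXpert-v1.0-small/train.csv")
label_cols = df.columns[5:]  # the 14 observation columns in the original layout

summary = pd.DataFrame({
    "uncertain (-1)": (df[label_cols] == -1).sum(),
    "not mentioned (NaN)": df[label_cols].isna().sum(),
})
print(summary)
```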
What a nice reference you shared, @chrstnkgn! :satisfied:
I haven't fully read the whole paper yet, but what you shared already resolved some of my questions! In particular, pp. 2-5, Fig. 1, and Table 3 are thoroughly informative and convinced me that converting the uncertain labels (-1) is worth doing for the sake of similarity between the train and validation sets.
What
Why
While looking around the target class distribution of the CheXpert CSV data, I found an interesting possibility for data handling. The figure below is a snapshot of the target distribution from my personal exploration of CheXpert.
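For anyone who wants to reproduce the snapshot, it roughly came from counting the label values per observation column, something like the sketch below (the path and the column slicing are assumptions about the original CSV layout):

```python
import pandas as pd

df = pd.read_csv("CheXpert-v1.0-small/train.csv")  # assumed path
label_cols = df.columns[5:]                         # the 14 observation columns

# Count 1 / 0 / -1 / NaN per observation column
dist = df[label_cols].apply(lambda s: s.value_counts(dropna=False)).fillna(0).astype(int)
print(dist)
```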
Meanwhile, our current custom Dataset class converts these values in a fixed way (not sure, but I guess this way of converting was chosen based on score).
Likewise, I think there are many statistical or intuitive ways of handling missing values from the traditional ML field that could be applied here. So I want to discuss it, and I would carefully like to ask for help making this idea usable in our custom code.
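As a starting point for the discussion, here is a hypothetical sketch of what an expanded conversion option could look like; the function name, the `default`/`overrides` arguments, and the policy names are all made up for illustration, not something that exists in our code or in LibAUC:

```python
import numpy as np

def convert_uncertain(labels, columns, default="zeros", overrides=None):
    """labels: 2D float array with values in {1, 0, -1, nan}; columns: column names."""
    overrides = overrides or {}
    out = np.nan_to_num(labels.astype(np.float32), nan=0.0)  # blank (not mentioned) -> 0, my assumption
    for j, col in enumerate(columns):
        policy = overrides.get(col, default)
        uncertain = labels[:, j] == -1
        if policy == "ones":
            out[uncertain, j] = 1.0
        elif policy == "zeros":
            out[uncertain, j] = 0.0
        elif policy == "lsr-ones":  # soft labels, borrowing the LSR idea from the paper above
            out[uncertain, j] = np.random.uniform(0.55, 0.85, uncertain.sum())
        else:
            raise ValueError(f"unknown policy: {policy}")
    return out

# e.g. zeros by default, U-Ones only for the classes the paper reports it works better on
# (if I read it right):
# y = convert_uncertain(y_raw, cols, default="zeros",
#                       overrides={"Atelectasis": "ones", "Edema": "ones"})
```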
FYI, I included the distribution of the valid set just for sharing knowledge, but I'm afraid that taking the validation-set distribution into account might lead to a data leakage issue. Probably everyone knows this already, but I mention it just as a reminder 😄
How