Closed richelbilderbeek closed 2 years ago
Hello
The average reported there is the micro-averaged F1 score, calculated by this function: here.
It is calculated globally over the classes based on the total true positives, false negatives and false positives, and can't be derived from the numbers in this table alone. This is why we chose to print it there, whereas the macro-average and weighted average can be calculated from the per-class F1 scores in this table.
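To make the difference between the three averages concrete, here is a minimal sketch in plain Python. It is not GenoCAE's code: the class names and the per-class counts in `counts` are made up for illustration, and the helper `f1` is a hypothetical name. It assumes single-label classification, so the false positives and false negatives sum to the same total across classes.

```python
# Hypothetical per-class counts: (true positives, false positives, false negatives).
# These numbers are invented for illustration; they are not GenoCAE output.
counts = {
    "A": (8, 2, 2),
    "B": (1, 3, 4),
    "C": (40, 5, 4),
}


def f1(tp, fp, fn):
    """F1 score from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Per-class F1 and support (number of true samples per class): the kind of
# values a per-class results table would typically list.
per_class = {name: f1(*c) for name, c in counts.items()}
support = {name: tp + fn for name, (tp, fp, fn) in counts.items()}

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class.values()) / len(per_class)

# Weighted average: mean of the per-class F1 scores, weighted by support.
weighted_f1 = sum(per_class[k] * support[k] for k in per_class) / sum(support.values())

# Micro average: pool the raw counts over all classes first, then compute F1 once.
tp_sum = sum(tp for tp, _, _ in counts.values())
fp_sum = sum(fp for _, fp, _ in counts.values())
fn_sum = sum(fn for _, _, fn in counts.values())
micro_f1 = f1(tp_sum, fp_sum, fn_sum)

print(f"macro={macro_f1:.4f} weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The macro and weighted averages only need `per_class` and `support`, while the micro average needs the pooled raw counts, so the three values generally differ.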
I'm not sure of the utility of this measurement for this particular application; in our paper we reported the weighted average F1 score over the classes.
I see how this is confusing, though, so I will write an explanation in the README to document the behaviour.
Thanks for bringing it up, K
Thanks for clearing that up :+1:
Dear GenoCAE maintainers, hi @cnettel and @kausmees,
As you are back, I would like to submit something I found unexpected (discussed here from my point of view). If you did not expect this either, I'd happily create a minimal reproducible example.
When using `evaluate` with a `superpops` file, in one of my cases I got the following:
The unexpectedness is in the last line, which suggests that it holds the average, but appears to do different things per column (I understand that the first column, `num_samples`, uses a sum there :-) ).
I would expect the averages to be:
I checked: these 'averages' are neither the harmonic nor the geometric mean either.
What are those values?
If you think these are weird as well, I will happily create a reproducible example. Otherwise, I am happy to learn what these values are :-)