Closed marcelm closed 1 year ago
I thought about adding a hard-coded (100%) after "Unique strobemers" to clarify what the other percentages refer to. What do you think? Like this:
Index statistics
Total strobemers: 26446802
Unique strobemers: 23324540 (100.00%)
1 occurrence: 22560298 ( 96.72%)
2..100 occurrences: 762681 ( 3.27%)
>100 occurrences: 1562 ( 0.01%)
But maybe it’s confusing that this is always 100%?
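For illustration, here is a minimal Python sketch of how such a block could be rendered, with every percentage taken relative to the number of unique strobemers (which is why that row is always 100%). The function name `format_index_stats` and its parameters are hypothetical and not taken from the actual code; the exact spacing of the real log output may differ.

```python
def format_index_stats(total, distinct, once, mid, high):
    """Render index statistics; percentages are relative to `distinct`."""

    def pct(count):
        # Guard against an empty index to avoid division by zero.
        return 100.0 * count / distinct if distinct else 0.0

    rows = [
        ("Unique strobemers", distinct),  # always 100% by construction
        ("1 occurrence", once),
        ("2..100 occurrences", mid),
        (">100 occurrences", high),
    ]
    lines = ["Index statistics", f"Total strobemers: {total}"]
    for label, count in rows:
        # Width 6 keeps "100.00" and " 96.72" aligned in the same column.
        lines.append(f"{label}: {count} ({pct(count):6.2f}%)")
    return "\n".join(lines)


# Numbers taken from the example output above.
print(format_index_stats(26446802, 23324540, 22560298, 762681, 1562))
```

Only the first percentage is fixed; the remaining rows show how the unique strobemers split across the occurrence buckets.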
I like the clarification.
> I thought about adding a hard-coded (100%) after "Unique strobemers"
Okay, but in that case perhaps we should change to "Distinct strobemers" or "Total distinct strobemer hash values" instead of "Unique"?
> Okay, but in that case perhaps we should change to "Distinct strobemers" or "Total distinct strobemer hash values" instead of "Unique"?
Do you suggest this because "unique" could also refer to the strobemers with one occurrence? That’s a good point. I have changed it to "distinct" now. The longer version doesn’t fit well, and the index statistics are only shown when -v is used, so I think it’s fine. (We could at some point add a section to the documentation explaining what the numbers mean and how to interpret them.)
> Do you suggest this because "unique" could also refer to the strobemers with one occurrence?
Yes, that is what I was referring to.
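To make the terminology concrete, here is a small Python sketch that separates the number of distinct hash values from the number of hash values that occur exactly once. It is only an illustration under assumed names (`occurrence_buckets`, `high_threshold`), not the project's actual implementation; the threshold of 100 mirrors the buckets in the example output.

```python
from collections import Counter


def occurrence_buckets(hashes, high_threshold=100):
    """Count distinct hash values and group them by number of occurrences."""
    counts = Counter(hashes)  # hash value -> number of occurrences
    distinct = len(counts)    # number of different hash values
    once = sum(1 for c in counts.values() if c == 1)
    high = sum(1 for c in counts.values() if c > high_threshold)
    return {
        "total": sum(counts.values()),  # all strobemers, with repeats
        "distinct": distinct,
        "once": once,                   # "unique" in the narrow sense
        "mid": distinct - once - high,  # 2..high_threshold occurrences
        "high": high,
    }


# Hash 1 occurs twice; hashes 2 and 3 occur once each.
stats = occurrence_buckets([1, 1, 2, 3])
print(stats["distinct"], stats["once"])
```

In this toy input there are three distinct hash values but only two that occur once, which is the ambiguity that "unique" would have left open.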
OK, great. Good to merge.
This is also part of #298, but I took the opportunity to make the index statistic logging a bit nicer, which I had wanted to do for a while. Before:
After: