Closed victorlin closed 2 months ago
Thanks! Is the linked issue correct? Shouldn't it close #1598 rather than #1588?
Thanks, it should close both. I've updated the PR description.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 70.99%. Comparing base (47c83e0) to head (d73dbee).
There's a reason the original calculation is the way it is. I don't think that's a bug or unnecessary.
Imagine there are 90 groups with 1 sequence and 10 groups with 1000. We want to sample 1000 sequences.
The original calculation would pick around 91 sequences per group.
Yours now picks 10 per group.
The original resulted in around 1000 sampled sequences.
Your calculation now results in only 190.
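The arithmetic in this scenario can be checked with a short sketch. The helper names below are hypothetical and this is not augur's code; it only compares a naive equal split (target divided by the number of groups) with a per-group cap chosen so that small groups' unused share is redistributed.

```python
def total_sampled(group_sizes, per_group):
    """Total sequences kept when each group contributes at most `per_group`."""
    return sum(min(size, per_group) for size in group_sizes)


def largest_cap_within_target(group_sizes, target):
    """Largest integer per-group cap whose total does not exceed `target`.

    Hypothetical helper for illustration only (simple linear search).
    """
    cap = 0
    while cap < max(group_sizes) and total_sampled(group_sizes, cap + 1) <= target:
        cap += 1
    return cap


# Scenario from the comment: 90 groups of 1 sequence, 10 groups of 1000.
group_sizes = [1] * 90 + [1000] * 10
target = 1000

naive_cap = target // len(group_sizes)
adjusted_cap = largest_cap_within_target(group_sizes, target)

naive_total = total_sampled(group_sizes, naive_cap)
adjusted_total = total_sampled(group_sizes, adjusted_cap)

print(f"naive cap {naive_cap} per group gives {naive_total} sequences")
print(f"adjusted cap {adjusted_cap} per group gives {adjusted_total} sequences")
```

Under these assumptions the naive split yields 190 sequences and the adjusted cap yields 1000, matching the figures quoted above.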
@corneliusroemer that scenario is not affected by the changes here. I've responded in more detail at https://github.com/nextstrain/augur/issues/1588#issuecomment-2307563375
I meant to release this as part of 25.4.0 but forgot to merge 🤦. It will come in the next release.
Description of proposed changes
The previous `_calculate_fractional_sequences_per_group()` was an approximation of this exact value. The approximation could return a fractional value above 1, which would fail the assertion in `get_probabilistic_group_sizes()`.
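A minimal sketch of why a fractional value above 1 trips such an assertion. The function below is a hypothetical stand-in, not augur's `get_probabilistic_group_sizes()`; its signature, the Bernoulli draw, and the exact bounds of the assertion are assumptions made for illustration.

```python
import random


def probabilistic_group_sizes(groups, fraction, seed=0):
    """Hypothetical stand-in: each group keeps 1 sequence with probability
    `fraction`, otherwise 0. Only meaningful when `fraction` is below 1."""
    # Assumed guard, mirroring the kind of assertion described above.
    assert 0 < fraction < 1, "fractional sequences per group must be below 1"
    rng = random.Random(seed)
    return {group: int(rng.random() < fraction) for group in groups}


groups = [f"group_{i}" for i in range(100)]

# A valid fractional value: each group contributes 0 or 1 sequence.
sizes = probabilistic_group_sizes(groups, fraction=0.3)

# An approximation that overshoots past 1 fails the guard.
try:
    probabilistic_group_sizes(groups, fraction=1.2)
    overshoot_rejected = False
except AssertionError:
    overshoot_rejected = True

print(sum(sizes.values()), overshoot_rejected)
```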
Related issue(s)
Checklist