Current include/exclude settings don't allow for countries to be force included.
This PR adds that ability in a symmetrical way to exclude.
Because we of course don't want to force include countries that have very little data, a force include is ignored if the country reaches less than 10% of the normal include requirement.
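To make the rule above concrete, here is a minimal sketch of how the selection could behave. It is not the code in this PR: the function name, the parameter names (`seq_counts`, `min_seqs`, `force_include`, `force_floor`) and the choice to let exclude take precedence over force include are all assumptions for illustration.

```python
def select_countries(seq_counts, min_seqs, exclude=(), force_include=(), force_floor=0.10):
    """Return the sorted list of countries to keep (hypothetical helper).

    seq_counts: dict mapping country -> number of sequences in the counting window
    min_seqs: normal include requirement
    force_floor: fraction of min_seqs a force-included country must still reach
    """
    selected = []
    for country, n in seq_counts.items():
        if country in exclude:
            continue  # assumption: an explicit exclude always wins
        if n >= min_seqs:
            selected.append(country)  # normal include rule
        elif country in force_include and n >= force_floor * min_seqs:
            selected.append(country)  # forced, but only above the 10% floor
    return sorted(selected)


# Made-up counts, purely for illustration.
counts = {"USA": 5000, "Brazil": 300, "Peru": 50, "Denmark": 2000}
print(select_countries(counts, min_seqs=1000,
                       exclude={"Denmark"}, force_include={"Brazil", "Peru"}))
# ['Brazil', 'USA']
```

In this example Brazil is kept because 300 is above 10% of 1000, while Peru's force include is ignored because 50 is below that floor.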
I guess instead of having a simple force include, one could have a dict that gives certain countries a laxer include requirement: e.g. Brazil could be included if it has at least 40% of the sequences required of non-force-included countries.
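A rough sketch of that dict-based alternative, again with hypothetical names (`relaxed`, `select_with_relaxed_thresholds`) that do not exist in this PR: each listed country gets its own fraction of the normal requirement, and every other country uses the full requirement.

```python
def select_with_relaxed_thresholds(seq_counts, min_seqs, relaxed=None):
    """Return the sorted list of countries to keep (hypothetical helper).

    relaxed: dict mapping country -> fraction of min_seqs it must reach,
             e.g. {"Brazil": 0.4}; unlisted countries need the full min_seqs.
    """
    relaxed = relaxed or {}
    selected = []
    for country, n in seq_counts.items():
        fraction = relaxed.get(country, 1.0)  # default: no relaxation
        if n >= fraction * min_seqs:
            selected.append(country)
    return sorted(selected)


# Made-up counts, purely for illustration.
counts = {"USA": 5000, "Brazil": 450, "India": 300}
print(select_with_relaxed_thresholds(counts, min_seqs=1000,
                                     relaxed={"Brazil": 0.4, "India": 0.4}))
# ['Brazil', 'USA']
```

The simple force include with a 10% floor is then just the special case where every listed country has a fraction of 0.1.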
I played a bit with the settings, and these seem quite reasonable to me. Notably, we should extend the window within which min-seqs are counted, since many very important countries (India, South Africa, Brazil) produce sequences with a significant delay. That delay doesn't make their data useless for the analysis, so I don't think we should be as strict as we currently are.
Can you take a look at this, @trvrb, @joverlee521, @jameshadfield? It would be nice to have some important countries force included. The current country selection has nothing from South America or Africa.
Description of proposed changes
Here's how the new settings would look: