replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0
39 stars 16 forks source link

added skip-scorpio parameter fixes #234 #235

Closed MarieLataretu closed 2 years ago

MarieLataretu commented 2 years ago
replikation commented 2 years ago

what is the rationale behind it?

hoelzer commented 2 years ago

what is the rationale behind it?

This is a pangolin issue currently discussed in the community. The problem is that scorpio overwrites correct PangoLearn/Usher assignments. For example this happens for recombinants bc/ there are yet no (?) scorpio constellation files for recombinants. But this also happens for BA.4/5 which is even worse. E.g. see discussions here:

https://github.com/cov-lineages/scorpio/issues/47 https://github.com/cov-lineages/pango-designation/issues/713 https://github.com/cov-lineages/pango-designation/issues/645 https://github.com/cov-lineages/scorpio/issues/46 ...

We performed some in-house tests annotation the full DESH collection w/ and w/o scorpio and the results w/o scorpio look more stable and accurate. That's why we want to add a parameter to deactivate scorpio and thus the overwrite of previous pangoLearn/User assignments.

There is also a not yet merged PR (https://github.com/cov-lineages/pangolin/pull/461) about that topic. So maybe the way scorpio works will be changed anyway in the (near) future. But for now, it would be good to have an option to deactivate it.

hoelzer commented 2 years ago

@MarieLataretu but would it be not better to have a general --additional_pango_params ? In case also other options should be given to pangolin in the future...

EDIT: although checking the way you implemented it now is also nice. It's more explicit to have a parameter for deactivation directly in the help

MarieLataretu commented 2 years ago

@MarieLataretu but would it be not better to have a general --additional_pango_params ? In case also other options should be given to pangolin in the future...

EDIT: although checking the way you implemented it now is also nice. It's more explicit to have a parameter for deactivation directly in the help

Yes, it's more explicit and if --additional_pango_params is per default --skip-scorpio it's kind of hard to turn it off on the command line.

Of course --additional_pango_params would be more flexible.

replikation commented 2 years ago

but it sounds like it would be better to deactivate it by default or did i misunderstand this?

MarieLataretu commented 2 years ago

I wasn't sure about changing the default behavior - I can change that to skip scorpio per default, if you agree

hoelzer commented 2 years ago

Yeah, that's the question. We could also deactivate it by default and write that in the release notes (citing also the discussions I mentioned here)? Could be just that when the Pango devs change also the default behavior in the tool itself we also might have to change that back.

Assuming that many people just use the pipeline and report the results it might be even good we make this decision based on the current discussions and tests. So from my site also +1 to make this the current default

replikation commented 2 years ago

i mean most users are probably not reading this so i would just set the defaults to the best practice. :)

MarieLataretu commented 2 years ago

I changed the default to true. Skipping scorpio can be deactivated with --pangolin_skip_scorpio false

replikation commented 2 years ago

can we changed it to just --scorpio to activate it? its more understandable for the user

hoelzer commented 2 years ago

@replikation yeah, I like that it's not so DoppeltGemoppelt

btw the current release vs/ Maries master branch w/ scorpio deactivated: https://twitter.com/martinhoelzer/status/1537769594546442241

MarieLataretu commented 2 years ago

Done. Per default is --scorpio false, so --skip-scorpio is added to pangolin.

replikation commented 2 years ago

Perfect. I'll run some tests and then merge. Thank you