replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Pangolin v4.0 #220

Closed. hoelzer closed this issue 2 years ago

hoelzer commented 2 years ago

An upcoming update will change how pangoLEARN and the related components are used, and aims to consolidate them into a single pangolin-data dependency:

Another thing that will impact this is that pangolin 4.0 is due to be released this week at some point, and that will change the versioning system to be less convoluted than the current one.

In pangolin 4.0, a new dependency pangolin-data will replace the current pangoLEARN and pango-designation dependencies. With this update, the pangolin-data version number will always be the same as the pango-designation version the model was trained on and the usher tree was built on. That removes the conflicting pango-designation versions reported and also removes the pangoLEARN dated versioning system, leaving only a single version that corresponds directly to the version of lineages used in the data.

https://github.com/cov-lineages/pangolin/issues/386#issuecomment-1082088001

We should be aware of that and might have to change some code (version numbers, report, ...) when this update happens.

replikation commented 2 years ago

Yep, this might need some adjustments to the final report @RaverJay, but it will simplify the overview.

hoelzer commented 2 years ago

Example output of the new version:

https://github.com/cov-lineages/pangolin/issues/390

hoelzer commented 2 years ago

Great overview of the changes: https://cov-lineages.org/resources/pangolin/pipeline.html

RaverJay commented 2 years ago

Looks great, this will simplify things a lot

hoelzer commented 2 years ago

And a nice thread summarizing the main changes again

https://twitter.com/AineToole/status/1509876534529638411

RaverJay commented 2 years ago

@replikation when will the container be ready? :)

hoelzer commented 2 years ago

> @replikation when will the container be ready? :)

I guess at the moment when the new pango v4.0 is on bioconda :D

replikation commented 2 years ago

No, we are building from git directly to avoid delays @hoelzer. But the auto builds have a few checks so they don't cause issues in poreCov (e.g. are the columns present?), and they currently tag via the pangolin and pangoLEARN versions. It might be a bit annoying to switch the whole thing over in combination with poreCov (e.g. --update might now break older poreCov versions). So I might need to fork and create a "pangolin4" container going forward, or something like that. Need to check first.

Edit: Yes, the column headers have changed. I need to create a new pangolin fork to avoid issues.
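
For illustration, the kind of "are the columns present" check described above could look roughly like the sketch below. This is not the actual auto-build code. The function name and both column sets are assumptions, and the exact headers should be taken from the lineage report of the pangolin release in question.

```python
import csv
import io

# Hypothetical column sets (assumptions, not taken from the build scripts):
# a pre-4.0 layout with a separate pangoLEARN version column, and a
# 4.0-style layout with a single data "version" column.
REQUIRED_V3 = {"taxon", "lineage", "pangolin_version", "pangoLEARN_version"}
REQUIRED_V4 = {"taxon", "lineage", "pangolin_version", "version"}


def missing_columns(report_text, required):
    """Return the required column names absent from a CSV header, sorted."""
    reader = csv.DictReader(io.StringIO(report_text))
    # fieldnames is None for an empty file, so fall back to an empty list
    present = set(reader.fieldnames or [])
    return sorted(set(required) - present)
```

A build could refuse to tag and push a container whenever `missing_columns(...)` returns a non-empty list, so older poreCov versions never pull an image with an incompatible report layout.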

replikation commented 2 years ago

@RaverJay I guess the HTML report throws an error if it can't get the correct versions?

hoelzer commented 2 years ago

Uhh, you're right @replikation, I thought we built from bioconda. Ah yes, and now older versions of poreCov can basically only use pangolin containers up to a certain release. Okay, that's a bit annoying -.-

RaverJay commented 2 years ago

> @RaverJay I guess the HTML report throws an error if it can't get the correct versions?

Yes, it will fail when the columns it needs are not there
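
One way the report side could tolerate both layouts is sketched below. It is a rough idea only, not the current report code. The helper name and the candidate column names are assumptions.

```python
def data_version(row, candidates=("pangoLEARN_version", "version")):
    """Return (column, value) for the first non-empty version column in a report row.

    `row` is one lineage report line as a dict, e.g. from csv.DictReader.
    The candidate column names are hypothetical and listed in order of preference.
    """
    for column in candidates:
        value = row.get(column)
        if value:
            return column, value
    # Fail with an explicit message instead of a bare KeyError deep in the report code
    raise ValueError(f"none of the version columns {candidates} found in report row")
```

With something like this the report would still fail when no usable column is present, but the error would name the columns it was looking for.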