MDU-PHL / pango-collapse

app to collapse Pango lineages for reporting
https://mdu-phl.github.io/pango-collapse/
GNU General Public License v3.0

Collapse file Usage #12

Closed jwarnn closed 7 months ago

jwarnn commented 8 months ago

I am using pango-collapse as part of a Nextflow workflow that does not have access to the internet. In main.py, line 121, if a collapse file is not provided on the command line with --collapse-file, the file is downloaded from GitHub. This is redundant, as the collapse.txt file is part of the package distribution that is downloaded when installing with pip. I think it would make sense to have the program use this bundled file and keep it updated with releases, rather than redownloading it.

Wytamma commented 7 months ago

Hi @jwarnn,

Sorry I missed this issue.

The collapse_file will only be downloaded if you use --latest or if you provide a URL to download the collapse file from. Otherwise the internal collapse file is used by default; see https://github.com/MDU-PHL/pango-collapse/blob/main/pango_collapse/main.py#L119-L125C76.

This is done so that out-of-date pango_collapse installs can still use the latest collapse file (by passing --latest to stay up to date).
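
The selection rule described above can be sketched roughly as follows. This is an illustration only, not the code in main.py: the function name `resolve_collapse_source`, the constants `BUNDLED_COLLAPSE_FILE` and `LATEST_COLLAPSE_URL`, and the choice to let --latest take precedence over an explicit file are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical placeholders; the real package defines its own locations.
BUNDLED_COLLAPSE_FILE = Path("pango_collapse") / "collapse.txt"
LATEST_COLLAPSE_URL = "https://example.org/collapse.txt"


def is_url(value: str) -> bool:
    """Return True if value looks like an http(s) URL."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def resolve_collapse_source(
    collapse_file: Optional[str] = None, latest: bool = False
) -> Tuple[str, str]:
    """Return (kind, location), where kind is 'download' or 'local'.

    Assumption for this sketch: --latest wins over an explicit file.
    """
    if latest:
        # --latest: fetch the newest collapse file (needs internet access).
        return ("download", LATEST_COLLAPSE_URL)
    if collapse_file is not None and is_url(collapse_file):
        # A URL was given on the command line: download it.
        return ("download", collapse_file)
    if collapse_file is not None:
        # A local path was given: use it as-is, no network needed.
        return ("local", collapse_file)
    # Default: the collapse file shipped with the installed package.
    return ("local", str(BUNDLED_COLLAPSE_FILE))
```

Under these assumptions, only the two "download" branches need network access, so an offline workflow that omits --latest and passes either nothing or a local path never triggers a collapse file download.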

If you don't have internet access then you won't be able to use the --latest flag. Simply create a collapse file and pass it to pango_collapse. You could use the internal collapse file but I doubt you want to collapse all the same lineages as the default file.

Wytamma commented 7 months ago

Ah just tried running offline and it's actually an issue with downloading the alias_key not the collapse_file. The alias_key is required to be up to date because it contains the hierarchical information about the lineages. I will add an offline fallback, however, this may result in lineages not being collapsed if the alias_key file is out of date.
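
An offline fallback of the kind mentioned above could look something like this minimal sketch. It is not the implementation published in the release: the function name `load_alias_key`, its parameters, and the assumption that the alias key is a JSON document with a locally stored copy are all hypothetical.

```python
import json
import urllib.request
import warnings
from pathlib import Path


def load_alias_key(url: str, fallback_path: Path, timeout: float = 10.0) -> dict:
    """Try to download the alias key; fall back to a local copy when offline.

    url and fallback_path are hypothetical parameters for this sketch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except OSError as exc:
        # urllib.error.URLError and socket timeouts are OSError subclasses,
        # so this covers "no network" as well as unreachable hosts.
        warnings.warn(
            f"Could not download alias key ({exc}); using local copy at "
            f"{fallback_path}. Newer lineages may not be collapsed."
        )
        return json.loads(fallback_path.read_text(encoding="utf-8"))
```

The warning matters because a stale local copy fails quietly: lineages designated after the copy was made have no entry in it, so they may be left uncollapsed rather than raising an error.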

Wytamma commented 7 months ago

Published in v0.7.3