jwarnn closed this issue 7 months ago
Hi @jwarnn,
Sorry I missed this issue.
The collapse_file will only be downloaded if you use --latest or if you provide a URL to download the collapse file. Otherwise, the internal collapse file is used by default; see https://github.com/MDU-PHL/pango-collapse/blob/main/pango_collapse/main.py#L119-L125C76.
This is done so that out-of-date pango_collapse installs can still use the latest collapse file (by using the --latest flag to stay up to date).
If you don't have internet access then you won't be able to use the --latest flag. Instead, create a collapse file and pass it to pango_collapse. You could use the internal collapse file, but I doubt you want to collapse all the same lineages as the default file.
Ah, I just tried running offline and it's actually an issue with downloading the alias_key, not the collapse_file. The alias_key is required to be up to date because it contains the hierarchical information about the lineages. I will add an offline fallback; however, this may result in lineages not being collapsed if the alias_key file is out of date.
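For anyone curious what such an offline fallback could look like, here is a rough sketch. It is not the actual pango_collapse code. The function names (load_alias_key, _fetch_text), the bundled-file path argument, and the assumption that the alias_key is stored as JSON are all hypothetical and only illustrate the idea: try the network first, and fall back to a local copy with a warning if that fails.

```python
import json
import sys
import urllib.request
from pathlib import Path


def _fetch_text(url, timeout):
    """Download a URL and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def load_alias_key(url, bundled_path, timeout=10, fetch=_fetch_text):
    """Return (alias_key, source), where source is "remote" or "bundled".

    Assumes the alias_key is a JSON document; bundled_path is a
    hypothetical local copy shipped alongside the package.
    """
    try:
        return json.loads(fetch(url, timeout)), "remote"
    except (OSError, ValueError) as err:
        # Network failures (URLError, timeouts) are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        print(
            f"Could not download alias_key ({err}); using the bundled copy. "
            "Lineages newer than this copy may not be collapsed.",
            file=sys.stderr,
        )
        text = Path(bundled_path).read_text(encoding="utf-8")
        return json.loads(text), "bundled"
```

The warning matters here: with a stale local alias_key the run still completes, but newer lineages may silently stay uncollapsed, so the user should be told which copy was used.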
I am using pango-collapse as part of a Nextflow workflow that does not have access to the internet. In main.py, line 121, if a collapse file is not provided on the command line using the --collapse-file option, the file is downloaded from GitHub. This is redundant, as the collapse.txt file is a part of the package distribution that is downloaded when installed with pip. I think it would make sense to have the program use this file and keep it updated with releases, rather than redownloading the file.