Closed danpolanco closed 5 months ago
I did a fresh install of Nextclade via conda (`conda install -c bioconda nextclade`) and it isn't as up to date as what we are using in the WDL.
(rsv_pipeline) ➜ cdphe-sars-cov-2 git:(fix/nexclade/ref_argument) nextclade dataset list --names-only
error: unexpected argument '--names-only' found
tip: a similar argument exists: '--name'
Usage: nextclade dataset list <--name <NAME>|--search <SEARCH>>
For more information, try '--help'.
I then noticed that the documentation says "Note that new versions may appear on bioconda with some delay (hours to days). This is due to long submission and approval cycle of bioconda. We recommend using standalone installation or Docker containers for most up-to-date versions."
So for testing, I'm going to use the Docker version.
The Docker version (`docker pull nextstrain/nextclade:latest`) also complains that the `--names-only` flag is invalid, and it is the same version as the conda package (3.0.0).
Here is the output of `nextclade dataset list`:
Which means we now need to use:
--name='nextstrain/sars-cov-2/wuhan-hu-1/proteins'
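If the download step is ever scripted outside the WDL, it could look roughly like the sketch below. It assumes `nextclade` is on `PATH`; the helper names and the default output directory are hypothetical and not taken from the pipeline. The dataset name is the one shown above.

```python
import subprocess
from typing import List

# Dataset name reported by `nextclade dataset list` in 3.0.0 (from this issue).
DATASET_NAME = "nextstrain/sars-cov-2/wuhan-hu-1/proteins"


def dataset_get_args(name: str, output_dir: str) -> List[str]:
    """Build the argument list for `nextclade dataset get` (hypothetical helper)."""
    return [
        "nextclade", "dataset", "get",
        f"--name={name}",
        f"--output-dir={output_dir}",
    ]


def download_dataset(name: str = DATASET_NAME,
                     output_dir: str = "nextclade_dataset") -> None:
    """Run the download; raises CalledProcessError if nextclade exits non-zero."""
    subprocess.run(dataset_get_args(name, output_dir), check=True)
```

Keeping the argument list in its own function makes it easy to log or unit-test the exact command without invoking Nextclade.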
Using `--name='nextstrain/sars-cov-2/wuhan-hu-1/proteins'` fixed the download step, but now I can see there are more issues downstream caused by the change to Nextclade 3.0.0:
Traceback (most recent call last):
  File "/cromwell_root/terra_workspace_references/covid/nextclade_json_parser.py", line 219, in <module>
    extract_variant_list(json_path = nextclade_json, project_name = project_name, workflow_version = workflow_version)
  File "/cromwell_root/terra_workspace_references/covid/nextclade_json_parser.py", line 78, in extract_variant_list
    gene=item['gene']
KeyError: 'gene'
I'm assuming the Nextclade output format changed, which broke `nextclade_json_parser.py`.
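One possible way to make the parser tolerate both output formats is a small lookup helper, sketched below. The helper name `get_gene_name` is hypothetical, and `cdsName` as the v3 replacement for `gene` is an assumption that should be verified against a real 3.0.0 JSON file and the migration guide before relying on it.

```python
# Candidate keys for the gene/CDS name in one substitution record.
# 'gene' is what the parser reads today (see the traceback above);
# 'cdsName' is an ASSUMPTION about the v3 output, not confirmed in this issue.
GENE_KEYS = ("gene", "cdsName")


def get_gene_name(item: dict) -> str:
    """Return the gene/CDS name from a record, trying each known key in order."""
    for key in GENE_KEYS:
        if key in item:
            return item[key]
    # Fail loudly and list what the record actually contains, which makes the
    # next format change easier to diagnose than a bare KeyError: 'gene'.
    raise KeyError(
        f"none of {GENE_KEYS} found; available keys: {sorted(item)}"
    )
```

In `extract_variant_list`, the failing line would then become `gene = get_gene_name(item)`. Other fields in the record may also have been renamed, so this alone is unlikely to be a complete fix.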
That means it might be better to use the previous version of nextclade for this week's WWT results and then fix this issue.
There are many more changes mentioned in the Nextclade V3 Migration Guide, so for now, to fix this issue we are going to hardcode Nextclade 2.14.0 into the lineage calling WDL.
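To catch this kind of silent upgrade earlier, the parser or a wrapper script could check the Nextclade major version before running. The following is a minimal sketch, assuming `nextclade --version` prints a string containing an `x.y.z` version number; the function names are hypothetical.

```python
import re
import subprocess

PINNED_MAJOR = 2  # matches the 2.14.0 pin described above


def parse_major_version(version_output: str) -> int:
    """Extract the major version from text containing an x.y.z version number."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def check_nextclade_version(expected_major: int = PINNED_MAJOR) -> None:
    """Raise if the installed Nextclade major version differs from the pin."""
    result = subprocess.run(
        ["nextclade", "--version"],
        capture_output=True, text=True, check=True,
    )
    major = parse_major_version(result.stdout)
    if major != expected_major:
        raise RuntimeError(
            f"expected Nextclade {expected_major}.x, found: {result.stdout.strip()}"
        )
```

Failing at startup with an explicit version message is easier to diagnose than a `KeyError` deep in the JSON parser.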
Nextclade removed the `--reference` flag: https://github.com/CDPHE-bioinformatics/CDPHE-SARS-CoV-2/blob/e73df95c0ba4cdca7c7ea8e6561b70279879e2b9/workflows/SC2_lineage_calling_and_results.wdl#L191