nextstrain / forecasts-ncov

SARS-CoV-2 variant growth rates and frequency forecasts
https://nextstrain.org/sars-cov-2/forecasts/

Model workflow failing #104

Closed joverlee521 closed 1 month ago

joverlee521 commented 1 month ago

Context

The automated model workflows have been failing since Wednesday, July 3rd, with errors from the modify-lineage-colours-and-order.py script:

[batch] [2024-07-04T02:41:20+00:00] Traceback (most recent call last):
[batch] [2024-07-04T02:41:20+00:00]   File "/nextstrain/build/./scripts/modify-lineage-colours-and-order.py", line 126, in <module>
[batch] [2024-07-04T02:41:20+00:00]     data['metadata']['variants'] = order_lineages(data['metadata']['variants'][0:-1], aliasor) + [data['metadata']['variants'][-1]]
[batch] [2024-07-04T02:41:20+00:00]   File "/nextstrain/build/./scripts/modify-lineage-colours-and-order.py", line 25, in order_lineages
[batch] [2024-07-04T02:41:20+00:00]     return sorted(lineages,key=_lineage_sortable)
[batch] [2024-07-04T02:41:20+00:00]   File "/nextstrain/build/./scripts/modify-lineage-colours-and-order.py", line 24, in _lineage_sortable
[batch] [2024-07-04T02:41:20+00:00]     return "/".join([(f"{x:>3}" if i==0 else f"{int(x):03}") for i,x in enumerate(lin_full.split('.'))])
[batch] [2024-07-04T02:41:20+00:00]   File "/nextstrain/build/./scripts/modify-lineage-colours-and-order.py", line 24, in <listcomp>
[batch] [2024-07-04T02:41:20+00:00]     return "/".join([(f"{x:>3}" if i==0 else f"{int(x):03}") for i,x in enumerate(lin_full.split('.'))])
[batch] [2024-07-04T02:41:20+00:00] ValueError: invalid literal for int() with base 10: '1)'

joverlee521 commented 1 month ago

The error comes from 24A (JN.1) being in the clade column of the seq counts files on S3: the sort key splits the name on "." and then fails on int('1)'). It looks like there are Nextstrain clades in the pango_lineages seq counts files, so something went wrong in ingest.
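
For reference, a minimal sketch that reproduces the failure outside the workflow. The key expression is copied from line 24 of the traceback; the names lineage_sort_key and try_sort_key are hypothetical, and it assumes the raw name reaches the sort key unchanged (in the real script the value may first pass through the aliasor).

```python
def lineage_sort_key(lin_full: str) -> str:
    # Same expression as line 24 of the traceback: the first component is
    # right-aligned as text, every later component must parse as an integer.
    return "/".join(
        [(f"{x:>3}" if i == 0 else f"{int(x):03}") for i, x in enumerate(lin_full.split("."))]
    )


def try_sort_key(name: str) -> str:
    # Hypothetical helper: return the key, or the error message if int() fails.
    try:
        return lineage_sort_key(name)
    except ValueError as err:
        return f"ValueError: {err}"


print(try_sort_key("JN.1"))        # a Pango lineage sorts fine
print(try_sort_key("24A (JN.1)"))  # a Nextstrain clade name fails on the '1)' component
```

The second call fails with the same ValueError as in the batch log, because splitting "24A (JN.1)" on "." leaves "1)" as the last component.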

joverlee521 commented 1 month ago

The seq counts files are generated with summarize-clade-sequence-counts, which only groups the data, so I'm looking into ncov-ingest to see why Nextstrain clades are showing up in the original Nextclade_pango column.
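
A rough sketch of how one might scan a metadata TSV for clade-style values in that column. This is not part of ncov-ingest: the function count_clade_like, the regex, and the sample rows are all made up for illustration, and the pattern assumes Nextstrain clade names look like "24A" optionally followed by a parenthesised label.

```python
import csv
import io
import re
from collections import Counter

# Assumed shape of a Nextstrain clade name: two digits, one capital letter,
# and optionally a label in parentheses, e.g. "24A (JN.1)".
CLADE_LIKE = re.compile(r"^\d{2}[A-Z]( \(.*\))?$")


def count_clade_like(tsv_handle, column="Nextclade_pango"):
    """Count values in `column` that look like Nextstrain clades."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if CLADE_LIKE.match(value):
            counts[value] += 1
    return counts


# Illustrative rows only, not real data.
sample = io.StringIO(
    "strain\tNextclade_pango\n"
    "sample_1\tJN.1\n"
    "sample_2\t24A (JN.1)\n"
    "sample_3\tXBB.1.5\n"
    "sample_4\t24A (JN.1)\n"
)
suspicious = count_clade_like(sample)
print(dict(suspicious))
```

Running something like this over the real file (opened with open(path, newline="")) would show which clade-style values leaked into the column and how often.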

joverlee521 commented 1 month ago

This should resolve itself once https://github.com/nextstrain/ncov-ingest/issues/456 has been resolved. Will check tomorrow.

joverlee521 commented 1 month ago

Verified model runs for GISAID and open completed successfully.