theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[IRMA] Output FASTA file for HA segment of Flu B samples is empty #437

Closed kapsakcj closed 2 weeks ago

kapsakcj commented 2 months ago

:bug:

:pencil: Describe the Issue

With Flu B samples, I've observed that the output FASTA file for the HA segment is empty. I think something in the IRMA WDL task needs to be adjusted.

I don't think this occurs with Flu A samples, as I've seen those outputs produced correctly by a few different labs using TheiaCoV_Illumina_PE v2.0.0.

:repeat: How to Reproduce

:fishing_pole_and_fish: Expected Behavior

I expect the HA segment FASTA file to contain sequence, and Nextclade to run on this file and produce the expected Nextclade outputs (aa_subs, aa_dels, clade, etc.).
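As a rough illustration of the expected behavior, the sketch below checks whether a FASTA file actually contains sequence before it is handed to a downstream step such as Nextclade. It is not taken from the IRMA WDL task; the function names `fasta_sequence_length` and `has_sequence` are hypothetical, and the check assumes a plain-text, uncompressed FASTA file.

```python
from pathlib import Path


def fasta_sequence_length(path):
    """Return the total number of sequence characters in a FASTA file.

    Header lines (starting with '>') and blank lines are not counted,
    so an empty file and a header-only file both return 0.
    """
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith(">"):
                total += len(line)
    return total


def has_sequence(path):
    """Return True if the file exists and holds at least one sequence character."""
    return Path(path).is_file() and fasta_sequence_length(path) > 0
```

A check like `has_sequence("ha_segment.fasta")` (the file name here is only a placeholder) would return `False` for the empty HA output described above, which makes the failure easy to flag before Nextclade runs.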

:floppy_disk: Version Information

Occurs on v2.0.0.

:information_source: Additional Information

There are a few other known issues and optimizations for IRMA and Flu-related analysis, so this fix could be coupled with those:

https://github.com/theiagen/public_health_bioinformatics/issues/426

https://github.com/theiagen/public_health_bioinformatics/issues/412

https://github.com/theiagen/public_health_bioinformatics/issues/409

This fix could also be coupled with updating to the IRMA v1.1.5 Docker image, which is now available on Docker Hub: https://hub.docker.com/r/cdcgov/irma/tags