theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[Mercury] potential issue with mercury_prep_n_batch with clearlabs data #309

Closed kapsakcj closed 1 month ago

kapsakcj commented 6 months ago

:bug:

:pencil: Describe the Issue

A user notified us that after running the Mercury_prep_n_batch workflow on a set of SC2 genomes sequenced on a Clear Labs platform, the SRA metadata sheet and the GISAID metadata sheet disagreed in two corresponding columns.

For the SRA metadata, the column raw_sequence_data_processing_method was populated with TheiaCoV (PHB v1.2.1): Medaka via artic 1.3.0

For the GISAID metadata, the column covv_assembly_method was populated with Clear Dx SARS-CoV-2 WGS v3.0

:repeat: How to Reproduce

I can provide the terra workspace, workflow, & affected set privately.

:fishing_pole_and_fish: Expected Behavior

The user expected both columns to state Clear Dx SARS-CoV-2 WGS v3.0, since they were submitting Clear Labs-generated FASTA files, not FASTA files generated via TheiaCoV workflows.

:floppy_disk: Version Information

Mercury_prep_n_batch v1.2.1

:information_source: Additional Information

I'm not intimately familiar with this workflow, so I don't have a recommended solution, but I wanted to document this issue so that we can work on a fix (whether that involves a code change or another approach).

kapsakcj commented 1 month ago

Closing, as this issue was resolved by editing the metadata TSV files prior to submission to GISAID & SRA. The user has not communicated about this issue in >1 year.
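
For anyone who hits the same mismatch, the manual TSV edit described above could be scripted roughly as follows. This is only a sketch, not part of Mercury or PHB: the helper `override_column` and the `sample_name` column are hypothetical, and it assumes the SRA sheet is a plain tab-separated file with a single header row. The column name and the two values are the ones quoted in this issue.

```python
import csv
import io


def override_column(tsv_text, column, new_value):
    """Return a copy of tsv_text with every row's `column` set to `new_value`.

    Hypothetical helper for illustration; assumes a tab-separated sheet
    with one header row.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in header")

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = new_value  # overwrite the workflow-populated value
        writer.writerow(row)
    return out.getvalue()


# Example SRA sheet; `sample_name` is an assumed column, not taken from the issue.
sra_sheet = (
    "sample_name\traw_sequence_data_processing_method\n"
    "S1\tTheiaCoV (PHB v1.2.1): Medaka via artic 1.3.0\n"
)

fixed_sheet = override_column(
    sra_sheet,
    "raw_sequence_data_processing_method",
    "Clear Dx SARS-CoV-2 WGS v3.0",
)
print(fixed_sheet)
```

To apply this to real files, read the TSV into a string, pass it through the helper, and write the result to a new file so the original sheet is preserved for comparison.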