theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[Mercury_Prep_N_Batch] add state to country #399

Closed sage-wright closed 5 months ago

sage-wright commented 6 months ago

This PR closes #396

I haven't tested this personally, but I am trusting pandas.

🗑️ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

Appends the state to the country column, as {country}:{state}, in accordance with the new NCBI style.

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : No

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed:

Databases or database versions changed:

Data processing/commands changed:

File processing changed:

The state is appended to the country value when a state is provided.

Compute resources changed:

➡️ Inputs

⬅️ Outputs

:test_tube: Testing

Test Dataset

Commandline Testing with MiniWDL or Cromwell (optional)

Terra Testing

Suggested Scenarios for Reviewer to Test

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

🎯 Reviewer Checklist

🗂️ Associated Documentation (to be completed by Theiagen developer)

kevinlibuit commented 6 months ago

Giving an approval from my end as I can confirm that, if a state is defined in the input table, this PR will update the GenBank country field to include the state value as {country}:{state}.
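For reviewers who want to see the described behavior in isolation, here is a minimal sketch of the {country}:{state} rule. It is not the code from this PR: the PR relies on pandas, whereas this sketch uses plain dictionaries so it runs without dependencies. The function name `add_state_to_country` and the `sample_id` column are hypothetical, and treating a blank or missing state as "not provided" is an assumption.

```python
def add_state_to_country(rows):
    """Return new rows where country becomes 'country:state' when a state is given.

    Hypothetical helper for illustration; not the function used in the workflow.
    """
    updated = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are left untouched
        # Assumption: None, "" and whitespace-only values mean "no state provided".
        state = (new_row.get("state") or "").strip()
        if state:
            new_row["country"] = f"{new_row['country']}:{state}"
        updated.append(new_row)
    return updated


# Illustrative input table; the column names are assumptions.
rows = [
    {"sample_id": "s1", "country": "USA", "state": "California"},
    {"sample_id": "s2", "country": "USA", "state": None},
    {"sample_id": "s3", "country": "Canada", "state": ""},
]
result = add_state_to_country(rows)
```

Only the first row has a usable state, so only its country value changes; the other two rows keep the bare country name.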

Based on the guidance provided by NCBI (posted on #396), this seems in line with the desired behavior, but I'd like to see this confirmed by a successful submission. Tagging @emily-smith1 for this one. Emily, please confirm successful submissions prior to merging.

emily-smith1 commented 5 months ago

The workflow ran successfully in Terra on a set of 18 samples, but the source modifier file failed the upload to GenBank.


I was able to resolve the GenBank submission error by removing the "state" column. Can this step be added to the workflow?
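The requested step could look roughly like the sketch below, which drops the "state" column when the source modifier table is written out. This is an illustration only, not the workflow's code: the function name `write_source_modifiers`, the `sample_id` column, and the tab-delimited layout are all assumptions here.

```python
import csv
import io


def write_source_modifiers(rows, drop_columns=("state",)):
    """Serialize rows as tab-delimited text, omitting the columns in drop_columns.

    Hypothetical helper for illustration; the output layout is an assumption.
    """
    if not rows:
        return ""
    # Keep the original column order, minus the dropped columns.
    fieldnames = [name for name in rows[0] if name not in drop_columns]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter="\t",
        extrasaction="ignore",  # silently skip keys that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# The state has already been merged into country, so the column can be dropped.
merged_rows = [
    {"sample_id": "s1", "country": "USA:California", "state": "California"},
    {"sample_id": "s2", "country": "USA", "state": ""},
]
source_modifier_text = write_source_modifiers(merged_rows)
```

Dropping the column at write time leaves the state available earlier in the process for building the combined country value, while keeping it out of the file that is uploaded.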