cidgoh / nf-ncov-voc

A Nextflow-wrapped workflow for generating the mutation profiles of SARS-CoV-2 genomes (Variants of Concern and Variants of Interest). The workflow is developed in collaboration with COVID-MVP (https://github.com/cidgoh/COVID-MVP), which can be used to visualize the mutation profiles and functional annotations.
MIT License

Use spec format for functional annotation, add new pragmas to GVFs #156

Closed by miseminger 6 months ago

miseminger commented 6 months ago

Note: functionalannotation.py leaves the 'PMID' field empty. PMIDs are added later in a two-part process:

1) Run dois2pmcids.sh to get a CSV of the PMIDs and PMCIDs for each DOI (uses the online NCBI converter).
2) Run addpmids2functionalannotation.py to add the PMIDs from that CSV to the functional annotation TSV.

Adding the PMIDs isn't needed anywhere else in the pipeline; it only makes the functional annotation file complete, so the workflow will still run without doing 1) and 2).
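For anyone unfamiliar with the second step, here is a rough sketch of the kind of DOI-to-PMID join it performs. This is not the code in addpmids2functionalannotation.py: the column names ('DOI', 'PMID', 'PMCID', 'mutation'), the function names, and the sample rows are all assumptions made for illustration, and the real files may be laid out differently.

```python
import csv
import io


def load_doi_to_pmid(csv_text):
    """Build a DOI -> PMID lookup from converter output.

    Assumes (hypothetically) that the CSV has 'DOI' and 'PMID' columns.
    """
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        doi = (row.get("DOI") or "").strip().lower()
        pmid = (row.get("PMID") or "").strip()
        if doi and pmid:
            lookup[doi] = pmid
    return lookup


def fill_pmids(tsv_text, lookup):
    """Return the TSV text with empty 'PMID' cells filled where the DOI is known."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        doi = (row.get("DOI") or "").strip().lower()
        # Only fill rows that have no PMID yet and whose DOI was converted.
        if not row.get("PMID") and doi in lookup:
            row["PMID"] = lookup[doi]
        writer.writerow(row)
    return out.getvalue()


# Made-up sample data; the DOIs and IDs are placeholders, not real records.
converter_csv = "DOI,PMID,PMCID\n10.1000/example.1,12345678,PMC0000001\n"
annotation_tsv = (
    "mutation\tDOI\tPMID\n"
    "S:p.N501Y\t10.1000/example.1\t\n"
    "S:p.E484K\t10.1000/example.2\t\n"
)

filled = fill_pmids(annotation_tsv, load_doi_to_pmid(converter_csv))
print(filled)
```

Rows whose DOI is missing from the CSV keep an empty 'PMID' cell, which matches the point that the rest of the workflow does not depend on this field.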

In the future this could be streamlined into fewer scripts if needed, though I think once we have the PMIDs in Pokay, we could parse them from there instead of using the NCBI tool.