Closed j23414 closed 2 months ago
I was wondering why the CI was taking so long, then remembered that the example files get connected to "phylogenetic/data".
Fixed with: https://github.com/nextstrain/dengue/pull/47/commits/30b1d5a4860c1822b16f8d55b3e9e577455138e5. CI seems much faster now.
Description of proposed changes
To support gene phylogenetic trees (e.g. E gene trees), add rules to automatically generate gene reference GenBank and FASTA files (e.g. `reference_denv4_E.gb` and `reference_denv4_E.fasta`) by following the rules used in RSV.

This is part of a larger and older issue of creating E gene builds and is being split out into smaller PRs to maintain QC and scope of review. This PR will not generate an E gene phylogenetic tree; subsequent PRs will build on it to generate the trees.
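As a rough illustration of what generating a gene reference involves, the sketch below slices a gene out of a genome sequence using 1-based inclusive coordinates and re-bases the CDS so that it starts at position 1 in the new record. It is not the code in this PR or in the RSV workflow; `GeneReference`, `extract_gene`, `to_fasta`, the toy sequence, and the coordinates are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class GeneReference:
    """Hypothetical container for a gene-only reference record."""
    name: str
    sequence: str
    cds_start: int  # 1-based, inclusive, relative to the gene record
    cds_end: int    # 1-based, inclusive, relative to the gene record


def extract_gene(genome: str, start: int, end: int, name: str) -> GeneReference:
    """Slice a gene out of a genome using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(genome)):
        raise ValueError(f"coordinates {start}..{end} fall outside the genome")
    gene_seq = genome[start - 1:end]
    # After slicing, the CDS spans the whole record, so it is re-based to 1..len.
    return GeneReference(name=name, sequence=gene_seq, cds_start=1, cds_end=len(gene_seq))


def to_fasta(record: GeneReference, width: int = 60) -> str:
    """Format the record as FASTA text with fixed-width sequence lines."""
    lines = [f">{record.name}"]
    lines += [record.sequence[i:i + width] for i in range(0, len(record.sequence), width)]
    return "\n".join(lines) + "\n"


# Toy genome (assumption): 4 flanking bases, a 15-base "gene", 4 flanking bases.
toy_genome = "ccccATGAAATTTGGGTAAcccc"
gene = extract_gene(toy_genome, start=5, end=19, name="toy_E")
print(to_fasta(gene), end="")
```

Re-basing the coordinates is why a gene-only record lists its CDS as `1..N`, where `N` is the gene length, rather than at its original genome position.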
Visual summary (view whole pipeline plan so far)
Related issue(s)
Checklist
Example shortened reference_denv2_E.gb
```
LOCUS       DENV2/THAILAND/REFERENCE/1964 1485 bp    DNA              UNK 01-JAN-1980
DEFINITION  Dengue virus 2, complete genome.
ACCESSION   NC_001474
VERSION     NC_001474.2
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             1..1485
                     /gene="E"
                     /db_xref="VBRC:35921"
                     /product="envelope protein E"
                     /protein_id="NP_739583.2"
     source          1..1485
                     /collection_date="1964"
                     /country="Thailand"
                     /db_xref="taxon:11060"
                     /mol_type="genomic RNA"
                     /organism="Dengue virus 2"
                     /strain="16681"
ORIGIN
        1 atgcgttgca taggaatgtc aaatagagac tttgtggaag gggtttcagg aggaagctgg
       61 gttgacatag tcttagaaca tggaagctgt gtgacgacga tggcaaaaaa caaaccaaca
      121 ttggattttg aactgataaa aacagaagcc aaacagcctg ccaccctaag gaagtactgt
...
     1381 gtcattatca catggatagg aatgaattca cgcagcacct cactgtctgt gacactagta
     1441 ttggtgggaa ttgtgacact gtatttggga gtcatggtgc aggcc
//
```
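One way to sanity-check a generated gene reference is to confirm that the sequence in the `ORIGIN` block has the length declared on the `LOCUS` line. The following is a minimal standard-library sketch, not part of this PR; `origin_sequence`, `declared_length`, and the toy record are hypothetical. It will not pass on the shortened example above, because that example elides most of the sequence.

```python
import re
from typing import Optional


def origin_sequence(genbank_text: str) -> str:
    """Return the sequence between the ORIGIN line and the // terminator."""
    in_origin = False
    chunks = []
    for line in genbank_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            break
        if in_origin:
            # Drop position numbers and spaces, keep only the base letters.
            chunks.append(re.sub(r"[^a-zA-Z]", "", line))
    return "".join(chunks)


def declared_length(genbank_text: str) -> Optional[int]:
    """Return the length in bp stated on the LOCUS line, if present."""
    match = re.search(r"^LOCUS\s+\S+\s+(\d+)\s+bp", genbank_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Toy record (assumption): a 15-base gene, not real dengue data.
toy_record = """LOCUS       TOY_E                     15 bp    DNA              UNK 01-JAN-1980
FEATURES             Location/Qualifiers
     CDS             1..15
                     /gene="E"
ORIGIN
        1 atgaaatttg ggtaa
//
"""

sequence = origin_sequence(toy_record)
print(len(sequence) == declared_length(toy_record))
```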