Oropouche is a segmented virus (L, M, and S segments). To accommodate these segments and allow for downstream phylogenetic analysis, the ingest pipeline was customized to split up the metadata by segment, and to also produce a metadata and sequences file with all the sequences under `results/all`.
This was adapted from the work done by @j23414 in nextstrain/lassa#12.
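To illustrate the idea of splitting metadata by segment, here is a minimal Python sketch. It is not the pipeline's actual implementation: the `segment` and `accession` column names, the `split_metadata_by_segment` helper, and the example accessions are all hypothetical, chosen only for illustration.

```python
import csv
import io
from collections import defaultdict

SEGMENTS = ("L", "M", "S")


def split_metadata_by_segment(metadata_tsv, segment_column="segment"):
    """Group metadata rows by segment; every row is also kept under 'all'.

    The column name is an assumption for this sketch, not the pipeline's schema.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        # The combined set keeps every record, with or without a segment.
        groups["all"].append(row)
        segment = row.get(segment_column, "")
        if segment in SEGMENTS:
            groups[segment].append(row)
    return dict(groups)


# Made-up records; EXAMPLE_4 has no segment and so only lands in "all".
example = (
    "accession\tsegment\n"
    "EXAMPLE_1\tL\n"
    "EXAMPLE_2\tM\n"
    "EXAMPLE_3\tS\n"
    "EXAMPLE_4\t\n"
)
groups = split_metadata_by_segment(example)
for name, rows in groups.items():
    print(name, [row["accession"] for row in rows])
```

Keeping a combined `all` group alongside the per-segment groups mirrors the layout described above, where records without a segment are still retained in the full set.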
I quickly compared the segment assignments made by Nextclade with the existing annotations on NCBI, and they seem quite concordant: all the genomes annotated as L or M in NCBI were also assigned L or M, respectively, by Nextclade.
Two genomes are annotated as S but were not assigned as S by Nextclade. A quick look shows that they're both from Culex mosquitoes and pretty short, so the sequencing quality might not be great to begin with. I can look into that a bit more in the future.
About 13% of the genomes didn't have a segment annotation, and Nextclade was able to assign a segment to all except 7. Their information is below; they're just really short segments, so it makes sense that Nextclade would struggle.
oropouche_no_nextclade_segment_assignment.csv
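For anyone wanting to repeat the comparison, a rough sketch of the tally is below. It assumes each record has already been reduced to an (accession, NCBI segment, Nextclade segment) tuple with an empty string for a missing value; the `compare_segments` helper and the toy records are hypothetical and are not the data behind the numbers above.

```python
from collections import Counter


def compare_segments(records):
    """Tally agreement between NCBI annotations and Nextclade assignments.

    records: iterable of (accession, ncbi_segment, nextclade_segment),
    where an empty string means the value is missing.
    """
    counts = Counter()
    unassigned = []
    for accession, ncbi, nextclade in records:
        if not nextclade:
            # Nextclade could not assign a segment to this record.
            unassigned.append(accession)
        if not ncbi:
            counts["no_ncbi_annotation"] += 1
        elif ncbi == nextclade:
            counts["concordant"] += 1
        else:
            counts["discordant"] += 1
    return counts, unassigned


# Toy records for illustration only.
records = [
    ("EXAMPLE_1", "L", "L"),
    ("EXAMPLE_2", "M", "M"),
    ("EXAMPLE_3", "S", ""),   # annotated in NCBI, not assigned by Nextclade
    ("EXAMPLE_4", "", "S"),   # no NCBI annotation, assigned by Nextclade
    ("EXAMPLE_5", "", ""),    # neither source gives a segment
]
counts, unassigned = compare_segments(records)
print(dict(counts))
print(unassigned)
```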
It all runs perfectly thanks to @j23414's work on the lassa side.