Open corneliusroemer opened 2 years ago
I think this issue arose as part of this Slack conversation. @corneliusroemer, am I correct in this?
(1 year later...)
The annotations schema now requires 'nuc' to be present (d6246ca052478446f7179e230e842a34f93e4cd4); however, neither augur ancestral nor augur translate validate their outputs. Reading any node-data file (via NodeDataReader) with an "annotations" block will also validate against the schema, although in this case that's still going to be first encountered in augur export v2.
Conceptually we could have the annotations from ancestral define 'nuc' and translate define the CDSs, and they'll be merged in augur export; however, I think it's sensible to require translate to add a 'nuc' block, which is why I made it a required property. If augur export sees multiple annotations.nuc entries it should really ensure they are the same length! (The JSON merging happens within NodeDataReader.)
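To illustrate the consistency check suggested above, here is a minimal sketch of merging several "annotations" blocks while refusing conflicting 'nuc' entries. It is not how NodeDataReader is implemented; the function name merge_annotations and the assumption that each entry carries integer "start" and "end" keys are mine, for illustration only.

```python
def merge_annotations(annotation_blocks):
    """Merge several 'annotations' dicts into one.

    Hypothetical helper, not part of augur. Assumes each feature is a dict
    with integer 'start' and 'end' keys.
    """
    merged = {}
    for block in annotation_blocks:
        for name, feature in block.items():
            if name == "nuc" and "nuc" in merged:
                seen = (merged["nuc"]["start"], merged["nuc"]["end"])
                new = (feature["start"], feature["end"])
                # Two inputs describing the same genome must agree on its extent.
                if seen != new:
                    raise ValueError(
                        f"Conflicting 'nuc' annotations: {seen} vs {new}"
                    )
            merged[name] = feature
    return merged
```

With this sketch, two blocks that both define 'nuc' as 1..603 merge cleanly, while a block claiming 1..600 alongside one claiming 1..603 raises a ValueError instead of silently overwriting.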
Just a note, I ran into this issue working on my PRRSV dataset (https://github.com/mazeller/NextClade_Datasets/tree/main/prrsv_yimim_v3). I needed to append the following line to my GFF manually.
DQ478308.1 Genbank source 1 603 . + . locus_tag=nuc
however I think it's sensible to require translate to add a 'nuc' block, which is why I made it a required property
As of 1d17699e960d3805a0a586d7ccf3e9a550d53ac9 (in master, but not yet released) augur translate will always export this. (I missed this issue when scanning, it's very similar to #953.)
Just a note, I ran into this issue working on my PRRSV dataset (https://github.com/mazeller/NextClade_Datasets/tree/main/prrsv_yimim_v3). I needed to append the following line to my GFF manually.
P.S. recent augur PRs (merged but not released) will fix this: we'll now read the nuc coords from the sequence-region pragma in your GFF ("##sequence-region DQ478308.1 1 603").
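As a rough idea of what reading the coordinates from that pragma involves, here is a small sketch. It is not the code from the augur PRs; the function name nuc_from_sequence_region and the shape of the returned dict are assumptions made for this example.

```python
def nuc_from_sequence_region(gff_lines):
    """Return nuc coordinates from the first '##sequence-region' pragma.

    Hypothetical helper, not augur's implementation. The pragma has the form
    '##sequence-region <seqid> <start> <end>'. Returns None if no well-formed
    pragma is found.
    """
    for line in gff_lines:
        if not line.startswith("##sequence-region"):
            continue
        fields = line.split()
        if len(fields) != 4:
            # Malformed pragma: skip it rather than guess at the coordinates.
            continue
        _, seqid, start, end = fields
        return {"seqid": seqid, "start": int(start), "end": int(end), "strand": "+"}
    return None


gff = [
    "##gff-version 3",
    "##sequence-region DQ478308.1 1 603",
]
nuc = nuc_from_sequence_region(gff)
```

A GFF without the pragma returns None here, which is the case where a manual workaround like the extra source line would still be needed.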
I've encountered a bug that took me very long to figure out. Augur export reported the following error:
Now it turns out that export requires nuc annotations, and these usually come in through aa_mut.json from augur translate.
I was reading in annotations from a .gff into translate, something that's theoretically supported. However, it's actually not possible to read in a nuc annotation in the current implementation.
It would have very much sped up debugging if augur translate had warned me (or even errored) when it realised that it was lacking nuc annotations.
I'd propose an error if nuc is not output into aa_mut.json:
Related to #881
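For anyone hitting the same problem before a fix lands, a check along the lines proposed above could look like the sketch below. This is not augur code; require_nuc_annotation is a hypothetical helper, and it assumes the node-data JSON keeps its annotations under a top-level "annotations" key.

```python
import json


def require_nuc_annotation(node_data):
    """Raise if a node-data dict has no 'nuc' entry under 'annotations'.

    Hypothetical helper, not part of augur.
    """
    annotations = node_data.get("annotations", {})
    if "nuc" not in annotations:
        raise ValueError(
            "node data has no 'nuc' annotation; augur export requires one"
        )
    return annotations["nuc"]


# Example with an in-memory stand-in for the contents of aa_mut.json.
example = json.loads(
    '{"annotations": {"nuc": {"start": 1, "end": 603, "strand": "+"}}}'
)
nuc = require_nuc_annotation(example)
```

Running this against the translate output before calling export would surface the missing annotation at the step that caused it.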