Open mmokrejs opened 9 months ago
I'm not sure I understand this comment. Again it would be helpful if you could provide attachments so that I could replicate the error.
In general, gofasta assumes that the alignment is in the same space as the reference sequence which is in the same space as the annotation.
If the alignment isn't in the same space as the annotation, the user is free to change the annotation (or the alignment) themselves. I don't want this to happen internally to gofasta because it's an unnecessary complication.
Let me know if I haven't understood your issue correctly.
Regarding the testcases, see https://github.com/virus-evolution/gofasta/issues/44#issuecomment-1936261317
Basically I don't have the reference CDS starting at position 1 but elsewhere. The data is from amplicon sequencing with a product spanning just a portion of the CDS region. So receiving K91 in the output is hard to interpret and basically I need to correct the position by a fixed offset. Nucleotide positions need to be adjusted by 978
, aa positions by 978 / 3.0 = 326
.
gofasta
can adjust the output values just before printing on STDOUT
. Otherwise one needs to parse the output and addjust by a different value nuc positions and by a different value aa positions.
If one decides to prepend N
s into the reference sequence to cover the region leftwards from the amplified region, gofasta
dies with Error: Translation error: Untranslatable codon: NNN
. At least if that was just a Warning
....
Hi, I wonder why you did not need to adjust the numbering of reported changes. I just want to add an arbitrary number to the nuc positions before the output is created.
To get around it, I padded the reference sequence with leading and trailing
N
s to get the nuc and aa positions reflecting their natural location in mRNA. The corresponding.gff
file also had to be tweaked to define the CDS region since position 1.Then
gofasta
failed with:So I had to rewrite
N
s witha
s to get around.Now my nuc and aa positions reported by
./gofasta variants
are as expected,K417
instead ofK91
, etc.Basically, instead of prepending 978
a
s I would have used--offset 978
. Should be simple to implement.