Open Koeng101 opened 1 year ago
What does it even mean tho
It looks like the string <155222 was passed to the integer parser, which fails because "<" is not a digit. Looks like an off-by-one error when extracting the integer string.
I know what the code means, but it is pretty unclear what it biologically means. All 3 of those are referring to the same gene/mRNA/CDS... but each one uses a different location string - and it looks like the gene at least is lossy.
<155222..>155765 doesn't make sense because it doesn't say where the gene actually starts (as join(<155222,155311..>155765) does, which basically says there is an intron from 155222 to 155311, and then from 155311 to 155765 there is a gene). The better way to write that would be join(155222,155311..155765), but semantically I think they mean the same thing.
Status update on this? Does it still need fixing?
I don't think it has been fixed. It does need fixing.
I think the difficult part here is parsing out the join properly. Without keeping a map of locus_tags, I'm not sure you can parse <155222..>155765 properly at all. It doesn't contain all the information necessary to get the sequence out. We could also just accept that it is fucked up, and not try to fix it at all. I kinda like that solution. Here is what SnapGene displays:
I personally think this is a fine solution so long as we note it somewhere. We should probably have a note somewhere in the file of all the location exception cases we find.
Should be fixed in #394 @Koeng101?
Probably not. I think the time to fix this would be after the merge of ioToBio.
This issue has had no activity in the past 2 months. Marking as stale.
This will be fixed once #437 is merged as a part of #434.
To clarify, the < and > syntax indicates that the sequence is unbounded, i.e. <155222..>155765 indicates the sequence starts before base 155222 and ends after base 155765.
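To make that reading concrete, here is a rough sketch of parsing a simple start..end location while keeping the markers as flags. The `Span` type and `parseSpan` function are made up for illustration and are not the library's API; the sketch only handles a single range, not join(...) or complement(...).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Span is a hypothetical type for illustration only.
type Span struct {
	Start, End   int
	StartPartial bool // "<" was present: the feature starts before Start
	EndPartial   bool // ">" was present: the feature ends after End
}

// parseSpan is a hypothetical parser for a single "start..end" location
// such as "<155222..>155765".
func parseSpan(s string) (Span, error) {
	startStr, endStr, found := strings.Cut(s, "..")
	if !found {
		return Span{}, fmt.Errorf("expected start..end, got %q", s)
	}

	var span Span
	var err error

	// Record and strip the "<" marker before converting to an integer.
	if strings.HasPrefix(startStr, "<") {
		span.StartPartial = true
		startStr = strings.TrimPrefix(startStr, "<")
	}
	if span.Start, err = strconv.Atoi(startStr); err != nil {
		return Span{}, fmt.Errorf("bad start position: %w", err)
	}

	// Same for the ">" marker on the end position.
	if strings.HasPrefix(endStr, ">") {
		span.EndPartial = true
		endStr = strings.TrimPrefix(endStr, ">")
	}
	if span.End, err = strconv.Atoi(endStr); err != nil {
		return Span{}, fmt.Errorf("bad end position: %w", err)
	}
	return span, nil
}

func main() {
	span, err := parseSpan("<155222..>155765")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", span)
}
```

With the flags kept on the parsed value, the coordinates can still be used to slice the sequence, and downstream code can decide what to do with a feature whose true ends are not in the record.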
The <155222 is confusing parseLocation.
2023/05/29 10:24:00 Failed to parse ix.gb with err: strconv.Atoi: parsing "<155222": invalid syntax
https://www.ncbi.nlm.nih.gov/nuccore/NC_001141.2