Version 1.26 of the GFF3 spec mentions that circular features can be encoded in a GFF3 file by setting the end coordinate of such a feature to a position greater than the rightmost position in a contig.
We currently don't support this sort of feature in our code, and will raise an error if we see one. FWIW, Prodigal's gene predictions on the SheepGut dataset don't include any features like this (although this might be a result of us using the -c option).
Anyway, handling this sort of case is definitely feasible, but will require a bit of extra work. So I'm putting this issue on the backburner for now, in favor of more important issues; I can address this if there is desire for it.
Things to do to implement support for circular features
[ ] Replace set(feature_range) with just a set of all positions in these features (probably makes sense to create two ranges -- positions from feature start to contig end, and from contig start to the wrapped-around feature end, i.e. end minus contig length -- and merge these into a single set)
[ ] Check for the weird case where the end loops around the contig more than once, and raise an error in this case. Given a 1-indexed start coordinate s in the range [1, n] (for a contig of length n), the only valid "circular" end coordinates should be in the range [n + 1, n + s - 1]. (Anything past that, and positions would start being represented more than once in this feature.)
[ ] Test
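
A rough sketch of what the first two checklist items could look like, assuming 1-indexed inclusive coordinates as in GFF3. The function name `feature_positions` and its parameters are hypothetical and not taken from the existing code; this is just one way to do it.

```python
def feature_positions(start, end, contig_length):
    """Return the set of 1-indexed contig positions covered by a feature.

    Hypothetical helper (names are assumptions, not existing code).
    Coordinates are 1-indexed and inclusive. An end coordinate greater
    than contig_length marks a circular feature that wraps around.
    """
    n = contig_length
    if not 1 <= start <= n:
        raise ValueError(f"Start {start} is not in the range [1, {n}].")
    if end < start:
        raise ValueError(f"End {end} is less than start {start}.")

    # Ordinary (non-circular) feature: a single range suffices.
    if end <= n:
        return set(range(start, end + 1))

    # Circular feature: the only valid ends are in [n + 1, n + start - 1].
    # Anything larger would cover at least one position more than once.
    if end > n + start - 1:
        raise ValueError(
            f"End {end} loops around the contig more than once "
            f"(max valid end for start {start} is {n + start - 1})."
        )

    # Merge two ranges: start -> contig end, and contig start -> wrapped end.
    wrapped_end = end - n
    return set(range(start, n + 1)) | set(range(1, wrapped_end + 1))
```

For example, on a contig of length 10, a feature with start 8 and end 12 covers positions 8, 9, 10, 1, and 2; an end of 18 would be rejected, since position 8 would then be covered twice.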