Closed asetGem closed 6 years ago
Commit ce40896 implements:
* --topo-file given
* --circ, --linear and --topo-file <filename> options
* --circ and --linear are given together

but 8b38da1 (install integron package) is needed for this feature.
I have a question about the following feature.
The topology will be set by the --circ or --linear option or by --topology-file, but in the code I read:
# If sequence is too small, it can be problematic when using circularity
if len(SEQUENCE) > 4 * DISTANCE_THRESHOLD:
    circular = not args.linear
else:
    circular = False
So the topology set by the user can be overridden. In the results, which value must appear: the topology set by the user or the topology effectively used?
Well, I'm not sure. Ideally we shouldn't have such a condition and should let the algo behave normally. We should have a condition to stop once the entire sequence has been parsed or once an attC site is found for the second time; otherwise the expansion will never stop.
I think we can keep it like this for now, and in the column use the value used by the algo. The true topology is in the topology file. We should name this column "Topology_considered" or something similar to stress the fact that it's not the topology of the sequence but the topology used by IF.
I think it's really an edge case, but we should mention it in the doc if it's not already the case. If it turns out that many people have very small plasmids with an integron over the edge, we might implement a proper solution later, or maybe use a lower value than 4 * DISTANCE_THRESHOLD.
TL;DR: Use the value of the parameter in the output file and not the value of the topology file, and rename the column to stress the fact that it's the value of the parameter, not the actual state of the sequence.
Thanks!
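The rule discussed above (report the topology the algorithm actually used, not the one requested) could be sketched as follows. This is only an illustration: the function name `topology_considered`, the string values, and the threshold value of 4000 are assumptions made for the example, not taken from the code base.

```python
# Assumed value, for illustration only; the real constant lives in the code base.
DISTANCE_THRESHOLD = 4000


def topology_considered(seq_len, requested, dist_threshold=DISTANCE_THRESHOLD):
    """Return the topology effectively used for a replicon of length seq_len.

    `requested` is the topology asked for by the user ("circular" or "linear").
    Mirrors the condition quoted in the thread: circularity is only honoured
    when the sequence is longer than 4 * dist_threshold.
    """
    if requested not in ("circular", "linear"):
        raise ValueError(f"unknown topology: {requested!r}")
    if requested == "circular" and seq_len <= 4 * dist_threshold:
        # Sequence too small: the algorithm falls back to linear.
        return "linear"
    return requested
```

The returned value is what would go in the "Topology_considered" column, while the value requested by the user stays in the topology file or on the command line.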
Behaviour expected:
By default:
Available options:
* --circ: used to set all replicons to circular (useful when a multi-fasta is provided; when a single fasta is provided, it does not change anything from the default).
* --linear: used to set all replicons to linear (useful with a single fasta input; with a multi-fasta, it is already the default behaviour).
* --topo-file <filename>: the user can provide a file specifying the topology of the replicons that do not follow the default or the --circ/--linear option. 2 columns: replicon ID and circular or linear.
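A minimal sketch of reading such a two-column topology file is given below. The function name `parse_topology_file`, the whitespace separator, and the handling of blank lines and `#` comments are assumptions made for the example; the issue only specifies the two columns.

```python
def parse_topology_file(lines):
    """Parse topology lines into a {replicon_id: topology} dict.

    `lines` is any iterable of strings, e.g. an open file handle.
    Each non-empty line holds a replicon ID and 'circular' or 'linear'.
    """
    topologies = {}
    for line_no, line in enumerate(lines, 1):
        line = line.strip()
        # Skipping blank lines and '#' comments is an assumption of this sketch.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        replicon_id, topology = fields
        topology = topology.lower()
        if topology not in ("circular", "linear"):
            raise ValueError(f"line {line_no}: unknown topology {topology!r}")
        topologies[replicon_id] = topology
    return topologies
```

With a real file, this would be called as `parse_topology_file(open(path))`, ideally inside a `with` block.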
To sum up, for each replicon, its topology is:
* --circ/--linear option given, follow this option

Add replicon topology information in the final tab file.
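One possible reading of the resolution order can be sketched as below. The precedence (topology file first, then --circ/--linear, then the default) and the default itself (circular for a single sequence, linear for a multi-fasta) are inferred from the option descriptions above and are assumptions of this sketch; `resolve_topology` is a hypothetical name.

```python
def resolve_topology(replicon_id, n_replicons, circ=False, linear=False,
                     topo_from_file=None):
    """Return 'circular' or 'linear' for one replicon.

    topo_from_file: optional {replicon_id: topology} dict read from the
    topology file. n_replicons: number of sequences in the input fasta.
    """
    if circ and linear:
        raise ValueError("--circ and --linear cannot be given together")
    # 1. An entry in the topology file wins (assumed precedence).
    if topo_from_file and replicon_id in topo_from_file:
        return topo_from_file[replicon_id]
    # 2. Otherwise follow the global option, if any.
    if circ:
        return "circular"
    if linear:
        return "linear"
    # 3. Assumed default: single sequence is circular, multi-fasta is linear.
    return "circular" if n_replicons == 1 else "linear"
```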
Steps
* --topo-file given
* --circ, --linear and --topo-file <filename> options
* --circ and --linear are given together
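The option handling in these steps could be declared with the standard `argparse` module roughly as follows. This is a sketch, not the project's actual parser: the program name and help strings are made up, and it assumes that giving --circ and --linear together should be rejected.

```python
import argparse


def build_parser():
    """Build a parser exposing the three topology options from the issue."""
    parser = argparse.ArgumentParser(prog="topology_options_sketch")
    # --circ and --linear exclude each other; argparse exits with an
    # error message if both are given on the command line.
    topo = parser.add_mutually_exclusive_group()
    topo.add_argument("--circ", action="store_true",
                      help="set all replicons to circular")
    topo.add_argument("--linear", action="store_true",
                      help="set all replicons to linear")
    # The topology file can be combined with either option.
    parser.add_argument("--topo-file", metavar="filename", default=None,
                        help="two-column file: replicon ID, circular or linear")
    return parser
```

For example, `build_parser().parse_args(["--circ", "--topo-file", "topo.txt"])` yields a namespace with `circ=True`, `linear=False` and `topo_file="topo.txt"`.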