BjornFJohansson / pydna

Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.

Should numbers greater than len raise errors in __getitem__? #162

Open manulera opened 7 months ago

manulera commented 7 months ago

I see that it does not raise an error on Seq, but what should it mean on a circular molecule?

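from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from pydna.dseq import Dseq
from pydna.dseqrecord import Dseqrecord

# Biopython objects clamp the slice to the sequence length, like str
print(Seq('AAAA')[0:8])
# AAAA

print(SeqRecord(Seq('AAAA'))[0:8].seq)
# AAAA

# Linear pydna objects behave the same way
print(Dseq('AAAA', circular=False)[0:8])
# AAAA

# Circular pydna objects return nothing for the same slice
print(Dseq('AAAA', circular=True)[0:8])
# empty sequence

print(Dseqrecord('AAAA', circular=False)[0:8].seq)
# AAAA

print(Dseqrecord('AAAA', circular=True)[0:8].seq)
# empty sequence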
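
For reference, the linear results match how built-in Python sequences handle an out-of-range stop: the slice is silently clamped to the length instead of raising. A minimal sketch using a plain str rather than any pydna or Biopython class:

```python
# Plain Python string standing in for a linear sequence.
seq = "AAAA"

# slice.indices(length) reports the clamped (start, stop, step)
# that Python would actually use for a sequence of that length.
start, stop, step = slice(0, 8).indices(len(seq))
print(start, stop, step)

# The stop of 8 is clamped to len(seq), so no IndexError is raised.
clamped = seq[start:stop]
print(clamped)
```

This only illustrates the built-in clamping rule; it says nothing about what a circular molecule should do with the same slice.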

BjornFJohansson commented 7 months ago

I would expect this to either return the whole sequence or go around the circle as many times as needed. Probably the former, as I can't think of a reason for the latter right now. I agree that the empty sequence is wrong.

manulera commented 7 months ago

I think going around would be good, but the slice should not be allowed to cover more than one full circle. For example, in a circular sequence:

seq = Dseq('ACGTA', circular=True)

seq[3:6] # I would expect TAA
seq[3:8] # I would expect TAACG
seq[3:10] # I would expect TAACG or an error

I think this is closer to what one would expect from a circular string.
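
A rough sketch of the behaviour I have in mind, written against a plain str so it does not depend on pydna internals. The function name circular_slice and the strict flag are hypothetical, not part of pydna, and it assumes a non-negative start and a step of 1:

```python
def circular_slice(seq: str, start: int, stop: int, strict: bool = False) -> str:
    """Slice `seq` as if it were circular, covering at most one full turn.

    Hypothetical helper for illustration only; not pydna's implementation.
    """
    n = len(seq)
    if n == 0 or stop <= start:
        return ""
    length = stop - start
    if length > n:
        if strict:
            # Option 1: refuse slices longer than one full circle.
            raise IndexError("slice spans more than one full circle")
        # Option 2: clamp to exactly one full circle.
        length = n
    first = start % n
    # Doubling the string lets an ordinary slice wrap past the origin.
    return (seq + seq)[first:first + length]


print(circular_slice("ACGTA", 3, 6))   # wraps once past the origin
print(circular_slice("ACGTA", 3, 8))   # exactly one full circle
print(circular_slice("ACGTA", 3, 10))  # longer than the circle, clamped
```

With strict=True the last call would raise instead of clamping, which corresponds to the "or an error" alternative above.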