BjornFJohansson / pydna

Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.

Duplicated assembly products if fragments contain overlapping matches #200

Closed: manulera closed this issue 8 months ago

manulera commented 9 months ago

A bit of an edge case, but when two overlapping matches exist between fragments, we may get duplicated assemblies, or edges that represent the same join. Below is a minimal example.

from pydna.common_sub_strings import common_sub_strings

for match in common_sub_strings('AATTAATTAATT', 'AATTAATTAAGG', 8):
    print(match)

# prints
# (0, 0, 10)
# (4, 0, 8)

I don't think there is much to do about it, but it's good to keep in mind that this can happen.
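
To spot this situation before assembling, one option is to flag pairs of matches whose spans overlap in both sequences. This is only a rough sketch, not part of pydna: the helper names `spans_overlap` and `overlapping_match_pairs` are hypothetical, and the tuple layout (start in first sequence, start in second sequence, length) is inferred from the output printed above.

```python
from itertools import combinations


def spans_overlap(start1, len1, start2, len2):
    """Return True if half-open spans [start, start + len) share a position."""
    return start1 < start2 + len2 and start2 < start1 + len1


def overlapping_match_pairs(matches):
    """Return pairs of matches that overlap in BOTH sequences.

    Each match is assumed to be (start_in_first, start_in_second, length),
    as inferred from the printed output in the example above.
    """
    pairs = []
    for m1, m2 in combinations(matches, 2):
        in_first = spans_overlap(m1[0], m1[2], m2[0], m2[2])
        in_second = spans_overlap(m1[1], m1[2], m2[1], m2[2])
        if in_first and in_second:
            pairs.append((m1, m2))
    return pairs


# The two matches reported for 'AATTAATTAATT' vs 'AATTAATTAAGG' with limit 8
matches = [(0, 0, 10), (4, 0, 8)]
print(overlapping_match_pairs(matches))
```

Here the two matches share positions 4 to 9 of the first sequence and positions 0 to 7 of the second, so they are reported as one overlapping pair.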

BjornFJohansson commented 8 months ago

I think these should be returned as two distinct results. This is typically important in the planning stage, and overlapping matches are sometimes due to a planning error or oversight.

AATTAATTAATT
0123456789
AATTAATTAAGG

AATTAATTAATT
    01234567
    AATTAATTAAGG
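
The two alignments above can be reproduced from the match tuples with a small helper. This is a sketch only: `render_match` is a hypothetical name, not a pydna function, and it assumes each match is (start in first sequence, start in second sequence, length).

```python
def render_match(first, second, match):
    """Draw two sequences aligned on a match, with a ruler under the matched span.

    `match` is assumed to be (start_in_first, start_in_second, length).
    """
    start_first, start_second, length = match
    offset = start_first - start_second
    # Shift whichever sequence starts later so the matched spans line up.
    pad_first = " " * max(-offset, 0)
    pad_second = " " * max(offset, 0)
    # Column where the matched span begins in the drawing.
    column = len(pad_first) + start_first
    ruler = " " * column + "".join(str(i % 10) for i in range(length))
    return "\n".join([pad_first + first, ruler, pad_second + second])


for match in [(0, 0, 10), (4, 0, 8)]:
    print(render_match("AATTAATTAATT", "AATTAATTAAGG", match))
    print()
```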

manulera commented 8 months ago

Hi @BjornFJohansson, I am not sure what the point was here. Looking back at it, I don't really understand what I was trying to say.