manulera / ShareYourCloning_backend

The backend application for ShareYourCloning
MIT License
1 stars 6 forks source link

Assemblies #121

Open manulera opened 4 months ago

manulera commented 4 months ago

Sometimes there is unexpected behaviour.

I think this can be caused by the extension of the overlap (instead of the minimal overlap, we extend to encompass as many as possible). Illustrated in the example below from the frontend (pseudocode). However, not sure if the 1[50:90]:2[0:40]', '2[43:83]:1[50:90]' should ever be returned, if 2 is not circular.

const homology1 = 'ATGCAAACAGTAATGATGGATGACATTCAAAGCACTGATT';
const homology2 = 'GTTTGCATCATTACTACCTACTGTAAGTTTCGTGACTAAA';

manuallyTypeSequence(`aaaaaa${homology1}tttt${homology1}tttttttt`);
cy.get('svg[data-testid="AddCircleIcon"]', { timeout: 20000 }).last().click();
manuallyTypeSequence(`${homology1}aaa${homology1}`);

This returns 3 assemblies:

('1[6:46]:2[0:40]', '2[40:83]:1[3:46]')
('1[6:46]:2[0:40]', '2[43:83]:1[50:90]')
('1[50:90]:2[0:40]', '2[43:83]:1[50:90]')

But it should return 6, because there could be a single

manulera commented 4 months ago

Similar problem here

hom1 = 'ACCGGTTACCGGTT'
hom2 = 'CGCGTTACGCGTTA'
template = Dseqrecord(f'aaa{hom1}AATTGGAA{hom2}tttt')
insert = Dseqrecord(f'{hom1}AAA{hom2}')

asm = Assembly((template, insert), limit=14, use_all_fragments=True)

for a in asm.get_insertion_assemblies():
    print(assembly2str(a))

The problem here is that homology regions on the left and right overlap (see the x):

temp      aaaACCGGTTACCGGTTAATTGGAACGCGTTACGCGTTAtttt
           |||||||||||||||||x
insr         ACCGGTTACCGGTTAAACGCGTTACGCGTTA
                            x|||||||||||||||
temp aaaACCGGTTACCGGTTAATTGGAACGCGTTACGCGTTAtttt

This confuses format_insertion_assembly, which ends up returning wrong one as valid, and skipping right

>> ('1[3:19]:2[0:16]', '2[15:31]:1[23:39]')
Discarded but should be kept
>> ('1[23:39]:2[15:31]', '2[0:16]:1[3:19]')
Kept but should be discarded, turned into '2[0:16]:1[3:19]', '1[23:39]:2[15:31]')