manulera / ShareYourCloning

A web application to generate molecular cloning strategies in JSON format, and share them with others.
MIT License

What to do when homology arms don't match genome sequence? #23

Open manulera opened 2 months ago

manulera commented 2 months ago

This happens in the MYTK MoClo kit, where the 500 bp homology arms do not exactly match the genome sequence and contain a few mismatches. What should we do in a case like this? Include Ns?

manulera commented 2 months ago

Causes something like this: https://github.com/manulera/ShareYourCloning_frontend/issues/241

dgruano commented 2 months ago

To me, it would make sense to add the Ns (or any of the other IUPAC symbols). We could validate them later with sequencing results, like we would do for this issue.

Additionally, we could:

  1. Add a feature named "unknown recombination site"
  2. Label the construct as "unfinished" or "missing information" so the users pay attention
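As a rough illustration of the IUPAC idea, here is a minimal Python sketch that merges a homology arm and the matching genome stretch into one sequence, writing an ambiguity code wherever they disagree. The function name `merge_with_iupac` is hypothetical and not part of ShareYourCloning, and the sketch assumes both sequences have the same length (substitutions only, no gaps).

```python
# Hypothetical helper, not part of ShareYourCloning.
# Maps each set of observed bases to its IUPAC nucleotide code.
IUPAC_CODES = {
    frozenset("A"): "A",
    frozenset("C"): "C",
    frozenset("G"): "G",
    frozenset("T"): "T",
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def merge_with_iupac(arm: str, genome: str) -> str:
    """Merge two equal-length sequences, using IUPAC codes at mismatches."""
    if len(arm) != len(genome):
        raise ValueError("sequences must have the same length (no gaps)")
    merged = []
    for a, g in zip(arm.upper(), genome.upper()):
        # Fall back to N for anything outside plain A/C/G/T pairs.
        merged.append(IUPAC_CODES.get(frozenset((a, g)), "N"))
    return "".join(merged)


print(merge_with_iupac("ACGT", "GCTT"))
```

Using the two-base codes keeps more information than a plain N, since sequencing only has to distinguish between the two candidate bases.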

manulera commented 2 months ago

Yes, I agree that it makes sense to use those symbols. The issue is that sometimes there will be gaps, and that is not really supported by the letter notation (it would have to be something like N or -).
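To make the gap problem concrete, here is a small Python sketch of the "N or -" option. It assumes the two sequences have already been aligned by some pairwise aligner, with `-` marking gaps. The function name `mask_alignment` is hypothetical, and whether downstream tools or file formats accept `-` in a sequence is not addressed here.

```python
# Hypothetical helper, not part of ShareYourCloning.
def mask_alignment(aligned_arm: str, aligned_genome: str, gap: str = "-") -> str:
    """Collapse a pairwise alignment into one masked sequence.

    Matching positions keep their base, substitutions become N,
    and positions where either sequence has a gap become the gap symbol.
    """
    if len(aligned_arm) != len(aligned_genome):
        raise ValueError("aligned sequences must have the same length")
    masked = []
    for a, g in zip(aligned_arm.upper(), aligned_genome.upper()):
        if a == gap or g == gap:
            masked.append(gap)  # insertion or deletion: length is uncertain
        elif a == g:
            masked.append(a)    # both sequences agree
        else:
            masked.append("N")  # substitution: base is uncertain
    return "".join(masked)


print(mask_alignment("ACG-TA", "ATGCTA"))
```

The masked string no longer has a well-defined length at gap positions, which is the part the letter notation does not really cover.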

Also, we would have to decide what to do with the annotation present in the plasmid or locus and how to transmit it.

I would say that this is something to think about but not a priority to fix, since at least for the time being we can prioritise the sequence of the insert.