BjornFJohansson / pydna

Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.

Simplify assembly_primer API #26

Closed BjornFJohansson closed 7 years ago

BjornFJohansson commented 8 years ago

The assembly_primers interface is overly complex. Should be cleaned up to be more intuitive.

BjornFJohansson commented 7 years ago

assembly_primers returns a list of primer-pair tuples: [(p1,p2),(p3,p4),(p5,p6)]. This is not very practical.

Suggestion:

If cloning_primers returned an Amplicon object, then [cloning_primers(frag) for frag in fragments], where fragments is a list of Dseqrecord objects, would give a list of Amplicons.

The assembly_primers function should take that list of Amplicon objects and return another list of Amplicon objects in which the primers have tails that facilitate recombination.

This way, the primers could be processed in each step and the whole process would be more transparent and intuitive.

BjornFJohansson commented 7 years ago

New Pythonic primer design functions

The new functions primer_design and assembly_fragments are available here:

from pydna.primer_design import primer_design
from pydna.primer_design import assembly_fragments

These are meant to replace the old functions cloning_primers, assembly_primers and integration_primers:

from pydna.primer_design import cloning_primers
from pydna.primer_design import assembly_primers
from pydna.primer_design import integration_primers

The interfaces for the old functions were overly complex.

The new primer_design function can be imported like this:

from pydna.primer_design import primer_design

and takes a minimum of one and a maximum of eight arguments:

def primer_design(template,
                  fp=None,
                  rp=None,
                  target_tm=55.0,
                  fprimerc=1000.0,  # nM
                  rprimerc=1000.0,  # nM
                  saltc=50.0,
                  formula=_tmbresluc):

The template is the only required argument and is meant to be a Dseqrecord or a subclass. The function returns an Amplicon object.

The Amplicon object will have two Primer class objects available as the forward_primer and reverse_primer properties. The Primer class is a subclass of the Biopython SeqRecord class. The primers are designed so that they will amplify the

The primer_design function is a simplification of the cloning_primers function and does away with most of the settings that regulate the maximum and minimum primer length etc. The target_tm argument, together with the formula argument, determines the primer lengths. I have found it easier and more readable to post-process the primers if they need to conform to, for example, a maximum length.
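To illustrate how a target melting temperature can determine primer length, and what post-processing to a maximum length might look like, here is a minimal sketch. It does not use pydna: the Wallace rule is swapped in for the Tm formula, and wallace_tm, grow_primer, min_len and max_len are hypothetical names chosen for this example only.

```python
def wallace_tm(seq):
    """Rough Tm estimate in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    This is a stand-in, NOT the formula pydna uses by default.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def grow_primer(template, target_tm=55.0, min_len=12):
    """Extend a primer from the 5' end of template until the estimated
    Tm reaches target_tm (hypothetical helper, not pydna API)."""
    for length in range(min_len, len(template) + 1):
        primer = template[:length]
        if wallace_tm(primer) >= target_tm:
            return primer
    return template  # template too short to reach the target Tm


template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTT"

# One primer per strand, each just long enough to reach the target Tm.
fwd = grow_primer(template, target_tm=55.0)
rev = grow_primer(reverse_complement(template), target_tm=55.0)

# Post-processing step: cap the length afterwards by trimming the 3' end.
max_len = 18
fwd_capped, rev_capped = (p[:max_len] for p in (fwd, rev))
```

Capping the length afterwards lowers the estimated Tm of any primer that gets trimmed, which is the trade-off of keeping length limits out of the design function itself.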

The assembly_fragments function takes a list of Amplicons possibly with other interspersed Dseqrecord objects (or subclasses thereof).

The assembly_fragments function returns another list of Amplicons and Dseqrecords. The primer objects will have added tails that allow the fragments to be assembled using the Assembly class.

The assembly_fragments function takes one to three arguments:

def assembly_fragments(f, overlap=35, maxlink=40):

f is the list of Amplicons/Dseqrecords. overlap is an integer that sets the number of bp of overlap between sequences.

Small sequences such as restriction sites can be placed between sequences and will be incorporated in the primers if they are at most maxlink bp long.
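The roles of overlap and maxlink can be sketched with plain strings. This is a simplified illustration, not the pydna implementation: forward_tail is a hypothetical helper, and it puts the whole overlap on a single primer, which may not be how assembly_fragments distributes the tails.

```python
def forward_tail(previous_seq, linker="", overlap=35, maxlink=40):
    """Build a 5' tail for the forward primer of the next fragment.

    The tail is the last `overlap` bases of the previous fragment,
    followed by an optional short linker (e.g. a restriction site).
    Hypothetical helper for illustration only.
    """
    if len(linker) > maxlink:
        raise ValueError("linker is longer than maxlink and cannot go in a primer")
    return previous_seq[-overlap:] + linker


previous = "GATTACAGATTACAGATTACAGATTACAGATTACAGATTACAGATTACA"
annealing = "ATGGCTAGCAAAGGAGAAG"   # part of the primer that anneals to the next fragment
ecori = "GAATTC"                    # short linker sequence

# Tailed primer: 5' overlap with the previous fragment, linker, annealing part.
tailed_primer = forward_tail(previous, linker=ecori, overlap=35) + annealing
```

With these inputs the tailed primer is 35 + 6 + 19 = 60 nt long, which shows why long linkers are better supplied as separate fragments than as primer tails.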

At least every second fragment has to be an Amplicon object since primer tails will only be put on primers belonging to Amplicons.

For example:

[amplicon1, amplicon2, amplicon3, ... ampliconN] 

[amplicon1, dseqrecord2, amplicon3, dseqrecord4, ...]
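The rule that at least every second fragment must be an Amplicon can be checked before calling assembly_fragments. The sketch below uses minimal stand-in classes instead of the real pydna ones so that it runs on its own; every_second_is_amplicon is a hypothetical helper, not part of pydna.

```python
class Dseqrecord:
    """Stand-in for pydna's Dseqrecord (illustration only)."""
    def __init__(self, seq):
        self.seq = seq


class Amplicon(Dseqrecord):
    """Stand-in for pydna's Amplicon (illustration only)."""


def every_second_is_amplicon(fragments):
    """Return True if no two neighbouring fragments are both non-Amplicons,
    so that every junction has at least one primer that can carry a tail."""
    return all(
        isinstance(a, Amplicon) or isinstance(b, Amplicon)
        for a, b in zip(fragments, fragments[1:])
    )


ok = [Amplicon("ATGC"), Dseqrecord("GGTT"), Amplicon("CCAA"), Dseqrecord("TTGG")]
bad = [Amplicon("ATGC"), Dseqrecord("GGTT"), Dseqrecord("CCAA"), Amplicon("TTGG")]

print(every_second_is_amplicon(ok))   # True
print(every_second_is_amplicon(bad))  # False
```

The check only looks at neighbouring pairs in a linear list; a circular assembly would also need the last and first fragments compared.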