BjornFJohansson / pydna

Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.

Fix problem with feature locations when 3' sticky ends are involved #60

Closed mountainpenguin closed 4 years ago

mountainpenguin commented 5 years ago

When adding fragments in which the 'other' sequence begins with a 3' sticky end (a 3' overhang at its left end), the features in 'other' are shifted in the wrong direction, because the ovhg is positive in that case, not negative.

This pull request fixes this issue in dseqrecord.__add__ and in dseqrecord.looped.
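The sign problem can be illustrated without pydna. What follows is a minimal sketch, not the code in this pull request: `shift_for_join` is a hypothetical helper, and it assumes the convention that `ovhg` is negative for a 5' overhang and positive for a 3' overhang at the left end of a fragment, and that a fragment's length includes its single-stranded ends.

```python
def shift_for_join(len_self: int, other_ovhg: int, fixed: bool = True) -> int:
    """Offset to add to the feature positions of `other` in `self + other`.

    len_self   -- length of the left fragment, sticky end included (assumption)
    other_ovhg -- stagger at the left end of `other` (assumed convention:
                  negative = 5' overhang, positive = 3' overhang)
    fixed      -- False reproduces the behaviour described as wrong above
    """
    if fixed:
        # The two sticky ends overlap by abs(ovhg) bases whatever their type,
        # so the overlap is always subtracted.
        return len_self - abs(other_ovhg)
    # Adding ovhg unchanged is only correct when it is negative (5' overhang).
    return len_self + other_ovhg


# 5' overhang (ovhg = -4): both variants agree.
print(shift_for_join(32, -4), shift_for_join(32, -4, fixed=False))
# 3' overhang (ovhg = +4): the variants differ by twice the overhang length.
print(shift_for_join(16, 4), shift_for_join(16, 4, fixed=False))
```

Under these assumptions the two variants only disagree for a positive `ovhg`, which is why 5' overhangs never exposed the problem.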

If you would like, I can put together an example assembly workflow that demonstrates the problem and shows how this change fixes it.

EDIT: It seems I managed to include a typo in my commit message (and -> are). Sorry about this!

codecov[bot] commented 5 years ago

Codecov Report

Merging #60 into master will decrease coverage by 0.29%. The diff coverage is 50%.


@@            Coverage Diff            @@
##           master      #60     +/-   ##
=========================================
- Coverage   97.84%   97.54%   -0.3%     
=========================================
  Files          28       28             
  Lines        3204     3223     +19     
  Branches      487      495      +8     
=========================================
+ Hits         3135     3144      +9     
- Misses         44       51      +7     
- Partials       25       28      +3

Impacted Files        Coverage Δ
pydna/dseqrecord.py   95.42% <50%> (-2.08%)


BjornFJohansson commented 5 years ago

Thanks! I'll have a look at this ASAP, almost certainly this week.

Example code would be great.

mountainpenguin commented 5 years ago

@BjornFJohansson

I believe this script should demonstrate the issue this fixes.

It takes a toy DNA sequence: AAAAAAAAAAACTGCAGCCCCCCCCCCGAATTCGGGGGGGGGG containing a PstI site and an EcoRI site.

The sequence of As has a feature called "part1", the Cs part2, and the Gs part3. PstI leaves a 3' overhang and EcoRI leaves a 5' overhang.

If you cut with EcoRI and then add the two fragments back together, the features keep the same locations as at the start. If you cut with PstI instead, features part2 and part3 are shifted 8 base pairs downstream, which is twice the length of the 4-nt PstI overhang.

Output before my change:

pydna version: 3.0.2a3
Last git commit: 4f4e64e5f68e40ef21de0b2bbd7b555eb2b63240
Features prior to cutting:
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 18  | 27  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 34  | 43  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+
Features after reassembling with PstI
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 26  | 35  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 42  | 51  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+
Features after reassembling with EcoRI
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 18  | 27  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 34  | 43  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+

And after my change:


pydna version: 3.0.2a3
Last git commit: f01964e06c537d828dd140334c62f72c362ebe76
Features prior to cutting:
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 18  | 27  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 34  | 43  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+
Features after reassembling with PstI
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 18  | 27  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 34  | 43  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+
Features after reassembling with EcoRI
+-----+---------------+-----+-----+-----+-----+------+------+
| Ft# | Label or Note | Dir | Sta | End | Len | type | orf? |
+-----+---------------+-----+-----+-----+-----+------+------+
|   0 | L:p a r t 1   | --> | 0   | 11  |  11 | misc |  no  |
|   1 | L:p a r t 2   | --> | 18  | 27  |   9 | misc |  no  |
|   2 | L:p a r t 3   | --> | 34  | 43  |   9 | misc |  no  |
+-----+---------------+-----+-----+-----+-----+------+------+
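
The coordinates in the tables above can be cross-checked with plain Python. This is a rough sketch that does not use pydna and is not the demonstration script itself: `cut_and_rejoin` is a hypothetical helper, the cut offsets are the standard ones for EcoRI (G^AATTC) and PstI (CTGCA^G), and the `ovhg` sign convention (negative for a 5' overhang, positive for a 3' overhang) is an assumption.

```python
SEQ = "AAAAAAAAAAACTGCAGCCCCCCCCCCGAATTCGGGGGGGGGG"
FEATURES = {"part1": (0, 11), "part2": (18, 27), "part3": (34, 43)}


def cut_and_rejoin(site, watson_cut, crick_cut, fixed):
    """Cut SEQ once at `site`, join the two fragments, return feature spans.

    watson_cut / crick_cut are the cut offsets inside the site, both in
    top-strand coordinates (EcoRI: 1 and 5, PstI: 5 and 1).
    """
    start = SEQ.index(site)
    w, c = start + watson_cut, start + crick_cut
    left_len = max(w, c)     # left fragment, single-stranded end included
    right_start = min(w, c)  # right fragment begins at its single-stranded end
    ovhg = w - c             # assumed sign: < 0 is a 5' overhang, > 0 a 3' one

    # Offset applied to the right fragment's features when joining.
    shift = left_len - abs(ovhg) if fixed else left_len + ovhg

    joined = {}
    for name, (s, e) in FEATURES.items():
        if s >= right_start:
            # Feature lies on the right fragment: make it local, then shift.
            s, e = s - right_start + shift, e - right_start + shift
        joined[name] = (s, e)
    return joined


for label, args in {"PstI": ("CTGCAG", 5, 1), "EcoRI": ("GAATTC", 1, 5)}.items():
    print(label, "old:", cut_and_rejoin(*args, fixed=False))
    print(label, "new:", cut_and_rejoin(*args, fixed=True))
```

With the old sign handling, only the PstI run moves part2 and part3 (to 26..35 and 42..51). With the overlap always subtracted, every run returns the original coordinates.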