Edinburgh-Genome-Foundry / DnaChisel

:pencil2: A versatile DNA sequence optimizer
https://edinburgh-genome-foundry.github.io/DnaChisel/
MIT License
213 stars 38 forks source link

AvoidPattern with biopython FeatureLocation raises AttributeError (in some specific parameters) #51

Closed ghost closed 3 years ago

ghost commented 3 years ago

Adding a simple DNA pattern TGAA as an AvoidPattern constraint causes an AttributeError, the stack trace is listed below

"/usr/local/lib/python3.8/site-packages/dnachisel/DnaOptimizationProblem/mixins/ConstraintsSolverMixin.py", line 357, in resolve_constraints
    self.resolve_constraint(constraint=constraint)
  File "/usr/local/lib/python3.8/site-packages/dnachisel/DnaOptimizationProblem/mixins/ConstraintsSolverMixin.py", line 251, in resolve_constraint
    evaluation = this_local_constraint.evaluate(self)
AttributeError: 'NoneType' object has no attribute 'evaluate'
veghp commented 3 years ago

Please provide a reproducible example. This modified readme example works for me:

from dnachisel import *

# DEFINE THE OPTIMIZATION PROBLEM
#~ seq="AAAAAAA"
seq="ATGAAA"
seq

problem = DnaOptimizationProblem(
    sequence=seq,
    constraints=[
        AvoidPattern("TGAA"),
    ],
)

print(problem.constraints_text_summary())
# ===> FAILURE: 1 constraints evaluations failed
# FAIL ┍ AvoidPattern[0-6](pattern:TGAA)
#      │ Failed. Pattern found at positions [1-5(+)]

# SOLVE THE CONSTRAINTS, OPTIMIZE WITH RESPECT TO THE OBJECTIVE
problem.resolve_constraints()

# PRINT SUMMARIES TO CHECK THAT CONSTRAINTS PASS
print(problem.constraints_text_summary())
# ===> SUCCESS - all constraints evaluations pass
# ✔PASS ┍ AvoidPattern[0-6](pattern:TGAA)
#      │ Passed. Pattern not found !

With Python 3.6 (on Ubuntu).

ghost commented 3 years ago

Sorry for late reply

from dnachisel import *
from Bio.SeqFeature import FeatureLocation

seq="TTAATGATGAAAGTTACCAG"

problem = DnaOptimizationProblem(
    sequence=seq,
    constraints=[
        AvoidPattern("TGAA", location=FeatureLocation(start=1,end=4)),
    ],
)

print(problem.constraints_text_summary())
===> FAILURE: 1 constraints evaluations failed
 FAIL ┍ AvoidPattern[1-4](pattern:TGAA)
      │ Failed. Pattern found at positions [7-11(+)]

problem.resolve_constraints()
/usr/local/lib/python3.8/site-packages/dnachisel/DnaOptimizationProblem/mixins/ConstraintsSolverMixin.py in resolve_constraints(self, final_check, cst_filter)
    355         ):
    356             try:
--> 357                 self.resolve_constraint(constraint=constraint)
    358             except NoSolutionError as error:
    359                 self.logger(constraint__index=len(constraints))

/usr/local/lib/python3.8/site-packages/dnachisel/DnaOptimizationProblem/mixins/ConstraintsSolverMixin.py in resolve_constraint(self, constraint)
    249                         new_location, problem=self
    250                     )
--> 251                 evaluation = this_local_constraint.evaluate(self)
    252
    253                 # MAYBE THE LOCAL BREACH WAS ALREADY RESOLVED AS A SIDE EFFECT

AttributeError: 'NoneType' object has no attribute 'evaluate'
Zulko commented 3 years ago

Haven't tried this but I believe there is an issue here: AvoidPattern("TGAA", location=FeatureLocation(start=1,end=4)) which should actually use DnaChisel's Location and not biopython's FeatureLocation: AvoidPattern("TGAA", location=Location(1, 4)). Not sure if that solves the problem?

ghost commented 3 years ago

thanks, I used DnaChisel's Location, it worked! Though, I don't know why it worked because in the AvoidPattern.__init__, it seems it accepts FeatureLocation where it parses it self.location = Location.from_data(location) and convert it into DnaChisel's Location

veghp commented 3 years ago

Glad it works! Regarding Location: The data structure is similar to a Biopython's FeatureLocation, but with different methods for different purposes.

Zulko commented 3 years ago

My bad, you are right @FadiBakoura, the FeatureLocation should have been transformed into a valid Location and this is a very bad bug (which could also affect DnaOptimizationProblems imported from a Biopython record!). I reopened and renamed the issue.

This snippet fixes the script originally posted by @FadiBakoura:

feature_location = FeatureLocation(start=1,end=4)
if feature_location.strand is None:
     feature_location.strand = 0 # <-----

@veghp it seems that Biopython accepts 4 values for the strand: +1, -1, 0 and None (which is the default for strand). However, strand=None causes a bug in dnachisel. So strand=None has to be transformed into strand=0 at some point, for instance in from_biopython_location.

veghp commented 3 years ago

Yes this is related to issue https://github.com/Edinburgh-Genome-Foundry/DnaChisel/issues/52

Zulko commented 3 years ago

Sorry I clicked too soon. Just updated my comment above :arrow_up: . This is a real bug I believe.

veghp commented 3 years ago

Thanks for pointing out the problem; I'll have to look into it to keep Biopython support for older version (as did with DNAalphabet). Alternatively we can drop support (?)

Zulko commented 3 years ago

I have to admit I have no clue what Biopython version supports what values :( . I think replacing None strands by 0 should be pretty safe (someone will scream if it isn't :smile: )

veghp commented 3 years ago

Thanks for the solution and see discussion in https://github.com/Edinburgh-Genome-Foundry/DnaChisel/issues/52