Edinburgh-Genome-Foundry / DnaChisel

:pencil2: A versatile DNA sequence optimizer
https://edinburgh-genome-foundry.github.io/DnaChisel/
MIT License

I couldn't find the ObjectiveEvaluation and Objective classes mentioned in examples/non_unique_kmers_minimization.py #8

Closed harijay closed 5 years ago

harijay commented 5 years ago

I was trying to understand the example included in ".//examples/non_unique_kmers_minimization.py", which mentions two classes, "ObjectiveEvaluation" and "Objective". I couldn't find these anywhere in the source. Can someone point me in the right direction?

Alternatively, I am looking to write a DNA optimization objective that will design protein-coding repeats in an expression plasmid without using the same codons for each repeat.

Any help in this will be greatly appreciated.

Zulko commented 5 years ago

Hi there,

Thanks for reporting the broken example. I have fixed it ("Objective" was renamed "Specification" a few months ago).

I am not sure exactly what your end goal is. Do you want to study different codon adaptations in your repeats, or do you simply want to minimize repeats in your sequence to avoid recombination, etc.? In the second case, if you want to minimize k-mer repeats, you can use the AvoidNonuniqueSegments specification:

from dnachisel import (DnaOptimizationProblem, AvoidNonuniqueSegments,
                       EnforceTranslation)
from dnachisel.biotools import reverse_translate, random_protein_sequence

# We create a random protein made of a 50-amino-acid domain repeated 3 times
protein_segment = random_protein_sequence(length=50)
repeated_protein = protein_segment + protein_segment + protein_segment
repeated_protein_gene = reverse_translate(repeated_protein)

# The problem: keep the amino-acid sequence, but make sure the sequence
# uses different codons so that each 12-mer in the sequence is unique
problem = DnaOptimizationProblem(
    sequence=repeated_protein_gene,
    constraints=[EnforceTranslation(),
                 AvoidNonuniqueSegments(min_length=12)],
)

problem.resolve_constraints()

If the resolution fails in your case, you can also pass AvoidNonuniqueSegments as an objective:

problem = DnaOptimizationProblem(
    sequence=repeated_protein_gene,
    constraints=[EnforceTranslation()],
    objectives=[AvoidNonuniqueSegments(min_length=12)]
)

# Objectives are optimized rather than strictly enforced
problem.optimize()

Let me know if I got it right and if that works for you.
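
To check whether a resolved or optimized sequence still contains repeated k-mers, a small standalone helper can count them. This is only a sketch: `non_unique_kmers` is a hypothetical name, not part of DnaChisel, and it looks at the forward strand only, which may be less strict than what AvoidNonuniqueSegments evaluates.

```python
from collections import Counter


def non_unique_kmers(sequence, k=12):
    """Return the k-mers occurring more than once on the forward strand.

    Hypothetical helper for illustration; it does not use DnaChisel and
    ignores the reverse complement.
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


# Example: "ATGC" appears twice among the 4-mers of this short sequence
repeats = non_unique_kmers("ATGCATGC", k=4)
print(repeats)
```

An empty dictionary for `k=12` would suggest that the sequence (for example `problem.sequence` after optimization) has no repeated 12-mers on the forward strand.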

axtambe commented 5 years ago

Hi Zulko, thanks for this tip. When I attempt to run the AvoidNonuniqueSegments method described above I get a KeyError, possibly from EnforceTranslation.py. The full error message is below. Do you know what might be causing this?

Thanks

`KeyError Traceback (most recent call last)

in
      4 sequence=sequence,
      5 constraints=[EnforceTranslation()],
----> 6 objectives=[AvoidNonuniqueSegments(15)]
      7 )

/usr/local/lib/python3.7/site-packages/dnachisel/DnaOptimizationProblem.py in __init__(self, sequence, constraints, objectives, logger, mutation_space)
    142 self.logger = logger
    143 self.mutation_space = mutation_space
--> 144 self.initialize()
    145
    146 def initialize(self):

/usr/local/lib/python3.7/site-packages/dnachisel/DnaOptimizationProblem.py in initialize(self)
    160 self._objectives_before = None
    161 if self.mutation_space is None:
--> 162 self.mutation_space = MutationSpace.from_optimization_problem(self)
    163 self.sequence = self.mutation_space.constrain_sequence(
    164 self.sequence)

/usr/local/lib/python3.7/site-packages/dnachisel/MutationSpace.py in from_optimization_problem(problem, new_constraints)
    317 if isinstance(choice, MutationChoice)
    318 else MutationChoice(segment=choice[0], variants=set(choice[1]))
--> 319 for cst in constraints
    320 for choice in cst.restrict_nucleotides(sequence)
    321 ], key=lambda choice: (choice.end - choice.start, choice.start))

/usr/local/lib/python3.7/site-packages/dnachisel/MutationSpace.py in (.0)
    318 else MutationChoice(segment=choice[0], variants=set(choice[1]))
    319 for cst in constraints
--> 320 for choice in cst.restrict_nucleotides(sequence)
    321 ], key=lambda choice: (choice.end - choice.start, choice.start))
    322 # print (new_constraints, choices_index)

/usr/local/lib/python3.7/site-packages/dnachisel/builtin_specifications/EnforceTranslation.py in restrict_nucleotides(self, sequence, location)
    138 self.translation[int((i - start) / 3)]
    139 ]))
--> 140 for i in range(start, end, 3)
    141 ]
    142 else:

/usr/local/lib/python3.7/site-packages/dnachisel/builtin_specifications/EnforceTranslation.py in (.0)
    138 self.translation[int((i - start) / 3)]
    139 ]))
--> 140 for i in range(start, end, 3)
    141 ]
    142 else:

KeyError: 'X'`

Zulko commented 5 years ago

This is a weird one. My best guess is that the sequence you provide to DnaOptimizationProblem contains an X. Are you providing a degenerate DNA sequence?

axtambe commented 5 years ago

Yup, cleaning up the input string seemed to do the trick. Thanks for the prompt reply.
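
For readers hitting the same KeyError, the cleanup can be sketched in plain Python before the sequence is handed to DnaOptimizationProblem. This is a minimal illustration under assumptions: `clean_dna` and `invalid_positions` are hypothetical helper names (not DnaChisel functions), and the sketch assumes the sequence should contain only A, T, G and C, so that no codon ends up translated to 'X'.

```python
VALID_NUCLEOTIDES = set("ATGC")


def clean_dna(raw):
    """Uppercase the input and drop whitespace and digits.

    Hypothetical helper: handy for sequences pasted from files that
    include line breaks or position numbers.
    """
    return "".join(
        c for c in raw.upper() if not c.isspace() and not c.isdigit()
    )


def invalid_positions(sequence):
    """Return (index, character) pairs for every non-ATGC character."""
    return [
        (i, c) for i, c in enumerate(sequence) if c not in VALID_NUCLEOTIDES
    ]


sequence = clean_dna("atg cat\n12 gnn")
problems = invalid_positions(sequence)
if problems:
    # Degenerate characters such as N must be fixed before optimization
    print("Non-ATGC characters found:", problems)
```

Reporting the offending positions, rather than silently deleting them, avoids shifting the reading frame that EnforceTranslation relies on.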

Zulko commented 5 years ago

No problem! Let me know if you managed to do what you wanted, so I can close this thread.

axtambe commented 5 years ago

The problem was resolved; you can close the thread.

Zulko commented 5 years ago

:+1: