BjornFJohansson / pydna

Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.

Improve typing of functions in 'crispr' module #215

Open dgruano opened 3 months ago

dgruano commented 3 months ago

I was playing around with the crispr module and came across a weird error where the cut coordinates returned by a cas9 object were far beyond the end of the target sequence.

from pydna.dseqrecord import Dseqrecord
from pydna.crispr import cas9

guide = Dseqrecord("GTTACTTTACCCGACGTCCC")
target = Dseqrecord("GTTACTTTACCCGACGTCCCaGG")

# Create an enzyme object with the guide RNA
enzyme = cas9(str(guide.seq))

# Search for a cutsite in the target sequence
print(enzyme.search(target))  # prints [148] (should be 18)
print(len(target))  # prints 23

The problem was that I was passing a Dseqrecord object and not a string. I am not yet very familiar with the rest of pydna, so do most functions require a string or a Dseq / Dseqrecord object? Should we check the input type within the functions, or add type hinting?

Let me know if I can help.

BjornFJohansson commented 3 months ago

Hi, and thanks for your interest in pydna. I have been busy with this year's round of grant proposals; normally I try to respond quicker.

The crispr module is currently a minimal working example. I think the way to go here is to specify something that intuitively describes a linear ssDNA molecule. In pydna, Dseq and Dseqrecord are used for dsDNA. I think better type hinting at the very least, and perhaps accepting pydna.seqrecord.SeqRecord, would make sense?
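
For illustration only, here is a rough sketch of what type hints plus an explicit input check could look like. It is not pydna's actual code: `RecordLike` and `as_sequence_string` are hypothetical names, and `RecordLike` only stands in for a record type such as pydna.seqrecord.SeqRecord so that the example runs without pydna installed. The idea is that an unsupported argument raises a `TypeError` right away instead of silently producing odd coordinates.

```python
from typing import Union


class RecordLike:
    """Hypothetical stand-in for a record type exposing a .seq attribute."""

    def __init__(self, seq: str) -> None:
        self.seq = seq


def as_sequence_string(target: Union[str, RecordLike]) -> str:
    """Return a plain sequence string, or raise TypeError for other input.

    Hypothetical helper: a search function could call this first so that
    every supported input is reduced to the same plain string.
    """
    if isinstance(target, str):
        return target
    if isinstance(target, RecordLike):
        # Convert the wrapped sequence explicitly rather than the record itself.
        return str(target.seq)
    raise TypeError(
        f"expected str or RecordLike, got {type(target).__name__}"
    )


# Both supported input types yield the same plain sequence string.
print(as_sequence_string("GTTACTTTACCCGACGTCCCaGG"))
print(as_sequence_string(RecordLike("GTTACTTTACCCGACGTCCCaGG")))
```

The type hints alone would let a static checker flag a wrong argument, while the runtime check also covers users who do not run one. Which types the crispr functions should really accept is still the open question in this thread.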