bretonics / CHOMP

Search guide-RNA sequences given a genome or gene sequence
MIT License

Rank off targets #2

Closed bretonics closed 7 years ago

bretonics commented 7 years ago

Get the match hits for each CRISPR and rank them by how many base pairs are matched. Ranking should consider not only how many times each target occurs throughout the genome, but also which target's hits have the fewest matching base pairs.

bretonics commented 7 years ago

Commit 4d3c31a14ae01417a0bde513de4e0e1d06962489 adds output with each occurrence in the format length : matches

Length = window length (CRISPR sequence size)
Matches = number of nucleotide matches in hit
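
To illustrate the length : matches format, here is a minimal Python sketch. It assumes an ungapped, position-by-position comparison between a CRISPR sequence and a hit of the same length, which may not be how the commit computes matches. The function names `count_identities` and `format_occurrence` and the example hit sequence are hypothetical.

```python
def count_identities(guide: str, hit: str) -> int:
    """Count positions where the guide and an equal-length hit carry the same nucleotide."""
    if len(guide) != len(hit):
        raise ValueError("guide and hit must have the same length")
    # Case-insensitive, ungapped comparison (an assumption of this sketch)
    return sum(a == b for a, b in zip(guide.upper(), hit.upper()))


def format_occurrence(guide: str, hit: str) -> str:
    """Render one occurrence as 'length : matches'."""
    return f"{len(guide)} : {count_identities(guide, hit)}"


# CRISPR_3 against a made-up hit that differs at the first and last positions
guide = "TGTGATCACGTACTATTATGCGG"
hit = "AGTGATCACGTACTATTATGCGC"
print(format_occurrence(guide, hit))  # 23 : 21
```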

bretonics commented 7 years ago

3591c4f95e010243eafd6e2b0b1351a8deb54239 and b7925f0df2a45ccf6e3e56d2c97e1e1ae7f0eda9 add support.

Still need to switch the ranking priority: sort primarily by identities, then by number of occurrences.

Name    Sequence    Strand  Reverse Occurrences Identities
CRISPR_3    TGTGATCACGTACTATTATGCGG plus    GGCGTATTATCATGCACTAGTGT 3   23,8,8
CRISPR_2    AAAAATTTTCTCTATCTAACGGG minus   GGGCAATCTATCTCTTTTAAAAA 4   23,15,8,8
CRISPR_1    AAAAAATTTTCTCTATCTAACGG minus   GGCAATCTATCTCTTTTAAAAAA 4   23,16,8,8
CRISPR_8    AAAAAAAATTTTCCCTATCGGGG minus   GGGGCTATCCCTTTTAAAAAAAA 2   23,9
CRISPR_9    AAAAAAATTTTCCCTATCGGGGG minus   GGGGGCTATCCCTTTTAAAAAAA 2   23,9
CRISPR_6    CGAAAAAAAATTTTCCCTATCGG minus   GGCTATCCCTTTTAAAAAAAAGC 2   23,9
CRISPR_7    GAAAAAAAATTTTCCCTATCGGG minus   GGGCTATCCCTTTTAAAAAAAAG 2   23,9
CRISPR_4    AAAAATCCCATCGATCTAGCAGG minus   GGACGATCTAGCTACCCTAAAAA 8   23,9,7,7,7,7,7,7
CRISPR_0    ATGTAGCTAGCTAGCTAGTAGGG plus    GGGATGATCGATCGATCGATGTA 5   23,14,12,10,10
CRISPR_5    TCCCATCGATCTAGCAGGCCCGG minus   GGCCCGGACGATCTAGCTACCCT 7   23,15,9,7,7,7,7

Fewer base pair matches in a match hit (identities) == better CRISPR, followed by fewer occurrences.
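
The ordering described above could be sketched in Python as follows. The thread does not say how several identities per CRISPR are reduced to one sort value. This sketch assumes that one perfect 23-bp match is the on-target hit and is ignored, and that the highest remaining identity (the closest off-target hit) is the primary key, with number of occurrences as the tie-breaker. The names `Guide` and `rank_key` are hypothetical and not taken from CHOMP.

```python
from typing import List, NamedTuple, Tuple


class Guide(NamedTuple):
    name: str
    identities: List[int]  # one value per hit, as in the Identities column


def rank_key(guide: Guide, window: int = 23) -> Tuple[int, int]:
    """Sort key: (closest off-target identity, number of occurrences)."""
    hits = sorted(guide.identities, reverse=True)
    # Assumption: a single full-length match is the intended target, so skip it
    if hits and hits[0] == window:
        hits = hits[1:]
    worst_off_target = hits[0] if hits else 0
    return (worst_off_target, len(guide.identities))


# Identities copied from the table above, in the same row order
guides = [
    Guide("CRISPR_3", [23, 8, 8]),
    Guide("CRISPR_2", [23, 15, 8, 8]),
    Guide("CRISPR_1", [23, 16, 8, 8]),
    Guide("CRISPR_8", [23, 9]),
    Guide("CRISPR_9", [23, 9]),
    Guide("CRISPR_6", [23, 9]),
    Guide("CRISPR_7", [23, 9]),
    Guide("CRISPR_4", [23, 9, 7, 7, 7, 7, 7, 7]),
    Guide("CRISPR_0", [23, 14, 12, 10, 10]),
    Guide("CRISPR_5", [23, 15, 9, 7, 7, 7, 7]),
]

# sorted() is stable, so guides with equal keys keep their input order
ranked = sorted(guides, key=rank_key)
for g in ranked:
    print(g.name, len(g.identities), ",".join(map(str, g.identities)))
```

Under these assumptions CRISPR_3 ranks first (closest off-target hit has 8 identities) and CRISPR_1 last (16 identities). A different reduction, such as summing the off-target identities, would give a different order.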

bretonics commented 7 years ago

e512c8f9993b849afe37ded7e8297f11cb45626c closes this issue.