pinellolab / CRISPResso2

Analysis of deep sequencing data for rapid and intuitive interpretation of genome editing experiments

Errors thrown when giving amplicon_seq in lower case. #396

Closed: francoiskroll closed this issue 3 months ago

francoiskroll commented 3 months ago

Hi – Could it be that issue #187 was not fully resolved?

I am using crispresso2 2.2.15.

If I run (note --amplicon_seq all in lowercase)

CRISPResso --fastq_r1 ./D11_S47_L001_R1_001.fastq --fastq_r2 ./D11_S47_L001_R2_001.fastq --amplicon_seq gtacagtctggtgtggctcataagccccattttgggttttatcctacagcccgtcatcggctcggcgagcgactactgtaggtcgtcataaggccgaaggagaccgtacatactcttactggggattctgatgttagtgggcatgactttatttctaaatggagatgcagtcacaacaggtgggtga --amplicon_name slc45a2TAA --prime_editing_pegRNA_spacer_seq gactactgtaggtcgtcata --prime_editing_pegRNA_extension_seq tctccttcggccccatgacgacctacagt --prime_editing_pegRNA_scaffold_seq gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtgggaccgagtcggtcc

It says:

ERROR: The given prime editing pegRNA spacer is not found in the reference sequence.

But the spacer is in the reference/amplicon sequence.

The analysis runs if I use (note --amplicon_seq all in uppercase):

CRISPResso --fastq_r1 ./D11_S47_L001_R1_001.fastq --fastq_r2 ./D11_S47_L001_R2_001.fastq --amplicon_seq GTACAGTCTGGTGTGGCTCATAAGCCCCATTTTGGGTTTTATCCTACAGCCCGTCATCGGCTCGGCGAGCGACTACTGTAGGTCGTCATAAGGCCGAAGGAGACCGTACATACTCTTACTGGGGATTCTGATGTTAGTGGGCATGACTTTATTTCTAAATGGAGATGCAGTCACAACAGGTGGGTGA --amplicon_name slc45a2TAA --prime_editing_pegRNA_spacer_seq gactactgtaggtcgtcata --prime_editing_pegRNA_extension_seq tctccttcggccccatgacgacctacagt --prime_editing_pegRNA_scaffold_seq gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtgggaccgagtcggtcc
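A possible caller-side workaround is to compare sequences case-insensitively before building the command. The sketch below is a hypothetical helper, not CRISPResso2 code, and it makes no claim about how CRISPResso2 performs its own lookup. The function names are invented for illustration, and the amplicon is only an excerpt of the one given above.

```python
# Hypothetical helper (not part of CRISPResso2): check whether a spacer occurs
# in an amplicon on either strand, ignoring letter case.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacer_in_amplicon(spacer: str, amplicon: str) -> bool:
    """Search for the spacer on either strand after uppercasing both inputs."""
    spacer = spacer.strip().upper()
    amplicon = amplicon.strip().upper()
    return spacer in amplicon or reverse_complement(spacer) in amplicon


# Excerpt of the amplicon from this issue, deliberately left in lowercase.
amplicon_excerpt = "ggctcggcgagcgactactgtaggtcgtcataaggccgaaggagaccg"
spacer = "gactactgtaggtcgtcata"

found_lower = spacer_in_amplicon(spacer, amplicon_excerpt)
found_upper = spacer_in_amplicon(spacer, amplicon_excerpt.upper())
print(found_lower, found_upper)
```

Because both inputs are uppercased first, the lowercase and uppercase amplicons give the same answer, which is the behaviour the two commands above would be expected to share.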

If you find the time, can I also suggest adding an example prime editing command to the README? For example, the README gives the suggested --prime_editing_pegRNA_scaffold_seq as RNA (U instead of T), which confused me somewhat. I thought I was meant to give all RNA sequences as actual RNA, but the command above clearly works fine with the sequences written as DNA.
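For anyone holding a pegRNA component written as RNA, a minimal sketch of converting it to DNA letters before passing it on the command line might look like this. The function `rna_to_dna` is a hypothetical name introduced here, and whether CRISPResso2 performs this conversion itself is not established in this thread.

```python
# Hypothetical helper (not part of CRISPResso2): rewrite an RNA sequence using
# DNA letters by uppercasing it and replacing U with T.

def rna_to_dna(seq: str) -> str:
    """Return the sequence in uppercase with every U replaced by T."""
    return seq.strip().upper().replace("U", "T")


# First ten bases of the scaffold used in this issue, written here as RNA.
scaffold_start_rna = "guuuuagagc"
scaffold_start_dna = rna_to_dna(scaffold_start_rna)
print(scaffold_start_dna)
```

A sequence that is already DNA passes through unchanged apart from uppercasing, so the helper can be applied to every pegRNA argument without checking its alphabet first.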