pinellolab / CRISPResso2

Analysis of deep sequencing data for rapid and intuitive interpretation of genome editing experiments

quantification window ERROR and invalid characters ERROR when using prime-edit #202

Open Masterchiefm opened 2 years ago

Masterchiefm commented 2 years ago

Describe the bug

I encountered two types of errors while using CRISPResso2: a quantification window error and an invalid characters error.

I have uploaded all the fastq.gz files here.

Error in the quantification window

The first time, I ran the command below and it returned this error:

# Command I used:
CRISPResso -r1 VIP-3-6_L1_1.fq.gz -r2 VIP-3-6_L1_2.fq.gz -a tggttttcgctccgaagGTAAAAGAAATCATTGAGTCCCCCGCCTTCAGAAGAGGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAGTCATTTCCTTCGTACCCACAGTGCTTCATGAGAGgtgagtacatgctggtcttgtaatatctacttttgctcagctttgcctgtaatgaaatggcagcttgtttcacctcggtgcagagatgcctcggtgcctgccagttccctg --prime_editing_pegRNA_spacer_seq GCATTTTCAGGAGGAAGCGA --prime_editing_pegRNA_extension_seq tgtctgaagccatcgcatcttcctcctgaaaat --prime_editing_nicking_guide_seq ATGAAGCACTGTGGGTACGA
#ERROR I got:
The quantification window has been partially exluded by the --exclude_bp_from_left or --exclude_bp_from_right parameters.

I then followed the error message and set --exclude_bp_from_left 1 --exclude_bp_from_right 1, but it didn't help. The error stayed the same.
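To make the message more concrete, here is a minimal sketch of what "partially excluded" could mean: some positions of the quantification window fall inside the bases trimmed from either end of the amplicon. The function name `excluded_positions`, the example window, and the default of 15 bp per side are assumptions made for illustration; this is not CRISPResso2's own code.

```python
def excluded_positions(window, ref_len, exclude_left=15, exclude_right=15):
    """Return the window positions that fall inside the excluded amplicon ends.

    `window` is an iterable of 0-based reference indexes. The defaults of 15
    are an assumption about the tool's defaults, not read from its source.
    """
    left_end = set(range(0, exclude_left))
    right_end = set(range(ref_len - exclude_right, ref_len))
    return sorted(set(window) & (left_end | right_end))


ref_len = 245  # reference length reported in the error output

# Hypothetical window sitting close to the left edge of the amplicon.
window = range(5, 25)
print(excluded_positions(window, ref_len))        # overlaps the left end
print(excluded_positions(window, ref_len, 1, 1))  # no overlap any more

# A window that touches index 0 still overlaps when 1 bp is excluded.
print(excluded_positions(range(0, 3), ref_len, 1, 1))
```

In this sketch, a window that reaches the very first base still overlaps with an exclusion of 1, and only an exclusion of 0 clears it. Whether that is what happens in this run is not confirmed.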

Trying to alter the quantification window

I wondered whether the default quantification window doesn't fit my prime editing pattern, so I tried changing it with the following parameters:

--quantification_window_size 1 (default)
--prime_editing_pegRNA_extension_quantification_window_size 20
--plot_window_size 25

If I understand the README file correctly, my setting should be like this:

image

So I ran this command, but I still got an error about the window.

# Command:
 CRISPResso -r1 VIP-3-6_L1_1.fq.gz -r2 VIP-3-6_L1_2.fq.gz -a tggttttcgctccgaagGTAAAAGAAATCATTGAGTCCCCCGCCTTCAGAAGAGGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAGTCATTTCCTTCGTACCCACAGTGCTTCATGAGAGgtgagtacatgctggtcttgtaatatctacttttgctcagctttgcctgtaatgaaatggcagcttgtttcacctcggtgcagagatgcctcggtgcctgccagttccctg --prime_editing_pegRNA_spacer_seq GCATTTTCAGGAGGAAGCGA --prime_editing_pegRNA_extension_seq tgtctgaagccatcgcatcttcctcctgaaaat --prime_editing_nicking_guide_seq ATGAAGCACTGTGGGTACGA  --exclude_bp_from_left 1 --exclude_bp_from_right 1 --quantification_window_size 1  --prime_editing_pegRNA_extension_quantification_window_size 20  --plot_window_size 25

# ERROR:
Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: -3 window: 25 reference: 245 

I tried many times and changed every parameter, but nothing worked. The error was always about either the plot window or the quantification window. I would like to know which parameter is wrong and how to fix it.
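One detail in the message above is that the reported cut point is -3, i.e. before the first base of the 245 bp reference. The sketch below is a simplified bounds check of the kind the message implies. The function name `plot_window_problem` and the exact inequalities are assumptions, and the real condition in CRISPRessoShared.py may differ.

```python
def plot_window_problem(cut_point, window, ref_len):
    """Return a description of the problem, or None if the window fits.

    Simplified sketch: the window is taken to span `window` bases on each
    side of a 0-based cut point. The tool's exact condition may differ.
    """
    if cut_point - window + 1 < 0:
        return "extends past the left end"
    if cut_point + window > ref_len - 1:
        return "extends past the right end"
    return None


# Values from the error message: cut point -3, reference length 245.
for size in (25, 10, 1):
    print(size, plot_window_problem(-3, size, 245))

# A hypothetical cut point well inside the amplicon fits a window of 25.
print(plot_window_problem(60, 25, 245))
```

Under these assumptions, no positive plot_window_size can fit around a negative cut point, so lowering --plot_window_size alone would not be expected to help. The thing to investigate would be which guide is being assigned a cut point of -3.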

Debug output command:

 CRISPResso -r1 VIP-3-6_L1_1.fq.gz -r2 VIP-3-6_L1_2.fq.gz -a tggttttcgctccgaagGTAAAAGAAATCATTGAGTCCCCCGCCTTCAGAAGAGGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAGTCATTTCCTTCGTACCCACAGTGCTTCATGAGAGgtgagtacatgctggtcttgtaatatctacttttgctcagctttgcctgtaatgaaatggcagcttgtttcacctcggtgcagagatgcctcggtgcctgccagttccctg --prime_editing_pegRNA_spacer_seq GCATTTTCAGGAGGAAGCGA --prime_editing_pegRNA_extension_seq tgtctgaagccatcgcatcttcctcctgaaaat --prime_editing_nicking_guide_seq ATGAAGCACTGTGGGTACGA  --exclude_bp_from_left 1 --exclude_bp_from_right 1 --quantification_window_size 1  --prime_editing_pegRNA_extension_quantification_window_size 20  --plot_window_size  25 --debug

output:


                               ~~~CRISPResso 2~~~                               
        -Analysis of genome editing outcomes from deep sequencing data-         

                                        _                                       
                                       '  )                                     
                                       .-'                                      
                                      (____                                     
                                   C)|     \                                    
                                     \     /                                    
                                      \___/                                     

                           [CRISPResso version 2.2.6]                           
[Note that starting in version 2.1.0 insertion quantification has been changed
to only include insertions completely contained by the quantification window.
To use the legacy quantification method (i.e. include insertions directly adjacent
to the quantification window) please use the parameter --use_legacy_insertion_quantification]
                 [For support contact kclement@mgh.harvard.edu]                 

WARNING @ Thu, 24 Feb 2022 21:20:26:
         Folder CRISPResso_on_VIP-3-6_L1_1_VIP-3-6_L1_2 already exists. 

Traceback (most recent call last):
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 1384, in main
    this_exclude_idxs) = CRISPRessoShared.get_amplicon_info_for_guides(this_seq, this_guides, this_guide_mismatches, this_guide_names, this_guide_qw_centers,
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoShared.py", line 1063, in get_amplicon_info_for_guides
    raise BadParameterException('Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: ' + str(cut_p) + ' window: ' + str(window_around_cut) + ' reference: ' + str(ref_seq_length))
CRISPResso2.CRISPRessoShared.BadParameterException: Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: -3 window: 25 reference: 245
CRITICAL @ Thu, 24 Feb 2022 21:20:26:
         Traceback (most recent call last):
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 1384, in main
    this_exclude_idxs) = CRISPRessoShared.get_amplicon_info_for_guides(this_seq, this_guides, this_guide_mismatches, this_guide_names, this_guide_qw_centers,
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoShared.py", line 1063, in get_amplicon_info_for_guides
    raise BadParameterException('Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: ' + str(cut_p) + ' window: ' + str(window_around_cut) + ' reference: ' + str(ref_seq_length))
CRISPResso2.CRISPRessoShared.BadParameterException: Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: -3 window: 25 reference: 245

CRITICAL @ Thu, 24 Feb 2022 21:20:26:
         Parameter error, please check your input.

ERROR: Offset around cut would extend to the left of the amplicon. Please decrease plot_window_size parameter. Cut point: -3 window: 25 reference: 245 

Alphabet error

With the command below, I got a different error.

I AM SURE there is NO invalid character in my sequence, but the result is always the same. I don't know what I did wrong.

CRISPResso version 2.2.6
[Command used]:
/home/moqiqin/miniconda3/envs/deepseq/bin/CRISPResso -r1 VIP-3-9_L1_1.fq.gz -r2 VIP-3-9_L1_2.fq.gz -a tggttttcgctccgaagGTAAAAGAAATCATTGAGTCCCCCGCCTTCAGAAGAGGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAGTCATTTCCTTCGTACCCACAGTGCTTCATGAGAGgtgagtacatgctggtcttgtaatatctacttttgctcagctttgcctgtaatgaaatggcagcttgtttcacctcggtgcagagatgcctcggtgcctgccagttccctg --prime_editing_pegRNA_spacer_seq GCATTTTCAGGAGGAAGCGA --prime_editing_pegRNA_extension_seq tgtctgaagccacttcctcctgaaaat --prime_editing_nicking_guide_seq ATGAAGCACTGTGGGTACGA -o results/20211228-42-RUNX1_in_VIP-3-9

[Execution log]:
Alphabet error, please check your input.

ERROR: Reference amplicon sequence 1 (Prime-edited) contains invalid characters: -
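Since the "-" is reported for the generated Prime-edited reference rather than for the amplicon passed with -a, it can help to check what edit each pegRNA extension encodes. The sketch below reverse-complements an extension, splits it into primer binding site (PBS) and RT template, and reports the net length change. It uses a short excerpt of the amplicon around the spacer. The helper names, the `anchor_len` parameter, and the assumptions listed in the docstring are illustrative only, and this is not how CRISPResso2 derives the edited reference.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string, returned in uppercase."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def describe_extension(amplicon, spacer, extension, anchor_len=8):
    """Split a pegRNA extension into PBS and RT template and return
    (pbs, rt_template, net_length_change) relative to the amplicon.

    Assumptions of this sketch: the spacer is on the forward strand, the
    nick is 3 bp from the 3' end of the spacer, and the last `anchor_len`
    bases of the RT template match the reference exactly.
    """
    ref = amplicon.upper()
    start = ref.find(spacer.upper())
    if start < 0:
        raise ValueError("spacer not found on the forward strand")
    nick = start + len(spacer) - 3
    rc_ext = reverse_complement(extension)

    # PBS: longest prefix of the reverse-complemented extension that
    # matches the reference immediately upstream of the nick.
    pbs_len = max(
        (k for k in range(1, min(len(rc_ext), nick) + 1)
         if ref[nick - k:nick] == rc_ext[:k]),
        default=0,
    )
    rtt = rc_ext[pbs_len:]

    # Find where the tail of the RT template rejoins the reference.
    tail = rtt[-anchor_len:]
    tail_at = ref.find(tail, nick)
    if tail_at < 0:
        raise ValueError("RT template tail not found downstream of the nick")
    ref_span = tail_at + len(tail) - nick
    return rc_ext[:pbs_len], rtt, len(rtt) - ref_span


# Excerpt of the amplicon from the commands in this issue.
excerpt = "GGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAG"
spacer = "GCATTTTCAGGAGGAAGCGA"
for ext in ("tgtctgaagccatcgcatcttcctcctgaaaat",  # first command
            "tgtctgaagccacttcctcctgaaaat"):       # second command
    print(describe_extension(excerpt, spacer, ext))
```

Under these assumptions the first extension encodes a net +3 bp change (an ATG insertion at the nick) and the second a net -3 bp change (a deletion). A deletion is naturally written with gap characters in an alignment, which might be where the "-" comes from, but that link is a guess and has not been verified against the CRISPResso2 source.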

Debug output command:

CRISPResso -r1 VIP-3-9_L1_1.fq.gz -r2 VIP-3-9_L1_2.fq.gz -a tggttttcgctccgaagGTAAAAGAAATCATTGAGTCCCCCGCCTTCAGAAGAGGGTGCATTTTCAGGAGGAAGCGATGGCTTCAGACAGCATATTTGAGTCATTTCCTTCGTACCCACAGTGCTTCATGAGAGgtgagtacatgctggtcttgtaatatctacttttgctcagctttgcctgtaatgaaatggcagcttgtttcacctcggtgcagagatgcctcggtgcctgccagttccctg --prime_editing_pegRNA_spacer_seq GCATTTTCAGGAGGAAGCGA --prime_editing_pegRNA_extension_seq tgtctgaagccacttcctcctgaaaat --prime_editing_nicking_guide_seq ATGAAGCACTGTGGGTACGA -o results/20211228-42-RUNX1_in_VIP-3-9 --debug

output:

                               ~~~CRISPResso 2~~~                               
        -Analysis of genome editing outcomes from deep sequencing data-         

                                        _                                       
                                       '  )                                     
                                       .-'                                      
                                      (____                                     
                                   C)|     \                                    
                                     \     /                                    
                                      \___/                                     

                           [CRISPResso version 2.2.6]                           
[Note that starting in version 2.1.0 insertion quantification has been changed
to only include insertions completely contained by the quantification window.
To use the legacy quantification method (i.e. include insertions directly adjacent
to the quantification window) please use the parameter --use_legacy_insertion_quantification]
                 [For support contact kclement@mgh.harvard.edu]                 

WARNING @ Thu, 24 Feb 2022 21:22:09:
         Folder /home/moqiqin/20220208/VIP-3/results/20211228-42-RUNX1_in_VIP-3-9/CRISPResso_on_VIP-3-9_L1_1_VIP-3-9_L1_2 already exists. 

Traceback (most recent call last):
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 1133, in main
    raise CRISPRessoShared.NTException('Reference amplicon sequence %d (%s) contains invalid characters: %s'%(idx, this_name, ' '.join(wrong_nt)))
CRISPResso2.CRISPRessoShared.NTException: Reference amplicon sequence 1 (Prime-edited) contains invalid characters: -
CRITICAL @ Thu, 24 Feb 2022 21:22:09:
         Traceback (most recent call last):
  File "/home/moqiqin/miniconda3/envs/deepseq/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 1133, in main
    raise CRISPRessoShared.NTException('Reference amplicon sequence %d (%s) contains invalid characters: %s'%(idx, this_name, ' '.join(wrong_nt)))
CRISPResso2.CRISPRessoShared.NTException: Reference amplicon sequence 1 (Prime-edited) contains invalid characters: -

CRITICAL @ Thu, 24 Feb 2022 21:22:09:
         Alphabet error, please check your input.

ERROR: Reference amplicon sequence 1 (Prime-edited) contains invalid characters: - 

tfguinan commented 1 year ago

Hi, sorry I can't be much help regarding the quantification window error. However, I encountered something similar to the alphabet error, and got around it by changing the sequence from mixed case to all uppercase (e.g. cgaagGTAAA to CGAAGGTAAA).
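The uppercase workaround is easy to script before a run. Below is a minimal sketch that normalises a sequence to uppercase and lists any characters outside an allowed alphabet. The `ALLOWED` set and the function name `invalid_characters` are assumptions for illustration, and the alphabet CRISPResso2 actually accepts may differ.

```python
ALLOWED = set("ACGTN")  # assumed alphabet; the tool's own set may differ


def invalid_characters(seq):
    """Return the characters of `seq` outside ALLOWED, ignoring case."""
    return sorted(set(seq.upper()) - ALLOWED)


mixed = "cgaagGTAAA"
print(mixed.upper())                     # CGAAGGTAAA
print(invalid_characters(mixed))         # []
print(invalid_characters("CGAAG-TAAA"))  # ['-']
```

A case-insensitive check like this reports nothing for the mixed-case example, so if uppercasing changes the outcome, the case sensitivity would have to sit somewhere else in the pipeline. That is speculation, not something confirmed in this thread.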

Masterchiefm commented 1 year ago

This is an automatic reply from QQ Mail. Hello, your email has been received.