pinellolab / CRISPResso2

Analysis of deep sequencing data for rapid and intuitive interpretation of genome editing experiments

larger amplicon analysis #215

Closed weiss480 closed 1 year ago

weiss480 commented 2 years ago

Is your feature request related to a problem? Please describe.

I would like to be able to run CRISPResso2 on amplicons ranging from 1,000 to 2,000 bp.

kclem commented 2 years ago

Hi @weiss480

How long are your reads? Are you doing amplicon sequencing across the whole 1000-2000 bp, or are you sequencing fragments of the 1000-2000 bp region?
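One quick way to answer the read-length question is to summarize the lengths directly from the FASTQ file. The following is a minimal sketch, not part of CRISPResso2: the function names `read_lengths` and `summarize` are hypothetical, and it assumes standard four-line FASTQ records (no wrapped sequence lines).

```python
import gzip


def read_lengths(fastq_path):
    """Yield the length of each read in a FASTQ file (plain or gzipped).

    Assumes four lines per record: header, sequence, '+', qualities.
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def summarize(lengths):
    """Return count, min, (upper) median, and max of an iterable of lengths."""
    lengths = sorted(lengths)
    if not lengths:
        return None
    return {
        "n": len(lengths),
        "min": lengths[0],
        "median": lengths[len(lengths) // 2],
        "max": lengths[-1],
    }
```

Calling `summarize(read_lengths("reads.fastq.gz"))` on your own file (the file name here is a placeholder) shows whether the reads span the full amplicon or only fragments of it.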

weiss480 commented 2 years ago

I am sequencing nanopore reads that are between 1000 and 2000 bp. I want to sequence across the whole read.

skqxys commented 2 years ago

Hi,

I tried to run CRISPResso2 on PacBio amplicon sequencing reads of around 3 kb.

This was my command: CRISPResso -r1 Z-H.0_0.fastq.gz -a TCACTGACTAACCCCGGAACCACACAGCTTCCCGTTCTCAGCTCCACAAACTTGGTGCCAAATTCTTCTCCCCTGGGAAGCATCCCTGGACACTTCCCAAAGGACCCCAGTCACTCCAGCCTGTTGGCTGCCGCTCACTTTGATGTCTGCAGGCCAGATGAGGGCTCCAGATGGCACATTGTCAGAGGGACACACTGTGGCCCCTGTGCCCAGCCCTGGGCTCTCTGTACATGAAGCAACTCCAGTCCCAAATATGTAGCTGTTTGGGAGGTCAGAAATAGGGGGTCCAGGAGCAAACTCCCCCCACCCCCTTTCCAAAGCCCATTCCCTCTTTAGCCAGAGCCGGGGTGTGCAGACGGCAGTCACTAGGGGGCGCTCGGCCACCACAGGGAAGCTGGGTGAATGGAGCGAGCAGCGTCTTCGAGAGTGAGGACGTGTGTGTCTGTGTGGGTGAGTGAGTGTGTGCGTGTGGGGTTGAGGGTGTTGGAGCGGGGAGAAGGCCAGGGGTCACTCCAGGATTCCAATAGATCTGTGTGTCCCTCTCCCCACCCGTCCCTGTCCGGCTCTCCGCCTTCCCCTGCCCCCTTCAATATTCCTAGCAAAGAGGGAACGGCTCTCAGGCCCTGTCCGCACGTAACCTCACTTTCCTGCTCCCTCCTCGCCAATGCCCCGCGGGCGCGTGTCTCTGGACAGAGTTTCCGGGGGCGGATGGGTAATTTTCAGGCTGTGAACCTTGGTGGGGGTCGAGCTTCCCCTTCATTGCGGCGGGCTGCGGGCCAGGCTTCACTGAGCGTCCGCAGAGCCCGGGCCCGAGCCGCGTGTGGAAGGGCTGAGGCTCGCCTGTCCCCGCCCCCCGGGGCGGGCCGGGGGCGGGGTCCCGGCGGGGCGGAGCCATGCGCCCCCCCCTTTTTTTTTTAAAAGTCGGCTGGTAGCGGGGAGGATCGCGGAGGCTTGGGGCAGCCGGGTAGCTCGGAGGTCGTGGCGCTGGGGGCTAGCACCAGCGCTCTGTCGGGAGGCGCAGCGGTTAGGTGGACCGGTCAGCGGACTCACCGGCCAGGGCGCTCGGTGCTGGAATTTGATATTCATTGATCCGGGTTTTATCCCTCTTCTTTTTTCTTAAACATTTTTTTTTAAAACTGTATTGTTTCTCGTTTTAATTTATTTTTGCTTGCCATTCCCCACTTGAATCGGGCCGACGGCTTGGGGAGATTGCTCTACTTCCCCAAATCACTGTGGATTTTGGAAACCAGCAGAAAGAGGAAAGAGGTAGCAAGAGCTCCAGAGAGAAGTCGAGGAAGAGAGAGACGGGGTCAGAGAGAGCGCGCGGGCGTGCGAGCAGCGAAAGCGACAGGGGCAAAGTGAGTGACCTGCTTTTGGGGGTGACCGCCGGAGCGCGGCGTGAGCCCTCCCCCTTGGGATCCCGCAGCTGACCAGTCGCGCTGACGGACAGACAGACAGACACCGCCCCCAGCCCCAGCTACCACCTCCTCCCCGGCCGGCGGCGGACAGTGGACGCGGCGGCGAGCCGCGGGCAGGGGCCGGAGCCCGCGCCCGGAGGCGGGGTGGAGGGGGTCGGGGCTCGCGGCGTCGCACTGAAACTTTTCGTCCAACTTCTGGGCTGTTCTCGCTTCGGAGGAGCCGTGGTCCGCGCGGGGGAAGCCGAGCCGAGCGGAGCCGCGAGAAGTGCTAGCTCGGGCCGGGAGGAGCCGCAGCCGGAGGAGGGGGAGGAGGAAGAAGAGAAGGAAGAGGAGAGGGGGCCGCAGTGGCGACTCGGCGCTCGGAAGCCGGGCTCATGGACGGGTGAGGCGGCGGTGTGCGCAGACAGTGCTCCAGCCGCGCGCGCTCCCCAGGCCCTGGCCCGGGCCTCGGGCCGGGGAGGAAGAGTAGCTCGCCGAGGCGCCGAGGAGAGCGGGCCGCCCCACAGCCCGAGCCGGAGAGGGAGCG
CGAGCCGCGCCGGCCCCGGTCGGGCCTCCGAAACCATGAACTTTCTGCTGTCTTGGGTGCATTGGAGCCTTGCCTTGCTGCTCTACCTCCACCATGCCAAGGTAAGCGGTCGTGCCCTGCTGGCGCCGCGGGCCGCTGCGAGCGCCTCTCCCGGCTGGGGACGTGCGTGCGAGCGCGCGCGTGGGGGCTCCGTGCCCCACGCGGGTCCATGGGCACCAGGCGTGCGGCGTCCCCCTCTGTCGTCTTAGGTGCAGGGGGAGGGGGCGCGCGCGCTAGGTGGGAGGGTACCCGGAGAGAGGCTCACCGCCCACGCGGGCCCTGCCCACCCACCGGAGTCACCGCACGTACGATCTGGGCCGACCAGCCGAGGGCGGGAGCCGGAGGAGGAGGCCGAGGGGGCTGGGCTTGCGTTGCCGCTGCCGGCTGAAGTTTGCTCCCGGCCGCTGGTCCCGGACGAACTGGAAGTCTGAGCAGCGGGGGCGGGAGCCAGAGACCAGTGGGCAGGGGGTGCTCGGACCTTGGACCGCGGGAGGGCAGAGAGCGTGGAGGGGGCAGGGCGCAGGAGGGAGAGGGGGCTTGCTGTCACTGCCACTCGGTCTCTTCAGCCCTCGCCGCGAGTTTGGGAAAAGTTTTGGGGTGGATTGCTGCGGGGACCCCCCCTCCCTGCTGGGCCACCTGCGCCGCGCCAACCCCGCCCGTCCCCGCTCGCGTCCCGCTCGGTGCCCGCCCTCCCCCGCCCGGCCGGGTGCGCGCGGCGCGGAGCCGATTACATCAGCCCGGGCCTGGCCGGCCGCGTGTTCCCGGAGCCTCGGCTGCCCGAATGGGGAGCCCAGAGTGGCGAGCGGCACCCCTCCCCCCGCCAGCCCTCCGCGGGAAGGTGACCTCTCGAGGTAGCCCCAGCCCGGGGATCCAGAGAACCATCCCTACCCCTTCCTACTGTCTCCAGACCCTACCTCTGCCCAGTGCTAGGAGGAATTTCCTGACGCCCCTTCTCTTCACCCATTTCCTTTTTAGCCTGGAGAGAAGCCCCTGTCACCCCGCTTATTTTCATTTCTCTCTGCGGAGAAGATCCATCTAACCCCTTTCTGGCCCCAGAGTCCAGGGAAAGGATGATCACTGTCAGAAGTCGTGGC --prime_editing_override_prime_edited_ref_seq 
TCACTGACTAACCCCGGAACCACACAGCTTCCCGTTCTCAGCTCCACAAACTTGGTGCCAAATTCTTCTCCCCTGGGAAGCATCCCTGGACACTTCCCAAAGGACCCCAGTCACTCCAGCCTGTTGGCTGCCGCTCACTTTGATGTCTGCAGGCCAGATGAGGGCTCCAGATGGCACATTGTCAGAGGGACACACTGTGGCCCCTGTGCCCAGCCCTGGGCTCTCTGTACATGAAGCAACTCCAGTCCCAAATATGTAGCTGTTTGGGAGGTCAGAAATAGGGGGTCCAGGAGCAAACTCCCCCCACCCCCTTTCCAAAGCCCATTCCCTCTTTAGCCAGAGCCGGGGTGTGCAGACGGCAGTCACTAGGGGGCGCTCGGCCACCACAGGGAAGCTGGGTGAATGGAGCGAGCAGCGTCTTCGAGAGTGAGGACGTGTGTGTCTGTGTGGGTGAGTGAGTGTGTGCGTGTGGGGTTGAGGGTGTTGGAGCGGGGAGAAGGCCAGGGGTCACTCAGTAACCCGGAGCTCTAATAGCCAGAGCCGGGGTGTGCAGACGGCAGTCACTAGGGGGCGCTCGGCCACCACAGGGAAGCTGGGTGAATGGAGCGAGCAGCGTCTTCGAGAGTGAGGACGTGTGTGTCTGTGTGGGTGAGTGAGTGTGTGCGTGTGGGGTTGAGGGTGTTGGAGCGGGGAGAAGGCCAGGGGTCACTCCAGGATTCCAATAGATCTGTGTGTCCCTCTCCCCACCCGTCCCTGTCCGGCTCTCCGCCTTCCCCTGCCCCCTTCAATATTCCTAGCAAAGAGGGAACGGCTCTCAGGCCCTGTCCGCACGTAACCTCACTTTCCTGCTCCCTCCTCGCCAATGCCCCGCGGGCGCGTGTCTCTGGACAGAGTTTCCGGGGGCGGATGGGTAATTTTCAGGCTGTGAACCTTGGTGGGGGTCGAGCTTCCCCTTCATTGCGGCGGGCTGCGGGCCAGGCTTCACTGAGCGTCCGCAGAGCCCGGGCCCGAGCCGCGTGTGGAAGGGCTGAGGCTCGCCTGTCCCCGCCCCCCGGGGCGGGCCGGGGGCGGGGTCCCGGCGGGGCGGAGCCATGCGCCCCCCCCTTTTTTTTTTAAAAGTCGGCTGGTAGCGGGGAGGATCGCGGAGGCTTGGGGCAGCCGGGTAGCTCGGAGGTCGTGGCGCTGGGGGCTAGCACCAGCGCTCTGTCGGGAGGCGCAGCGGTTAGGTGGACCGGTCAGCGGACTCACCGGCCAGGGCGCTCGGTGCTGGAATTTGATATTCATTGATCCGGGTTTTATCCCTCTTCTTTTTTCTTAAACATTTTTTTTTAAAACTGTATTGTTTCTCGTTTTAATTTATTTTTGCTTGCCATTCCCCACTTGAATCGGGCCGACGGCTTGGGGAGATTGCTCTACTTCCCCAAATCACTGTGGATTTTGGAAACCAGCAGAAAGAGGAAAGAGGTAGCAAGAGCTCCAGAGAGAAGTCGAGGAAGAGAGAGACGGGGTCAGAGAGAGCGCGCGGGCGTGCGAGCAGCGAAAGCGACAGGGGCAAAGTGAGTGACCTGCTTTTGGGGGTGACCGCCGGAGCGCGGCGTGAGCCCTCCCCCTTGGGATCCCGCAGCTGACCAGTCGCGCTGACGGACAGACAGACAGACACCGCCCCCAGCCCCAGCTACCACCTCCTCCCCGGCCGGCGGCGGACAGTGGACGCGGCGGCGAGCCGCGGGCAGGGGCCGGAGCCCGCGCCCGGAGGCGGGGTGGAGGGGGTCGGGGCTCGCGGCGTCGCACTGAAACTTTTCGTCCAACTTCTGGGCTGTTCTCGCTTCGGAGGAGCCGTGGTCCGCGCGGGGGAAGCCGAGCCGAGCGGAGCCGCGAGAAGTGCTAGCTCGGGCCGGGAGGAGCCGCAGCCGGAGGAGGGGGAGGAGGAAGAAGAGAAGGAAGAGGAGAGGGGGCCGCAGTGGCGACTCGGCGCTCGGAAGCCGGGCTCATGGACGGGT
GAGGCGGCGGTGTGCGCAGACAGTGCTCCAGCCGCGCGCGCTCCCCAGGCCCTGGCCCGGGCCTCGGGCCGGGGAGGAAGAGTAGCTCGCCGAGGCGCCGAGGAGAGCGGGCCGCCCCACAGCCCGAGCCGGAGAGGGAGCGCGAGCCGCGCCGGCCCCGGTCGGGCCTCCGAAACCATGAACTTTCTGCTGTCTTGGGTGCATTGGAGCCTTGCCTTGCTGCTCTACCTCCACCATGCCAAGGTAAGCGGTCGTGCCCTGCTGGCGCCGCGGGCCGCTGCGAGCGCCTCTCCCGGCTGGGGACGTGCGTGCGAGCGCGCGCGTGGGGGCTCCGTGCCCCACGCGGGTCCATGGGCACCAGGCGTGCGGCGTCCCCCTCTGTCGTCTTAGGTGCAGGGGGAGGGGGCGCGCGCGCTAGGTGGGAGGGTACCCGGAGAGAGGCTCACCGCCCACGCGGGCCCTGCCCACCCACCGGAGTCACCGCACGTACGATCTGGGCCGACCAGCCGAGGGCGGGAGCCGGAGGAGGAGGCCGAGGGGGCTGGGCTTGCGTTGCCGCTGCCGGCTGAAGTTTGCTCCCGGCCGCTGGTCCCGGACGAACTGGAAGTCTGAGCAGCGGGGGCGGGAGCCAGAGACCAGTGGGCAGGGGGTGCTCGGACCTTGGACCGCGGGAGGGCAGAGAGCGTGGAGGGGGCAGGGCGCAGGAGGGAGAGGGGGCTTGCTGTCACTGCCACTCGGTCTCTTCAGCCCTCGCCGCGAGTTTGGGAAAAGTTTTGGGGTGGATTGCTGCGGGGACCCCCCCTCCCTGCTGGGCCACCTGCGCCGCGCCAACCCCGCCCGTCCCCGCTCGCGTCCCGCTCGGTGCCCGCCCTCCCCCGCCCGGCCGGGTGCGCGCGGCGCGGAGCCGATTACATCAGCCCGGGCCTGGCCGGCCGCGTGTTCCCGGAGCCTCGGCTGCCCGAATGGGGAGCCCAGAGTGGCGAGCGGCACCCCTCCCCCCGCCAGCCCTCCGCGGGAAGGTGACCTCTCGAGGTAGCCCCAGCCCGGGGATCCAGAGAACCATCCCTACCCCTTCCTACTGTCTCCAGACCCTACCTCTGCCCAGTGCTAGGAGGAATTTCCTGACGCCCCTTCTCTTCACCCATTTCCTTTTTAGCCTGGAGAGAAGCCCCTGTCACCCCGCTTATTTTCATTTCTCTCTGCGGAGAAGATCCATCTAACCCCTTTCTGGCCCCAGAGTCCAGGGAAAGGATGATCACTGTCAGAAGTCGTGGC -o Z-H.0_0_results_1 --debug

But I encountered this error:

CRITICAL @ Tue, 17 May 2022 21:42:22: Traceback (most recent call last):
  File "/home/pan/miniconda3/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 1947, in main
    aln_stats = process_fastq(processed_output_filename, variantCache, ref_names, refs, args)
  File "/home/pan/miniconda3/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 442, in process_fastq
    new_variant = get_new_variant_object(args, fastq_seq, refs, ref_names, aln_matrix, pe_scaffold_dna_info)
  File "/home/pan/miniconda3/lib/python3.9/site-packages/CRISPResso2/CRISPRessoCORE.py", line 228, in get_new_variant_object
    fws1, fws2, fwscore=CRISPResso2Align.global_align(fastq_seq, refs[ref_name]['sequence'], matrix=aln_matrix, gap_incentive=refs[ref_name]['gap_incentive'], gap_open=args.needleman_wunsch_gap_open, gap_extend=args.needleman_wunsch_gap_extend,)
  File "CRISPResso2/CRISPResso2Align.pyx", line 382, in CRISPResso2.CRISPResso2Align.global_align
Exception: ('wtf4!:pointer: %i', 0)

CRITICAL @ Tue, 17 May 2022 21:42:22: Unexpected error, please check your input.

ERROR: ('wtf4!:pointer: %i', 0)

I don't know whether CRISPResso2 can be used for long amplicons.

Colelyman commented 2 years ago

I suspect that this is happening because the alignment is requiring too much memory. Do you happen to know how much RAM your computer has?
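To put rough numbers on the memory question: a global (Needleman-Wunsch style) alignment fills dynamic-programming matrices with one cell per (read position, reference position) pair, so memory grows with the product of the two lengths. The sketch below is a back-of-the-envelope estimate only. The function name is hypothetical, and the defaults of six matrices and 4 bytes per cell are assumptions for illustration, not values confirmed from the CRISPResso2 source.

```python
def estimate_alignment_memory_bytes(read_len, ref_len, n_matrices=6, bytes_per_cell=4):
    """Estimate memory for the DP matrices of one global alignment.

    n_matrices and bytes_per_cell are illustrative assumptions; the real
    aligner may allocate a different number or type of arrays.
    """
    cells = (read_len + 1) * (ref_len + 1)  # +1 for the gap row and column
    return cells * n_matrices * bytes_per_cell


# Compare a short-read amplicon with the long amplicons discussed here.
for n in (250, 2000, 3000, 4000):
    mib = estimate_alignment_memory_bytes(n, n) / 2**20
    print(f"{n} bp read vs {n} bp reference: ~{mib:.1f} MiB per alignment")
```

Because the cost is per alignment, running several processes in parallel or aligning each read against more than one reference (for example, the amplicon plus a prime-edited reference) multiplies the total accordingly.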

kclem commented 1 year ago

We're closing this issue because it hasn't been updated recently. If this issue still exists, please reopen this issue and we'll look into it!

assafgrw commented 1 year ago

Hello, I see that this thread was closed without an actual answer. I will try to ask again:

Is it possible to use CRISPResso2 for long amplicons (~4000 bp) with long-read sequencing (ONT)?