pinellolab / CRISPResso2

Analysis of deep sequencing data for rapid and intuitive interpretation of genome editing experiments

CRISPresso2Pooled amplicon mode description file fails #317

Closed. Jdbeck66 closed this issue 1 year ago

Jdbeck66 commented 1 year ago

Describe the bug

Running CRISPRessoPooled fails with the following message:

Unable to find matches for header values. Using the default header values and order. ERROR: Incorrect number of columns provided without header.

The README says the description file may have up to 12 columns, but the in-code documentation (line 298 of CRISPRessoPooledCORE.py) says the maximum is 14.

Expected behavior

CRISPRessoPooled should ingest the description file, or the documentation should describe the file format the code actually requires.
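Since the README and the in-code documentation disagree on the column limit, it can help to count the columns in the description file before running the tool. The following is a minimal sketch, not part of CRISPResso2. It assumes the file is tab-delimited, uses the 14-column figure quoted from the code documentation, and runs on made-up sample rows. `column_counts` and `MAX_COLUMNS` are hypothetical names.

```python
import csv
import io

# Assumption: 14 is the maximum quoted from the code documentation in this
# report; the README is reported to say 12.
MAX_COLUMNS = 14


def column_counts(handle, delimiter="\t"):
    """Return (line_number, column_count) for every non-empty row."""
    reader = csv.reader(handle, delimiter=delimiter)
    return [(i, len(row)) for i, row in enumerate(reader, start=1) if row]


# Made-up sample rows standing in for a real description file.
sample = io.StringIO("AMP1\tACGTACGT\tACGT\nAMP2\tTTGGCCAA\n")
counts = column_counts(sample)

for line_no, n in counts:
    flag = "" if n <= MAX_COLUMNS else "  <-- more columns than the documented maximum"
    print(f"line {line_no}: {n} columns{flag}")
```

To check a real file, replace the `io.StringIO` sample with `open("description.txt", newline="")`.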

To reproduce

docker run -v ${PWD}:/DATA -w /DATA -i pinellolab/crispresso2 CRISPRessoPooled -r1 R1.fastq -r2 R2.fastq -f description.txt --name AMPLICONS_SRR13017204 --debug

Debug output

INFO @ Sun, 23 Jul 2023 20:28:13: Creating Folder CRISPRessoPooled_on_AMPLICONS_SRR13017204

INFO @ Sun, 23 Jul 2023 20:28:13: Done!

INFO @ Sun, 23 Jul 2023 20:28:13: Checking dependencies...

INFO @ Sun, 23 Jul 2023 20:28:13: All the required dependencies are present!

INFO @ Sun, 23 Jul 2023 20:28:13: Only the Amplicon description file was provided. The analysis will be perfomed using only the provided amplicons sequences.

INFO @ Sun, 23 Jul 2023 20:28:13: Processing input

INFO @ Sun, 23 Jul 2023 20:28:13: Merging paired sequences with Flash...

INFO @ Sun, 23 Jul 2023 20:28:13: Flash command: flash --allow-outies R1.fastq R2.fastq --max-overlap 100 --min-overlap 10 -z -d CRISPRessoPooled_on_AMPLICONS_SRR13017204 >>CRISPRessoPooled_on_AMPLICONS_SRR13017204/CRISPRessoPooled_RUNNING_LOG.txt 2>&1

INFO @ Sun, 23 Jul 2023 20:28:16: Done!

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header amplicon_name with amplicon_name.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header amplicon_sequence with amplicon_seq.

WARNING @ Sun, 23 Jul 2023 20:28:16: Unable to find matches for header values. Using the default header values and order.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header expected_amplicon_after_hdr with expected_hdr_amplicon_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header coding_sequence with coding_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_pegrna_spacer_seq with prime_editing_pegRNA_spacer_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_nicking_guide_seq with prime_editing_nicking_guide_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_pegrna_extension_seq with prime_editing_pegRNA_extension_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_pegrna_scaffold_seq with prime_editing_pegRNA_scaffold_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_pegrna_scaffold_min_match_length with prime_editing_pegRNA_scaffold_min_match_length.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header prime_editing_override_prime_edited_ref_seq with prime_editing_override_prime_edited_ref_seq.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header quantification_window_coordinates with quantification_window_coordinates.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header quantification_window_size with quantification_window_size.

INFO @ Sun, 23 Jul 2023 20:28:16: Matching header quantification_window_center with quantification_window_center.

CRITICAL @ Sun, 23 Jul 2023 20:28:16:

ERROR: Incorrect number of columns provided without header.

                         ~~~CRISPRessoPooled~~~
  -Analysis of CRISPR/Cas9 outcomes from POOLED deep sequencing data-

          _                                                   _
         '  )                                                '  )
         .-'            _______________________              .-'
        (____          | __  __  __     __ __  |            (____
     C)|     \         ||__)/  \/  \|  |_ |  \ |         C)|     \
       \     /         ||   \__/\__/|__|__|__/ |           \     /
        \___/          |_______________________|            \___/

                      [CRISPResso version 2.2.12]

[Note that starting in version 2.3.0 FLASh and Trimmomatic will be replaced by fastp for read merging and trimming. Accordingly, the --flash_command and --trimmomatic_command parameters will be replaced with --fastp_command. Also, --trimmomatic_options_string will be replaced with --fastp_options_string.

Also in version 2.3.0, when running CRISPRessoPooled in mixed-mode (amplicon file and genome are provided) the default behavior will be as if the --demultiplex_only_at_amplicons parameter is provided. This change means that reads and amplicons do not need to align to the exact locations.]

[For support contact kclement@mgh.harvard.edu or support@edilytics.com]

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/CRISPResso2-2.2.12-py3.10-linux-x86_64.egg/CRISPResso2/CRISPRessoPooledCORE.py", line 674, in main
    raise CRISPRessoShared.BadParameterException('Incorrect number of columns provided without header.')
CRISPResso2.CRISPRessoShared.BadParameterException: Incorrect number of columns provided without header.

RAG2.amplicon.description.txt
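The debug output above does not say which header failed to match, but the position of the WARNING line among the "Matching header" lines hints at it. The sketch below is not part of CRISPResso2. It assumes the tool logs one line per header in column order, which the log suggests but does not confirm. `header_report` is a hypothetical helper, and the sample lines are copied from the debug output.

```python
import re

# Matches lines such as "Matching header amplicon_sequence with amplicon_seq."
MATCH_RE = re.compile(r"Matching header (\S+) with (\S+?)\.?$")
WARN_TEXT = "Unable to find matches for header values"


def header_report(log_lines):
    """Return (position, file_header, matched_name) tuples in log order.

    A header that could not be matched is recorded as (position, None, None).
    Assumption: the log emits exactly one line per header, in column order.
    """
    report = []
    for line in log_lines:
        line = line.strip()
        m = MATCH_RE.search(line)
        if m:
            report.append((len(report) + 1, m.group(1), m.group(2)))
        elif WARN_TEXT in line:
            report.append((len(report) + 1, None, None))
    return report


log_excerpt = [
    "INFO @ Sun, 23 Jul 2023 20:28:16: Matching header amplicon_name with amplicon_name.",
    "INFO @ Sun, 23 Jul 2023 20:28:16: Matching header amplicon_sequence with amplicon_seq.",
    "WARNING @ Sun, 23 Jul 2023 20:28:16: Unable to find matches for header values. Using the default header values and order.",
    "INFO @ Sun, 23 Jul 2023 20:28:16: Matching header expected_amplicon_after_hdr with expected_hdr_amplicon_seq.",
]

report = header_report(log_excerpt)
unmatched_positions = [pos for pos, _, matched in report if matched is None]
print(unmatched_positions)
```

If the ordering assumption holds, the reported position is the column whose header should be checked against the accepted names.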

Jdbeck66 commented 1 year ago

I think I figured out the issue: I looked in the code for CRISPRessoPooledCORE.py and found the accepted header names. Could you update the README to reflect those names?
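For anyone hitting the same error, here is a rough way to check a header line against the names that appear on the right-hand side of the "Matching header ... with ..." lines in the debug output. It is a sketch under assumptions. The list is taken only from that log, so it omits the name for the one header that failed to match. The comparison is an exact, case-insensitive match, which is stricter than the tool appears to be (the log shows `amplicon_sequence` being matched to `amplicon_seq`). `unrecognized_headers` and the sample header `sgRNA` are hypothetical.

```python
# Names seen in the debug output of this report. Knowingly incomplete:
# the log does not show the accepted name for the header that failed to match.
NAMES_SEEN_IN_LOG = [
    "amplicon_name",
    "amplicon_seq",
    "expected_hdr_amplicon_seq",
    "coding_seq",
    "prime_editing_pegRNA_spacer_seq",
    "prime_editing_nicking_guide_seq",
    "prime_editing_pegRNA_extension_seq",
    "prime_editing_pegRNA_scaffold_seq",
    "prime_editing_pegRNA_scaffold_min_match_length",
    "prime_editing_override_prime_edited_ref_seq",
    "quantification_window_coordinates",
    "quantification_window_size",
    "quantification_window_center",
]


def unrecognized_headers(header_line, delimiter="\t"):
    """Return the headers that do not exactly match a name seen in the log."""
    known = {name.lower() for name in NAMES_SEEN_IN_LOG}
    headers = [h.strip() for h in header_line.strip().split(delimiter)]
    return [h for h in headers if h.lower() not in known]


# Hypothetical header line; "sgRNA" is a made-up column name for illustration.
header = "amplicon_name\tamplicon_seq\tsgRNA\tcoding_seq"
print(unrecognized_headers(header))
```

Any name the function returns is worth comparing against the header list in CRISPRessoPooledCORE.py, which remains the authoritative source.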

Snicker7 commented 1 year ago

Hello @Jdbeck66, thank you for reporting this bug. We have a fix ready and are going to be updating the conda package soon. If you would like to use the fixed version sooner you can pull the master branch of the CRISPResso2 repo.

Jdbeck66 commented 1 year ago

Thanks, JB

Jdbeck66 commented 1 year ago

I ran it from the latest docker container.

On Tue, Jul 25, 2023 at 11:46 AM Samuel Nichols @.***> wrote:

Hello @Jdbeck66, could you verify that you have pulled the latest version of master? The warning has been updated to include which of the headers failed to match, which would help figure out why it failed. Thank you!
