Closed JNWorkman closed 7 months ago
Hi @JNWorkman, nothing jumps out at me from the logs.
Are there reads assigned to the amplicon/guide? If there are no reads, there will be no plot 9.
If you run the single CRISPResso command (not CRISPRessoBatch) does it produce the file?
Hi @kclem
Thanks for looking into this.
There are reads assigned to the amplicon, yes. Nothing else about the analysis is unusual except the missing plot. Running the single command yields the same results.
I did figure out the problem, however. My batchfile did not include a sequence for the guide, only the amplicon. Including the guide sequence fixed the issue, and the plot is now generated correctly. Apologies for the hassle over a simple mistake on my part.
A related question: is there a way to change which positions the zoomed-in plots, such as the allele frequency table, cover? For example, say I used a pegRNA that is at position 100 in the amplicon, but the edited nucleotide I want to assess is at position 200. Can I tell CRISPResso which window to use for the plots instead of the default "around sgRNA"?
thanks!
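The cause described above (a batch file row with no guide sequence) can be caught before a run. The following is a minimal sketch, not part of CRISPResso: it assumes the batch file is tab-separated with a header row, and that the guide column is named either `g` (the short header used in this thread) or `guide_seq` (assumed long form). The function name `rows_missing_guide` is hypothetical.

```python
import csv
import io

# Header names that may carry the guide sequence. "g" mirrors the short
# header used in this thread's batch file; "guide_seq" is an assumed long
# form. Neither is taken from the CRISPRessoBatch source code.
GUIDE_COLUMNS = ("g", "guide_seq")


def rows_missing_guide(batch_text):
    """Return 1-based data-row numbers that have no guide sequence.

    batch_text is the full content of a tab-separated batch file,
    including its header line.
    """
    reader = csv.DictReader(io.StringIO(batch_text), delimiter="\t")
    # Keep only the guide-like columns that actually appear in the header.
    guide_cols = [c for c in (reader.fieldnames or []) if c in GUIDE_COLUMNS]
    missing = []
    for row_number, row in enumerate(reader, start=1):
        # A short row yields None for absent fields, so fall back to "".
        values = [(row.get(c) or "").strip() for c in guide_cols]
        if not any(values):
            missing.append(row_number)
    return missing
```

To use it, read the batch file into a string and pass it in; any row numbers returned are samples for which the "around sgRNA" allele plot would likely be skipped, as happened here.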
Hi @JNWorkman,
Glad to hear that you got it figured out!
The easiest way to achieve what you are looking to do with the allele frequency table is with the https://github.com/pinellolab/CRISPResso2/blob/master/scripts/plotCustomAllelePlot.py script. You can use the --plot_center argument to set where the plot is centered.
Let us know if you have any trouble.
Thanks, Cole
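As a rough illustration of the suggestion above, the sketch below assembles a command line for the script. Only `--plot_center` comes from this thread; the `-f` (CRISPResso output folder) and `-o` (output file prefix) flags, the folder and prefix names, and the helper `custom_allele_plot_command` are assumptions for illustration, so check the script's `--help` output before relying on them. The thread also does not say whether the center position is 0-based or 1-based.

```python
import shlex


def custom_allele_plot_command(crispresso_folder, output_root, plot_center):
    """Build an argument list for plotCustomAllelePlot.py.

    --plot_center is the argument named in the reply above. The -f and -o
    flags are ASSUMED names for the input folder and output prefix.
    """
    return [
        "python", "plotCustomAllelePlot.py",
        "-f", crispresso_folder,   # assumed flag: finished CRISPResso run
        "-o", output_root,         # assumed flag: output file prefix
        "--plot_center", str(plot_center),
    ]


# Center the plot on position 200, the edited nucleotide from the
# question, rather than on the pegRNA at position 100.
cmd = custom_allele_plot_command("CRISPResso_on_sample", "custom_allele_plot", 200)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` once the flags have been confirmed against the script.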
Describe the bug
There is no png or pdf file generated for the allele frequency table. I do get the .zip archive containing the allele frequency table in .txt format, but no graphic. Usually crispresso generates a file called "9.Alleles_frequency_table_aroundsgRNA[sequence].pdf" that contains a quilt visualization of the allele frequency table; this is missing.
I have not updated or changed anything since last running crispresso except downgrading matplotlib to address the missing annotations from this plot.
Expected behavior
Usually crispresso generates a file called "9.Alleles_frequency_table_aroundsgRNA[sequence].pdf" that contains a quilt visualization of the allele frequency table.
To reproduce
CRISPRessoBatch --batch_settings W-ko_batchfile.txt
example line from batchfile:
r1 g a n w wc q
TET-NG-W1291X-579-1_S13_L001_R1_001.fastq.gz ttctgtgtgtggttatgccacagcttaatacagagttagattagacttcttttcaaactcattttgcatatagacacctataatatcagctgcacagcctatataatgctatccatagcaatgaatttggtcttttgatttttcaggagaacttgcgcctgtcaggggctggatccagaaacctgtggtgcctccttctcttttggttgttcatggagcatgtactacaatggatgtaagtttgccagaagcaagatcccaaggaagtttaagctgcttggggatgac TET-NG-W1291X-579-1 20 -10 30
Debug output
Paste the entire output when you run CRISPResso with the flag --debug.
[Note that starting in version 2.3.0 FLASh and Trimmomatic will be replaced by fastp for read merging and trimming. Accordingly, the --flash_command and --trimmomatic_command parameters will be replaced with --fastp_command. Also, --trimmomatic_options_string will be replaced with --fastp_options_string.
Also in version 2.3.0, when running CRISPRessoPooled in mixed-mode (amplicon file and genome are provided) the default behavior will be as if the --demultiplex_only_at_amplicons parameter is provided. This change means that reads and amplicons do not need to align to the exact locations.] [For support contact k.clement@utah.edu or support@edilytics.com]
INFO @ Mon, 01 Apr 2024 12:56:48: Creating Folder /home/noah/crispresso_data/2024.4.1_NW-ST_GNE-TET/test/CRISPRessoBatch_on_test_batchfile
/home/noah/anaconda3/envs/crispresso2_env/lib/python3.10/site-packages/CRISPResso2/CRISPRessoBatchCORE.py:190: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  batch_params[arg].fillna(value=getattr(args, arg), inplace=True)
INFO @ Mon, 01 Apr 2024 12:56:48: Running CRISPResso with 1 processes
[Note that starting in version 2.3.0 FLASh and Trimmomatic will be replaced by fastp for read merging and trimming. Accordingly, the --flash_command and --trimmomatic_command parameters will be replaced with --fastp_command. Also, --trimmomatic_options_string will be replaced with --fastp_options_string.
Also in version 2.3.0, when running CRISPRessoPooled in mixed-mode (amplicon file and genome are provided) the default behavior will be as if the --demultiplex_only_at_amplicons parameter is provided. This change means that reads and amplicons do not need to align to the exact locations.] [For support contact k.clement@utah.edu or support@edilytics.com]
INFO @ Mon, 01 Apr 2024 12:56:49: Creating Folder /home/noah/crispresso_data/2024.4.1_NW-ST_GNE-TET/test/CRISPRessoBatch_on_test_batchfile/CRISPResso_on_TET-NG-W1291X-579-1
INFO @ Mon, 01 Apr 2024 12:56:49: Computing quantification windows
INFO @ Mon, 01 Apr 2024 12:56:49: Filtering reads with average bp quality < 30 and single bp quality < 0 and replacing bases with quality < 0 with N ...
Completed in 5 seconds
INFO @ Mon, 01 Apr 2024 12:56:55: Aligning sequences...
INFO @ Mon, 01 Apr 2024 12:56:55: Processing reads; N_TOT_READS: 0 N_COMPUTED_ALN: 0 N_CACHED_ALN: 0 N_COMPUTED_NOTALN: 0 N_CACHED_NOTALN: 0
INFO @ Mon, 01 Apr 2024 12:56:56: Processing reads; N_TOT_READS: 10000 N_COMPUTED_ALN: 3386 N_CACHED_ALN: 6575 N_COMPUTED_NOTALN: 38 N_CACHED_NOTALN: 1
INFO @ Mon, 01 Apr 2024 12:56:57: Processing reads; N_TOT_READS: 20000 N_COMPUTED_ALN: 6074 N_CACHED_ALN: 13838 N_COMPUTED_NOTALN: 87 N_CACHED_NOTALN: 1
INFO @ Mon, 01 Apr 2024 12:56:58: Processing reads; N_TOT_READS: 30000 N_COMPUTED_ALN: 8102 N_CACHED_ALN: 21775 N_COMPUTED_NOTALN: 122 N_CACHED_NOTALN: 1
INFO @ Mon, 01 Apr 2024 12:56:58: Finished reads; N_TOT_READS: 30640 N_COMPUTED_ALN: 8234 N_CACHED_ALN: 22282 N_COMPUTED_NOTALN: 123 N_CACHED_NOTALN: 1
INFO @ Mon, 01 Apr 2024 12:56:58: Done!
INFO @ Mon, 01 Apr 2024 12:56:58: Quantifying indels/substitutions...
INFO @ Mon, 01 Apr 2024 12:56:59: Done!
INFO @ Mon, 01 Apr 2024 12:56:59: Calculating allele frequencies...
INFO @ Mon, 01 Apr 2024 12:56:59: Done!
INFO @ Mon, 01 Apr 2024 12:56:59: Saving processed data...
INFO @ Mon, 01 Apr 2024 12:56:59: Making Plots...
DEBUG @ Mon, 01 Apr 2024 12:56:59: Plotting read bar plot
DEBUG @ Mon, 01 Apr 2024 12:56:59: Plotting read class pie chart and bar plot
INFO @ Mon, 01 Apr 2024 12:57:00: Begin processing plots for amplicon Reference
DEBUG @ Mon, 01 Apr 2024 12:57:00: Plotting nucleotide quilt across amplicon
/home/noah/anaconda3/envs/crispresso2_env/lib/python3.10/site-packages/CRISPResso2/CRISPRessoPlot.py:188: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  ins_pct = float(mod_pct_df_indexed.loc[sampleName,'Insertions_Left'][pos_ind-2])
DEBUG @ Mon, 01 Apr 2024 12:57:01: Plotting indel size distribution for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:01: Plotting frequency deletions/insertions for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:02: Plotting amplication modifications for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:02: Plotting modification frequency for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:03: Plotting quantification window locations for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:03: Plotting position dependent indel for Reference
INFO @ Mon, 01 Apr 2024 12:57:03: Done!
INFO @ Mon, 01 Apr 2024 12:57:03: Done!
INFO @ Mon, 01 Apr 2024 12:57:03: Removing Intermediate files...
INFO @ Mon, 01 Apr 2024 12:57:04: Analysis Complete!
INFO @ Mon, 01 Apr 2024 12:57:04: Completed 1/1 runs
INFO @ Mon, 01 Apr 2024 12:57:04: Finished all batches
INFO @ Mon, 01 Apr 2024 12:57:04: Reporting summary for amplicon: "Reference"
DEBUG @ Mon, 01 Apr 2024 12:57:04: Plotting nucleotide quilt for Reference
/home/noah/anaconda3/envs/crispresso2_env/lib/python3.10/site-packages/CRISPResso2/CRISPRessoPlot.py:188: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  ins_pct = float(mod_pct_df_indexed.loc[sampleName,'Insertions_Left'][pos_ind-2])
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification heatmap for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification line plot for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification heatmap for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification line plot for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification heatmap for Reference
DEBUG @ Mon, 01 Apr 2024 12:57:05: Plotting allele modification line plot for Reference
INFO @ Mon, 01 Apr 2024 12:57:05: Analysis Complete!