pinellolab / CRISPResso2

Analysis of deep sequencing data for rapid and intuitive interpretation of genome editing experiments

A small mistake in 'CRISPResso_quantification_of_editing_frequency.txt' #440

Closed: lpcv0309 closed this issue 1 month ago

lpcv0309 commented 1 month ago

Describe the bug: In the prime editor analysis results, the report shows 'Figure 1c: Alignment and editing frequency of reads as determined by the percentage and number of sequence reads showing unmodified and modified alleles.' The values for 'Reference Unmodified%, Reference Modified%, Prime-edited Unmodified%, Prime-edited Modified%' in the associated file 'CRISPResso_quantification_of_editing_frequency.txt' differ from those shown in the figure. The four values in the file sum to 200%, whereas the values in the bar and pie charts sum to 100%.

Expected behavior: Change how Unmodified% and Modified% are calculated in 'CRISPResso_quantification_of_editing_frequency.txt' so that the values match the figures. This would make it easier to export the data and combine results from multiple replicates for replotting.

Attachments: 1c Alignment_barplot (image), CRISPResso_quantification_of_editing_frequency.txt

kclem commented 1 month ago

Hi @lpcv0309,

I'm sorry about the confusion.

Each row in CRISPResso_quantification_of_editing_frequency.txt provides statistics for a separate amplicon. The top row is for the Reference (unedited) amplicon, and the second row is for the Prime-edited amplicon. The Unmodified% and Modified% columns show the percent of reads aligned to that row's amplicon ('Reference' or 'Prime-edited') that were unmodified or modified. Each row therefore sums to 100% on its own, which is why the two rows together sum to 200%.

If you'd like to compute the values in the barplots/piecharts from the values in CRISPResso_quantification_of_editing_frequency.txt, divide the read counts in the Unmodified or Modified columns by the value in Reads_aligned_all_amplicons. For example, to get the fraction of all aligned reads that were Reference Unmodified, the calculation is 2786/3240 = 0.8598765 (shown as 85.99 in the barplot above).
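The division described above could be scripted roughly as follows. This is a minimal sketch, not part of CRISPResso2: it assumes the file is tab-separated and that the amplicon name is in a column called `Amplicon` (that column name is an assumption; `Unmodified`, `Modified`, and `Reads_aligned_all_amplicons` are the columns named above). The function `percent_of_all_aligned` is a hypothetical helper. In the example table, only 2786 and 3240 come from this thread; the other counts are made up so that the rows add up to 3240.

```python
import csv
import io


def percent_of_all_aligned(tsv_text):
    """Return {(amplicon, category): percent of ALL aligned reads}.

    Hypothetical helper: assumes a tab-separated table with columns
    'Amplicon' (assumed name), 'Unmodified', 'Modified', and
    'Reads_aligned_all_amplicons'.
    """
    result = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Denominator is the total across all amplicons, not this row's own total.
        total = float(row["Reads_aligned_all_amplicons"])
        for category in ("Unmodified", "Modified"):
            count = float(row[category])
            result[(row["Amplicon"], category)] = 100.0 * count / total
    return result


# Illustrative input: 2786 and 3240 are from the thread, the rest is invented.
example = (
    "Amplicon\tReads_aligned_all_amplicons\tUnmodified\tModified\n"
    "Reference\t3240\t2786\t154\n"
    "Prime-edited\t3240\t250\t50\n"
)

percentages = percent_of_all_aligned(example)
print(round(percentages[("Reference", "Unmodified")], 2))  # 85.99
```

Because every count is divided by the same total, the four resulting percentages sum to 100%, matching the barplot and pie chart rather than the per-amplicon Unmodified% and Modified% columns.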

The tool CRISPRessoAggregate can be used to aggregate multiple runs and will produce a file for performing downstream statistical tests.

We don't plan on implementing any changes, but let us know if you need help with your downstream processing!