boutroslab / CRISPRAnalyzeR

CRISPRAnalyzeR: interactive analysis, annotation and documentation of pooled CRISPR screens
GNU General Public License v2.0

Default Values #26

Closed DarioS closed 6 years ago

DarioS commented 7 years ago

I am running the software from the version 1.20 beta Docker container. After clicking the GECKO V2 A+B button, only a small fraction of the extracted sequences align to guides in the library:

sgRNA Extraction Ratio: 99.35%
Not aligned: 99.67%
Aligned once: 0.33%

I think the cause of this problem is the default pattern CACC(.{20}). Note that it should instead be CACCG(.{20}) for GECKO V2 A+B, according to the specifications of the primer (i.e. TCTTGTGGAAAGGACGAAACACCG). Using the alternative pattern on the same sample, I find:

sgRNA Extraction Ratio: 84.97%
Not aligned: 31.15%
Aligned once: 67.31%

Alternatively, the G could be made optional with CACCG?(.{20}), which gives:

sgRNA Extraction Ratio: 100%
Not aligned: 35.29%
Aligned once: 64.71%
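
To illustrate why the extra G matters, here is a minimal sketch using Python's `re` module. It is not CRISPRAnalyzeR's own extraction code; the guide and the downstream flank are made-up sequences, and only the primer is taken from the specification quoted above. With the default pattern, the capture group starts one base too early, so it picks up the G plus the first 19 bases of the guide, which would not match any library entry.

```python
import re

# Primer from the GECKO V2 specification quoted above.
PRIMER = "TCTTGTGGAAAGGACGAAACACCG"
# Hypothetical 20 nt guide and downstream flank, for illustration only.
GUIDE = "ACGTTGCAAGCTTAGGCTAA"
FLANK = "GTTTTAGAGCTAGAAATAGC"

# A toy read: primer (ending in CACCG), guide, then flank.
read = PRIMER + GUIDE + FLANK
# The same toy read without the G between CACC and the guide.
read_without_g = PRIMER[:-1] + GUIDE + FLANK


def extract(pattern, sequence):
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, sequence)
    return match.group(1) if match else None


default_hit = extract(r"CACC(.{20})", read)      # shifted by one base
strict_hit = extract(r"CACCG(.{20})", read)      # the guide itself
optional_hit = extract(r"CACCG?(.{20})", read)   # the guide itself

# On a read lacking the G, the strict pattern extracts nothing,
# while the optional-G pattern still returns the guide.
strict_miss = extract(r"CACCG(.{20})", read_without_g)
optional_fallback = extract(r"CACCG?(.{20})", read_without_g)

print(default_hit == GUIDE, strict_hit == GUIDE, optional_hit == GUIDE)
```

In this toy case the strict pattern fails on a read that lacks the G, whereas the optional-G pattern always extracts something, which would be consistent with the different extraction ratios reported above. That interpretation is an assumption, not something verified against the tool.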

I think that the dataset which I'm using is representative of such experiments, but reads of such screens are never made publicly available, so I can't be certain.

Also, the Set Controls panel in the Set Analysis Parameters tab remains empty, although the "Gene Identifier for non-targeting control(s)" input box should have a default list (i.e. NonTargetingControlGuideForHuman_0001_1, ..., NonTargetingControlGuideForHuman_1000_1).

Lastly, the low count filtering panel has a description box "... we recommend to remove sgRNAs with a read count of less than 20." but the default value in the input box is 0. Perhaps also make the input's default 20 for consistency with the recommendation?
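
As a rough sketch of what the recommended setting would do, assuming the filter simply drops every sgRNA whose read count is below the threshold (the function name, the sgRNA identifiers and the counts below are hypothetical, and the tool's actual filtering may work differently, e.g. per sample):

```python
RECOMMENDED_MIN_READS = 20  # value from the description box
CURRENT_DEFAULT = 0         # value currently pre-filled in the input box


def filter_low_counts(counts, min_reads):
    """Keep only sgRNAs with at least `min_reads` reads."""
    return {sgrna: n for sgrna, n in counts.items() if n >= min_reads}


# Made-up read counts for four hypothetical sgRNAs.
counts = {"geneA_sg1": 0, "geneA_sg2": 19, "geneB_sg1": 20, "geneB_sg2": 153}

kept_recommended = filter_low_counts(counts, RECOMMENDED_MIN_READS)
kept_default = filter_low_counts(counts, CURRENT_DEFAULT)

print(sorted(kept_recommended))  # sgRNAs surviving a threshold of 20
print(sorted(kept_default))      # sgRNAs surviving a threshold of 0
```

With a threshold of 0 nothing is removed, so the default effectively disables the filter that the description recommends.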

jwinter6 commented 7 years ago

Hi Dario,

thanks for your feedback!

And again, you are completely right :) For some reason the controls included in the FASTA file are not set as default (which should be the case), so I will fix it with the next minor release 1.21.

The same applies to the default for the removal of low read counts.

Best,
Jan