boutroslab / CRISPRAnalyzeR

CRISPRAnalyzeR: interactive analysis, annotation and documentation of pooled CRISPR screens
GNU General Public License v2.0

Gene Symbols in Pre-made Library #24

Closed DarioS closed 7 years ago

DarioS commented 7 years ago

For GeCKO V2 libraries, some gRNAs hit multiple genes by design (i.e. they have different gRNA IDs but the same DNA sequence). Currently, only one gene symbol per DNA sequence is shown. For example:

>POTEE_TCATAGGACTGCTCTACATC
tcataggactgctctacatc

Is it feasible to instead represent it using all of the genes?

>POTEE,POTEF,POTEG,POTEH,POTEI,POTEJ,POTEM_TCATAGGACTGCTCTACATC
tcataggactgctctacatc

I'm not sure what kind of bias assigning each DNA sequence to the alphabetically first gene symbol might introduce or what effect it has on tasks like differential expression analysis. I also don't know why the inventors of GeCKO chose such an annoying experimental design.
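To make the request concrete, here is a minimal sketch of how entries sharing a DNA sequence could be collapsed into a single record listing all gene symbols. It assumes headers follow the `>GENE_SEQUENCE` pattern shown above; the function name `merge_shared_guides` and the `GENEX` record are hypothetical and not part of CRISPRAnalyzeR.

```python
def merge_shared_guides(fasta_text):
    """Collapse FASTA records with identical sequences into one record each.

    Assumes every header has the form '>GENE_SEQUENCE' and every record
    has a single sequence line (an assumption based on the example above).
    """
    genes_by_seq = {}  # sequence -> list of gene symbols, in first-seen order
    header = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
        elif header is not None:
            # Split on the last underscore so symbols containing '_' survive.
            gene = header.rsplit("_", 1)[0]
            seq = line.lower()
            genes = genes_by_seq.setdefault(seq, [])
            if gene not in genes:
                genes.append(gene)
            header = None

    records = []
    for seq, genes in genes_by_seq.items():
        # Sort symbols so the merged ID is deterministic.
        records.append(">{}_{}".format(",".join(sorted(genes)), seq.upper()))
        records.append(seq)
    return "\n".join(records)


# Illustrative input: two POTE records share a sequence; GENEX is made up.
example = (
    ">POTEF_TCATAGGACTGCTCTACATC\ntcataggactgctctacatc\n"
    ">POTEE_TCATAGGACTGCTCTACATC\ntcataggactgctctacatc\n"
    ">GENEX_AAAACCCCGGGGTTTTAAAA\naaaaccccggggttttaaaa\n"
)
merged = merge_shared_guides(example)
print(merged)
```

Each merged ID is still unique per sequence, but whether downstream steps accept comma-separated symbols in an ID is not something this sketch can answer.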

jwinter6 commented 7 years ago

Hi, sorry for getting back to you so late.

I am sorry, but this is not possible, as the analysis requires unique entries to be consistent and reproducible. Luckily, only the GeCKO V2 library contains such elements (to my knowledge). The only feasible solution is to keep this in mind when doing the analysis.

Best, Jan
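One way to "keep that in mind" in practice is to build a lookup of sequences that target more than one gene before interpreting hits. The following is only a sketch under assumptions: the input is a list of `(gene_symbol, sequence)` pairs taken from the original, unmerged library annotation, and `shared_guide_genes` is a hypothetical helper, not a CRISPRAnalyzeR function.

```python
from collections import defaultdict


def shared_guide_genes(guides):
    """Return {sequence: sorted gene symbols} for sequences shared by >1 gene.

    `guides` is an iterable of (gene_symbol, sequence) pairs; this input
    format is an assumption made for illustration.
    """
    genes_by_seq = defaultdict(set)
    for gene, seq in guides:
        genes_by_seq[seq.upper()].add(gene)
    return {
        seq: sorted(genes)
        for seq, genes in genes_by_seq.items()
        if len(genes) > 1
    }


# Illustrative pairs; GENEX and its sequence are made up.
guides = [
    ("POTEE", "tcataggactgctctacatc"),
    ("POTEF", "tcataggactgctctacatc"),
    ("GENEX", "aaaaccccggggttttaaaa"),
]
ambiguous = shared_guide_genes(guides)
print(ambiguous)
```

If a gene reported as a hit appears in this lookup, its result may reflect any of the genes sharing that sequence.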