sarahhcarl closed this issue 3 years ago
Dear Sarah,
It looks like you are using CLD as intended, although the cDNA mapping could be erroneous. I understand why the result is confusing. The "*" means that this sequence could not be mapped to the transcriptome, so it probably spans an exon/intron junction or is located in an intron. You get this report because CLD creates all possible PAMs that are encoded by NNGRRT, appends each of them to the target, and then matches all of the resulting sequences to the target genome. So, for one target sequence (one ID) you can get lots of hits with different PAMs (e.g. ACGTACGT AAGAAT or ACGTACGT GCGAAT). If those hits cannot be matched to the transcriptome, you get lots of "*" entries. The end-to-end function does not count those as off-targets, as they are rather mapping artifacts. I can have a look at whether I can improve CLD's reporting on that side, and I hope my answer sheds some light.
Best and Happy New Year, Florian
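For readers following the thread, the PAM expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not CLD's actual code: the function name `expand_pam` and the lookup table are made up for this example. It shows why NNGRRT yields many more candidate sequences per target than NGG.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_pam(pattern):
    """Return every concrete sequence encoded by a degenerate PAM pattern.

    Hypothetical helper for illustration; CLD's internals may differ.
    """
    choices = (IUPAC[code] for code in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]

saureus_pams = expand_pam("NNGRRT")
spyogenes_pams = expand_pam("NGG")

# 4 * 4 * 1 * 2 * 2 * 1 concrete PAMs versus 4 * 1 * 1
print(len(saureus_pams), len(spyogenes_pams))

# The PAMs mentioned in the thread are all members of the NNGRRT set.
print(all(pam in saureus_pams for pam in ("AAGAAT", "GCGAAT", "TGGAAT")))
```

With 64 concrete PAMs per protospacer instead of 4, each target ID is searched many more times, which would be consistent with the larger number of reported hits.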
Dear Florian,
Thanks for your reply, that definitely helps me understand what's going on. But just to clarify, if you look at the first 8 lines above, they all have the target sequence "TGCCCTTATTCAGGAGAGGC TGGAAT". However, what you're saying is that the off-target matches could have a PAM other than TGGAAT, right? Maybe it would be useful to add a column with the actual matched off-target sequence?
Additionally, I also got some "*" entries when I ran the query with "offtargetdb=genomeDNA" (although many fewer). Do you have any idea what those could be, as presumably they are not intronic or exon/intron junction hits?
Thanks again, Sarah
Hi,
Thanks for the work on this tool! I'm helping a collaborator design guide RNAs for a CRISPR screen in mouse, and so far it's been really useful. However, I'm having an issue when trying to design sgRNAs with a PAM sequence of NNGRRT (which corresponds to S. aureus). I'm running only the target_ident task, and it predicts many appropriate targets. However, it also predicts many more off-targets than for the standard NGG PAM, with all other parameters the same. Beyond that, when I examine the predicted off-targets, many of them have no genomic location (Match.Target = "*" and Match.Chromosome = "NA"). I am using the parameter "offtargetdb=cDNA", so I expect to get off-targets only within exons, if I understand it correctly.
Here is a look at the first 17 lines of my "all_results_together.tab" file, which appears to predict 2 real targets for the Xkr4 gene:
I am confused about how to interpret these predictions. Do you think this is a bug, or are all of these off-targets real? If they are real, why do they apparently have no location?
Thanks for your help, Sarah
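For anyone hitting the same output, one way to inspect it is to separate the rows with Match.Target = "*" from the rows that have a location. The following is a minimal sketch, assuming the results file is tab-separated with a header row containing a `Match.Target` column (the column name comes from the post above; the `Name` column and all sample values below are invented for the demonstration, and the function name `split_unmapped` is hypothetical).

```python
import csv
import io

def split_unmapped(handle):
    """Split tab-separated result rows into (mapped, unmapped) lists.

    A row counts as unmapped when its Match.Target field is "*".
    Assumes a header row with a "Match.Target" column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    mapped, unmapped = [], []
    for row in reader:
        if row["Match.Target"] == "*":
            unmapped.append(row)
        else:
            mapped.append(row)
    return mapped, unmapped

# Invented sample data, only to make the sketch runnable.
sample = io.StringIO(
    "Name\tMatch.Target\tMatch.Chromosome\n"
    "guide_1\tgeneA\tchr1\n"
    "guide_1\t*\tNA\n"
    "guide_1\t*\tNA\n"
    "guide_2\tgeneB\tchr1\n"
)

mapped, unmapped = split_unmapped(sample)
print(len(mapped), "mapped rows;", len(unmapped), "unmapped rows")
```

To run it on a real file, the `io.StringIO` object could be replaced with `open("all_results_together.tab", newline="")`, provided the file's header matches the assumption above.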