mckennalab / FlashFry

FlashFry: The rapid CRISPR target site characterization tool

Is random flanking sequence required? #28

Closed. Huanle closed this issue 2 years ago.

Huanle commented 2 years ago

Hi There,

What is the point of adding random flanking sequences of sgRNA when attempting to discover target and off-target sites?

java -Xmx4g -jar FlashFry-assembly-1.12.jar \
 discover \
 --database chr22_cas9ngg_database \
 --fasta EMX1_GAGTCCGAGCAGAAGAAGAAGGG.fasta \
 --output EMX1.output

Thanks a lot in advance.

aaronmck commented 2 years ago

If I understand your question correctly: we include the flanking sequence (which isn't random) with the targets because many on-target evaluation methods need the surrounding sequence context to produce a score. See https://www.nature.com/articles/nbt.3026 for an example.
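
To make the role of the flanking sequence concrete, here is a minimal Python sketch of pulling a scoring context out of an input sequence. It assumes the 30-nt window that, as far as I understand it, the on-target model in the linked paper uses (4 nt upstream, the 20-nt protospacer, the 3-nt PAM, and 3 nt downstream). The function name `scoring_context` and the flanking bases in the example are made up for illustration; this is not FlashFry's own code.

```python
def scoring_context(sequence, target, upstream=4, downstream=3):
    """Return the target plus its flanking bases, or None if the flank is too short.

    `target` is the 23-nt site (20-nt protospacer followed by the 3-nt PAM).
    The default window sizes are an assumption based on a 30-nt context.
    """
    sequence = sequence.upper()
    target = target.upper()
    start = sequence.find(target)
    if start == -1:
        raise ValueError("target not found in sequence")
    end = start + len(target)
    if start < upstream or len(sequence) - end < downstream:
        return None  # not enough flanking sequence for this window
    return sequence[start - upstream:end + downstream]


emx1 = "GAGTCCGAGCAGAAGAAGAAGGG"  # EMX1 target named in the FASTA file above

# Hypothetical flanking bases, chosen only to show the mechanics.
flanked = "ACGT" + emx1 + "TCA"

print(scoring_context(flanked, emx1))  # the full 30-nt window
print(scoring_context(emx1, emx1))     # None: a bare target carries no context
```

With only the bare 23-nt site as input, a context-dependent score has nothing to work from, which is why the input needs sequence on both sides of the target.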

Huanle commented 2 years ago

Thanks @aaronmck for your prompt reply.

I saw the description from the quick start tutorial.

Now we discover candidate targets and their potential off-target in the test data (takes a few seconds). Here we're using the EMX1 target with some random sequence flanking the target site:

java -Xmx4g -jar FlashFry-assembly-1.12.jar \
 discover \
 --database chr22_cas9ngg_database \
 --fasta EMX1_GAGTCCGAGCAGAAGAAGAAGGG.fasta \
 --output EMX1.output

This confused me, as a newcomer to this field :-). Thanks a lot for the explanation.

One more question regarding the flanking sequence: how many base pairs of flanking sequence should be included? Thanks again.

Huanle commented 2 years ago

Perhaps I did not make myself clear. I was referring to the quick start tutorial on the main page. What confuses me is that there is a flankingSequence option, which sets the length of the flanking sequence for a given target sequence. This made me think that the input should be just the 20 bp sgRNA sequence, and that the program would evaluate and score it using flanking sequence of the specified length. But this is not the case, am I right?

aaronmck commented 2 years ago

Totally, I'll update the docs. Thanks!
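
The distinction raised in this thread, between a bare guide sequence and a target embedded in flanking sequence, can be checked on an input before running anything. The sketch below scans a sequence for forward-strand sites of the form 20 nt + NGG and reports how many bases are available on each side. It is an illustration only: the function name `find_ngg_sites` is hypothetical, the flanking bases are invented, the reverse strand is ignored, and it says nothing about how FlashFry itself discovers targets.

```python
import re


def find_ngg_sites(sequence):
    """Return (start, site, left_flank, right_flank) for forward-strand 20-nt + NGG sites."""
    sequence = sequence.upper()
    sites = []
    # A lookahead is used so that overlapping sites are all reported.
    for match in re.finditer(r"(?=([ACGT]{21}GG))", sequence):
        start = match.start()
        site = match.group(1)
        right_flank = len(sequence) - (start + len(site))
        sites.append((start, site, start, right_flank))
    return sites


bare = "GAGTCCGAGCAGAAGAAGAAGGG"     # EMX1 site with no flank
flanked = "ACGT" + bare + "TCA"      # same site with made-up flanking bases

for name, seq in (("bare", bare), ("flanked", flanked)):
    for start, site, left, right in find_ngg_sites(seq):
        print(name, start, site, "left flank:", left, "right flank:", right)
```

Any site reported with a left or right flank shorter than what a given scoring method expects would need a longer input sequence.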