Matteo-Ciciani / PAMpredict

A python package to predict CRISPR-Cas PAMs

PAMpredict is sensitive to CRISPR input orientation #1

Closed: snayfach closed this issue 1 year ago

snayfach commented 1 year ago

Hi Matteo - thanks for the great tool!

I've applied PAMpredict, with its default parameters, to a large number of individual CRISPR arrays from Cas9 genes. I've noticed that PAMpredict is much more likely to detect a PAM when the CRISPR array is in the opposite orientation relative to Cas9 (bottom/top strand or top/bottom strand), and that switching the orientation of the CRISPR array changes the PAM identified. My understanding was that PAMpredict looks for a PAM in both orientations, so I was surprised by this behavior. It was also odd that the tool works better when the CRISPR array is input in the wrong orientation.

Happy to provide some examples if you have the bandwidth to help me out!

Matteo-Ciciani commented 1 year ago

Hi Stephen, thanks for the feedback. You are correct: PAMpredict looks for PAMs in both directions, so the array orientation shouldn't matter. Could you provide some examples of this behavior?

snayfach commented 1 year ago

Here's one example where no PAM is detected initially, but a clear downstream NGG PAM is detected after reverse complementing the spacers. There are other examples where the opposite pattern is found. In both cases, though, the orientation that works is the one where the array is input in the opposite orientation relative to Cas9.

AAANUU010000033.1_2.fna.zip AAANUU010000033.1_2_revcomp.fna.zip

And here's one more like that:

BackhedF_2015__SID78_4M_CRISPR_NODE_181_length_69004_cov_66.5387_20791.fna.zip BackhedF_2015__SID78_4M_CRISPR_NODE_181_length_69004_cov_66.5387_20791_revcomp.fna.zip
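For anyone wanting to repeat this kind of orientation test on their own data, here is a minimal sketch of how a reverse-complemented spacer file could be produced. It is not taken from PAMpredict or from the attached files: the function names are hypothetical, and it assumes each spacer is a separate FASTA record that is reverse complemented individually, with the record order left unchanged.

```python
# Hypothetical helper for building a "_revcomp" spacer FASTA; not part of PAMpredict.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def revcomp_fasta(text):
    """Reverse complement every record, keeping headers and record order."""
    return "".join(
        f">{header}\n{reverse_complement(seq)}\n"
        for header, seq in parse_fasta(text)
    )


if __name__ == "__main__":
    # Toy spacers, made up for illustration only.
    print(revcomp_fasta(">spacer_1\nAACG\n>spacer_2\nGGT\n"), end="")
```

Running the tool on both the original and the converted file, and comparing the reported PAMs, is the check described in this comment. Applying `revcomp_fasta` twice returns the original sequences, which is a quick sanity check on the conversion itself.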

Matteo-Ciciani commented 1 year ago

Hi Stephen, it should be fixed now. The code used upstream in one place where it should have used downstream, which messed up predictions in only one direction.

snayfach commented 1 year ago

The results in downstream_flanking_sequence_info.tsv for the forward and reverse orientations are now very similar, but not exactly the same.

If you take a look at the info content of the first position:

0 0.0890482633619067 0.012721180480272385 0.05088472192108954 0.0890482633619067

And compare that to the reverse complement:

0 0.0890482633619067 0.0890482633619067 0.05088472192108954 0.012721180480272385

The spacer alignment stats are identical, so it's not due to alignments. And if you rerun the program the results are identical, so it's not due to a random process.
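The two rows quoted above can be compared directly. The sketch below uses only the four values that follow the position index in each row; the helper names are hypothetical, and what each column represents is not stated in the thread, so the code makes no assumption about it. It shows which columns differ and that the two rows contain the same values in a different order. It does not say anything about the cause.

```python
import math

# Values copied from the two rows above, without the leading position index.
forward = [0.0890482633619067, 0.012721180480272385,
           0.05088472192108954, 0.0890482633619067]
revcomp = [0.0890482633619067, 0.0890482633619067,
           0.05088472192108954, 0.012721180480272385]


def differing_columns(a, b, rel_tol=1e-9):
    """Return the 0-based indices where the two rows are not close."""
    return [
        i for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, rel_tol=rel_tol)
    ]


def same_values_any_order(a, b, rel_tol=1e-9):
    """True if both rows hold the same values once each row is sorted."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol)
        for x, y in zip(sorted(a), sorted(b))
    )


if __name__ == "__main__":
    print(differing_columns(forward, revcomp))      # [1, 3]
    print(same_values_any_order(forward, revcomp))  # True
```

So the second and fourth value columns are swapped between the two runs, while the first and third are unchanged.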

Do you have an idea what's causing the difference?

Matteo-Ciciani commented 1 year ago

Hi Stephen, I couldn't reproduce exactly what you observed, but the issue should be fixed now. The results should be almost identical, up to floating point precision. If it's not fixed, could you send me the blastn/blastout.tsv file that is generated with the --keep_tmp option?
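A check for "almost identical, up to floating point precision" could look like the following sketch. It is an illustration, not PAMpredict code: the function names are hypothetical, the tolerances are arbitrary choices, and the parser assumes only that the file consists of whitespace-separated numeric rows (any header or non-numeric line is skipped), since the thread shows just a single row of downstream_flanking_sequence_info.tsv.

```python
import math


def parse_numeric_rows(tsv_text):
    """Parse whitespace-separated numeric rows; skip blank and non-numeric lines."""
    rows = []
    for line in tsv_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            rows.append([float(f) for f in fields])
        except ValueError:
            continue  # assumed to be a header or annotation line
    return rows


def tables_match(a_text, b_text, rel_tol=1e-9, abs_tol=1e-12):
    """True if both tables have the same shape and all values are close."""
    a, b = parse_numeric_rows(a_text), parse_numeric_rows(b_text)
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            return False
        if not all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(row_a, row_b)
        ):
            return False
    return True


if __name__ == "__main__":
    # Toy tables, made up for illustration only.
    fwd = "pos\tv1\tv2\n0\t0.1\t0.3\n"
    rev = "pos\tv1\tv2\n0\t0.10000000000000002\t0.3\n"
    print(tables_match(fwd, rev))
```

In practice the two texts would be read from the output files of the forward and reverse-complement runs. Exact string comparison of the files would flag differences in the last decimal places that a tolerance-based comparison ignores.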

snayfach commented 1 year ago

Perfect - fixed now. Thanks!