crisprVerse / crisprDesign

Comprehensive design of CRISPR gRNAs for nucleases and base editors
MIT License

`max_mm` vs `n_mismatches` #17

Closed: MatthewPace98 closed this issue 1 year ago

MatthewPace98 commented 1 year ago

You can specify the number of mismatches for `addSpacerAlignments` using `n_mismatches`, and you can do the same for `addOffTargetScores` using `max_mm`. Should the values of `max_mm` and `n_mismatches` ever differ?

MatthewPace98 commented 1 year ago

Closing this to clean up the issues section.

To clarify, while both variables refer to a maximum number of mismatches, they apply at different stages. `n_mismatches` is used by the aligner as the threshold for a valid alignment between the reference genome and the spacers. `max_mm` is used when calculating off-target scores, to set the maximum number of mismatches between each off-target and the wildtype sequence.

If I understood correctly (@Jfortin1 please correct me if I am wrong), with `n_mismatches = 4` and `max_mm = 3`, for example, off-target sequences with 4 mismatches are found by the aligner but filtered out at the scoring phase. Similarly, `n_mismatches = 3` and `max_mm = 4` gains nothing, since off-targets with 4 mismatches are already excluded at the alignment stage. Either way, only off-targets with up to 3 mismatches are scored.
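To illustrate the two-stage filtering, here is a toy Python model. It is not crisprDesign code: `HITS`, `align`, and `scored` are hypothetical names, and each off-target is reduced to its mismatch count. It assumes the alignment stage keeps hits with at most `n_mismatches` mismatches and the scoring stage then keeps those with at most `max_mm`, so the effective cutoff is the smaller of the two values.

```python
# Toy model of the two filters; not crisprDesign's implementation.
# Each hypothetical off-target is represented only by its mismatch count.
HITS = [0, 1, 2, 3, 3, 4, 4, 5]


def align(hits, n_mismatches):
    """Alignment stage: keep hits with at most n_mismatches mismatches."""
    return [mm for mm in hits if mm <= n_mismatches]


def scored(hits, n_mismatches, max_mm):
    """Scoring stage: of the aligned hits, keep those with at most max_mm."""
    return [mm for mm in align(hits, n_mismatches) if mm <= max_mm]


# Both settings score the same set: the cutoff is min(n_mismatches, max_mm).
print(scored(HITS, n_mismatches=4, max_mm=3))  # [0, 1, 2, 3, 3]
print(scored(HITS, n_mismatches=3, max_mm=4))  # [0, 1, 2, 3, 3]
```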

The only scenario I can think of where the two values should differ is setting `n_mismatches` to the higher value, so that the alignment stage finds and stores more potential off-target sequences. You can then use `max_mm` to restrict the off-target score calculation to a smaller subset of those sequences. This could be useful if you want to assess off-target effects at different mismatch thresholds without rerunning the entire alignment for each value.
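A rough Python sketch of that "align once, score several times" workflow follows. Again, this is a toy model rather than the package's API: `aligned` stands in for the stored result of a single alignment run with `n_mismatches = 4`, and `n_considered` is a hypothetical helper that only counts how many stored off-targets a scoring pass with a given `max_mm` would use.

```python
# Hypothetical alignment output: the mismatch count of each stored off-target,
# as if alignment had been run once with n_mismatches = 4.
aligned = [1, 2, 2, 3, 3, 3, 4, 4]


def n_considered(aligned_hits, max_mm):
    """Count the stored off-targets a scoring pass with this max_mm would use."""
    return sum(1 for mm in aligned_hits if mm <= max_mm)


# Reuse the same alignment for several scoring thresholds.
summary = {max_mm: n_considered(aligned, max_mm) for max_mm in range(1, 5)}
print(summary)  # {1: 1, 2: 3, 3: 6, 4: 8}
```

Only the cheap scoring step is repeated here; the expensive alignment is done once, at the largest threshold of interest.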

Jfortin1 commented 1 year ago

@MatthewPace98 Yes, you got it right. We'll add more information to the documentation to help with this.