YichaoOU / easy_prime

prime editor gRNA design tool based on ensemble learning
MIT License

Discrepancy between easy_prime web output and cli output #5

Open marcus-r-kelly opened 1 year ago

marcus-r-kelly commented 1 year ago

Thank you for this excellent tool!

I am wondering about a discrepancy I find between the webtool and a local install. Specifically, submitting the following variant:

produces different efficiency estimates than the tool as installed by conda following the directions. The best solution is the same in both cases (shown in an image in the original post), but the locally installed tool's efficiency estimate is 0.47. Is the command-line value given out of 1.0 instead of 100%? Are there other reasons these should be so different?

YichaoOU commented 1 year ago

Thanks!

Yes, the predicted_efficiency in topX_pegRNAs.csv is from 0 to 1. If you multiply it by 100, you have the percentage.
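As an illustration of the rescaling described above, here is a minimal Python sketch that reads the `predicted_efficiency` column from CSV text and converts each value to a percentage. The function name `efficiencies_as_percent` is hypothetical, and the sketch assumes only that the column exists under that name; the other columns of topX_pegRNAs.csv are not shown here.

```python
import csv
import io


def efficiencies_as_percent(csv_text):
    """Return predicted_efficiency values (0 to 1) rescaled to percentages.

    Hypothetical helper: assumes the CSV has a 'predicted_efficiency' column,
    as named in the comment above. All other columns are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row["predicted_efficiency"]) * 100 for row in reader]


# Toy input containing only the column of interest, using the value
# reported in this thread for the local install.
example = "predicted_efficiency\n0.47004926204681396\n"
for pct in efficiencies_as_percent(example):
    print(f"{pct:.2f}%")
```

To use it on a real output file, the file contents could be passed in with something like `efficiencies_as_percent(open("topX_pegRNAs.csv").read())`.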

I'm not sure why, even though they gave the same pegRNA, the predicted efficiency is different. When I used the local version, I also got 0.47004926204681396.

One thing I want to mention is that, for the vcf format, I would assume there is no overlap between ref and alt. Right now, your ref is CCACCA and your alt is CCAACG; I think it should be CCA and ACG. The leading CCA should be removed because these positions are unchanged, and the Position then becomes 25398281+3.
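The trimming described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not easy_prime's own preprocessing: the function name `trim_shared_bases` is hypothetical, it trims shared trailing bases as well as shared leading ones, and it always keeps at least one base in both ref and alt.

```python
def trim_shared_bases(pos, ref, alt):
    """Remove bases shared by ref and alt at either end of a variant.

    Hypothetical helper, not part of easy_prime. Keeps at least one base
    in each allele and shifts pos by the number of leading bases removed.
    """
    # Trim shared trailing bases (position is unaffected).
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases, advancing the position each time.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The variant discussed in this thread.
print(trim_shared_bases(25398281, "CCACCA", "CCAACG"))
# -> (25398284, 'CCA', 'ACG')
```

Because one base is always kept, an insertion or deletion written with an anchor base retains that anchor, while a pure substitution such as the one here is reduced to only the changed positions.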

marcus-r-kelly commented 1 year ago

> One thing I want to mention is that, for the vcf format, I would assume there is no overlap between ref and alt. Right now, your ref is CCACCA and your alt is CCAACG, I think it should CCA and ACG. The beginning CCA should be removed because these positions are unchanged. And then the Position becomes 25398281+3.

I have sanitized my variants, and this fixed an entirely unrelated issue where command-line easy_prime would produce pegRNAs encoding indels not described in the input variants. However, it did not change the discrepancy between the webtool and the command-line utility.