refresh-bio / SPLASH


Call SNPs from thousands of individuals #18

Closed gorliver closed 6 months ago

gorliver commented 7 months ago

The concept is quite attractive. Is it possible to call SNPs from a large population? If so, would you mind sharing the pipeline?

roozbehdn commented 7 months ago

Hi gorliver,

Yes, you can run SPLASH on datasets of any size to obtain SNPs. One potential approach for identifying SNPs is to run the SPLASH_extendor_classification.R script from the repository and then look at the anchors classified as Base_pair_change_. These are anchors whose target sequence diversity could be explained by single base pair changes (i.e., SNPs or mutations). The number after the _ gives the number of base pairs that differ between the top two targets of the anchor. Hope this helps.
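As a rough illustration of the filtering step described above, here is a minimal Python sketch. It is not part of the SPLASH repository. The only thing taken from this thread is the `Base_pair_change_` label prefix followed by a number. The tab-separated layout, the column names (`anchor`, `target_1`, `target_2`, `anchor_event`), the `splicing` label, and the example sequences are all hypothetical placeholders, so check them against the actual output of SPLASH_extendor_classification.R before reusing anything.

```python
import csv
import io
import re

# The "Base_pair_change_<N>" label comes from the reply above.
# Column names and file layout below are ASSUMPTIONS for illustration only.
LABEL_RE = re.compile(r"^Base_pair_change_(\d+)$")


def parse_base_pair_change(label):
    """Return N from 'Base_pair_change_N', or None for any other label."""
    match = LABEL_RE.match(label.strip())
    return int(match.group(1)) if match else None


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def select_snp_anchors(rows, max_changes=1, label_column="anchor_event"):
    """Keep rows labelled Base_pair_change_N with 1 <= N <= max_changes.

    `label_column` is a hypothetical column name.
    """
    selected = []
    for row in rows:
        n = parse_base_pair_change(row.get(label_column, ""))
        if n is not None and 1 <= n <= max_changes:
            selected.append(row)
    return selected


# Made-up example table standing in for the classification output.
EXAMPLE_TSV = (
    "anchor\ttarget_1\ttarget_2\tanchor_event\n"
    "AAACCC\tACGTACGT\tACGAACGT\tBase_pair_change_1\n"
    "GGGTTT\tACGTACGT\tTCGAACGT\tBase_pair_change_2\n"
    "CCCAAA\tACGTACGT\tGGGGCCCC\tsplicing\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
snp_rows = select_snp_anchors(rows, max_changes=1)

for row in snp_rows:
    # Recompute the distance between the top two targets as a sanity check.
    print(row["anchor"], hamming(row["target_1"], row["target_2"]))
```

Setting `max_changes=1` keeps only anchors whose top two targets differ at a single position. Raising it also admits anchors with several nearby changes, which may or may not be what you want for SNP calling.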

gorliver commented 6 months ago

Sounds great! I will definitely try it.