JiaoLab2021 / SynDiv

A tool for quick and accurate calculation of syntenic diversity.
MIT License

Population Syn_FST #3

Open tongyin121 opened 3 weeks ago

tongyin121 commented 3 weeks ago

I am curious about the process of calculating Syn_FST. The .cal files are generated from syri.out and .align files. However, how can we obtain a .cal file from second-generation sequencing data?

JiaB-He commented 3 weeks ago

Currently, synteny diversity and Syn-FST can only be calculated using genomic data. Syn-FST is derived from synteny diversity. The most effective tool for identifying syntenic and non-syntenic regions is SyRI, which relies on genome alignment. However, no tool currently exists that can identify syntenic regions using third-generation or second-generation sequencing data. In the future, we plan to incorporate sequencing data to calculate synteny diversity.

tongyin121 commented 3 weeks ago

Thank you for your reply. It resolved my confusion.

tongyin121 commented 3 weeks ago

Another question: how can I get SynFst values in windows (maybe 10 kb)? I think I can reconstruct the SynFst file and then use the SynDiv_c window command to get what I want. However, I am confused about the States columns in the .cal file. Could you please tell me whether this method is correct, and what the States columns mean?

duzezhen commented 2 weeks ago

The All_States column represents the total number of pairings in the population, and Syntenic_States represents the number of syntenic pairings. For example, if there are three genomes A, B, and C, the possible pairings would be AB, AC, and BC. If only AB is syntenic, then All_States would be 3, and Syntenic_States would be 1. Syntenic_Diversity is then calculated as 1 - Syntenic_States / All_States = 1 - 1/3 = 2/3.
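
The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the counting described in this comment, not SynDiv's actual implementation; the function name `syntenic_diversity` and its arguments are hypothetical, and it assumes the general form is 1 - Syntenic_States / All_States, as the three-genome example suggests.

```python
from itertools import combinations


def syntenic_diversity(genomes, syntenic_pairs):
    """Count pairings at one position and derive the diversity value.

    genomes: names of the genomes in the population (hypothetical input).
    syntenic_pairs: pairs of genome names that are syntenic at this position.
    Assumes diversity = 1 - Syntenic_States / All_States.
    """
    if len(set(genomes)) < 2:
        raise ValueError("at least two distinct genomes are needed to form a pairing")

    # All_States: every unordered pairing of genomes, e.g. AB, AC, BC.
    all_pairs = list(combinations(sorted(set(genomes)), 2))
    all_states = len(all_pairs)

    # Syntenic_States: pairings reported as syntenic (order within a pair is ignored).
    syntenic = {frozenset(pair) for pair in syntenic_pairs}
    syntenic_states = sum(1 for pair in all_pairs if frozenset(pair) in syntenic)

    diversity = 1 - syntenic_states / all_states
    return all_states, syntenic_states, diversity


# The example from this thread: genomes A, B, C, with only AB syntenic.
all_states, syntenic_states, diversity = syntenic_diversity(
    ["A", "B", "C"], [("A", "B")]
)
print(all_states, syntenic_states, round(diversity, 4))  # prints: 3 1 0.6667
```

With three genomes and only AB syntenic, this gives All_States = 3, Syntenic_States = 1, and a diversity of 2/3, matching the values described above.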