philarevalo / PopCOGenT

Microbial Populations as Clusters Of Gene Transfer
GNU General Public License v3.0

Mapping of positions in the output core gene sweep #34

Open Xiaojun928 opened 1 year ago

Xiaojun928 commented 1 year ago

Hi,

I want to extract sequences based on the output in *.core_sweeps.csv, which provides start and end positions. The README says:

*.core_sweeps.csv: The positions (in the coordinates of the whole genome alignment) of core genome sweeps.

I guess align/*.core.fasta is not the alignment you meant, since it is concatenated, contains only the core genome, and has gaps. The align/*maf file seems to be the whole genome alignment, but it is not concatenated, which makes it hard to map the positions. So I was wondering whether the positions are in the coordinates of the reference genome, which is given as ref_iso in phybreak_parameters.txt.
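
To make the question concrete, here is a minimal sketch of the kind of mapping involved, under the assumption (not confirmed by the README) that the reported positions are alignment columns and that the gapped row of the reference genome is available. The function names, the half-open 0-based interval convention, and the `block_start` parameter are all hypothetical and are not part of PopCOGenT.

```python
def ungapped_offsets(aligned_seq, gap_char="-"):
    """Return a prefix-count list: offsets[i] is the number of non-gap
    characters in aligned_seq[:i]."""
    offsets = [0]
    for base in aligned_seq:
        offsets.append(offsets[-1] + (base != gap_char))
    return offsets


def map_interval(aligned_seq, start, end, block_start=0, gap_char="-"):
    """Map a half-open, 0-based column interval [start, end) of a gapped
    row to a half-open, 0-based interval in the ungapped sequence.

    block_start is a hypothetical offset of the row's first base within
    the full genome (0 if the row covers the genome from its start).
    An interval that spans only gaps maps to an empty interval.
    """
    if not 0 <= start <= end <= len(aligned_seq):
        raise ValueError("interval lies outside the alignment")
    offsets = ungapped_offsets(aligned_seq, gap_char)
    return block_start + offsets[start], block_start + offsets[end]


# Toy reference row with gaps; ungapped it reads "ACGTA".
ref_row = "AC--GT-A"
print(map_interval(ref_row, 1, 6))
```

If the positions in *.core_sweeps.csv turn out to be 1-based or inclusive, the interval would need to be shifted before calling `map_interval`.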

Thanks in advance!

Best, Xiaojun