szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

input data type #119

Open timurrxxcd opened 1 week ago

timurrxxcd commented 1 week ago

Hello, I know this might seem like a stupid and simple question, but I need to ask. I want to calculate iHS, and for that I have VCF files generated by GATK. I used both a haploid and a diploid genome assembly to generate these VCF files. Which VCF file should be used for calculating iHS, and which one must be phased beforehand: the one derived from the haploid genome or the one derived from the diploid genome? Must iHS be calculated chromosome by chromosome? That is, if I have 17 chromosomes (the haploid set), do I need to run selscan for each chromosome separately? Also, if I use Beagle to phase and impute a VCF file generated from the haploid genome, can the result be treated as phased data? Finally, could you please provide all the example files needed for this analysis? I will prepare my data accordingly.

szpiech commented 1 week ago

Hi,

OK, so selscan takes diploid VCF files. They can be phased or unphased (for unphased data, use the --unphased flag). You would run each diploid chromosome through selscan separately and then jointly normalize the results using the program norm.
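
As a rough illustration of the per-chromosome workflow described above, here is a minimal Python sketch that only builds the command lines: one selscan run per chromosome, then a single norm run over all outputs. The helper name `selscan_ihs_commands`, the file names, and the output directory are hypothetical. The use of --pmap (physical positions in place of a genetic map) and the `<prefix>.ihs.out` output naming are assumptions to check against `selscan --help` and `norm --help` for your installed version, especially for unphased output.

```python
from pathlib import Path


def selscan_ihs_commands(vcf_by_chrom, out_dir="ihs_out", unphased=False):
    """Build one selscan iHS command per chromosome plus one joint norm command.

    vcf_by_chrom maps a chromosome label to its single-chromosome VCF path.
    Nothing is executed here; the lists can be passed to subprocess.run().
    """
    out_dir = Path(out_dir)
    selscan_cmds = []
    result_files = []
    for chrom, vcf in vcf_by_chrom.items():
        prefix = out_dir / chrom
        # Assumption: --pmap is used because no genetic map file is available.
        cmd = ["selscan", "--ihs", "--vcf", str(vcf), "--pmap", "--out", str(prefix)]
        if unphased:
            cmd.append("--unphased")
        selscan_cmds.append(cmd)
        # Assumption: selscan writes iHS results to <prefix>.ihs.out.
        result_files.append(f"{prefix}.ihs.out")
    # Joint normalization: all chromosomes go into a single norm call.
    norm_cmd = ["norm", "--ihs", "--files", *result_files]
    return selscan_cmds, norm_cmd


if __name__ == "__main__":
    # Hypothetical input files, one VCF per chromosome.
    vcfs = {f"chr{i}": f"chr{i}.vcf" for i in range(1, 18)}
    runs, norm = selscan_ihs_commands(vcfs, unphased=True)
    for cmd in runs:
        print(" ".join(cmd))
    print(" ".join(norm))
```

The point of the single norm call is that normalization uses all chromosomes together rather than each one in isolation.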

If your organism is haploid, you can hack it a bit to get it to work with selscan. You would coerce the data into diploid VCF format, where every “diploid” genotype is either 0/0 or 1/1, and then use the --unphased flag, which collapses the genotypes back down.
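
The coercion step could look something like the following minimal Python sketch, which is not part of selscan. It assumes an uncompressed, tab-delimited, biallelic VCF in which GT is the first FORMAT key and haploid calls appear as a single allele (for example `0`, `1`, or `.`). The function names and the file names in the usage section are hypothetical.

```python
def diploidize_gt(sample_field):
    """Turn a haploid call such as '1:35' into a homozygous diploid '1/1:35'."""
    gt, sep, rest = sample_field.partition(":")
    if "/" in gt or "|" in gt:
        return sample_field  # already written as a diploid genotype
    # Duplicate the single allele; a missing call '.' becomes './.'.
    return f"{gt}/{gt}{sep}{rest}"


def diploidize_vcf_line(line):
    """Rewrite the sample columns of one VCF data line; headers pass through."""
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    # Columns 0-8 are CHROM..FORMAT; sample columns start at index 9.
    fields[9:] = [diploidize_gt(f) for f in fields[9:]]
    return "\t".join(fields) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    with open("haploid.vcf") as src, open("pseudo_diploid.vcf", "w") as dst:
        for line in src:
            dst.write(diploidize_vcf_line(line))
```

Because every resulting genotype is homozygous, the output carries no phase information, which is why it is meant to be run with --unphased.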

Hope this helps,

Zachary
