Open ciselkemahli opened 5 months ago
Hi Cisel,
I don't know what the maximum number of SNPs is off the top of my head, but I do know that the model itself does not account for physical linkage, so once you have more than one SNP per chromosome, the model starts being violated. If you use thousands of SNPs, the estimates of uncertainty around the hybrid categories will be greatly deflated, because most of the SNPs are co-inherited with others on the same chromosome and the model does not account for that.
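To make the deflation concrete, here is a toy illustration in Python. It is not the NewHybrids model, only a sketch of the general effect: a single locus that weakly favors category A over category B (the likelihoods 0.6 and 0.4 are made-up numbers) is counted several times as if each copy were an independent marker, which is roughly what happens with tightly linked SNPs.

```python
def posterior_a(lik_a, lik_b, copies):
    """Posterior probability of category A versus B under equal priors,
    when one locus's likelihoods are multiplied in `copies` times
    as though the copies were independent markers."""
    a = lik_a ** copies
    b = lik_b ** copies
    return a / (a + b)


# Hypothetical per-locus likelihoods: the data only mildly favor A.
for copies in (1, 10, 100):
    print(copies, round(posterior_a(0.6, 0.4, copies), 4))
```

The evidence never changes, yet the posterior moves toward certainty as the number of copies grows. That is the sense in which the uncertainty is deflated.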
If you have a genome's worth of data, I would recommend finding SNPs that are most differentiated between the two parental species, and then from amongst those, choosing no more than 3 or 4 per chromosome, preferably on different arms or ends.
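A minimal sketch of that selection step in Python, assuming you already have a per-SNP differentiation score (here `delta`, the absolute allele frequency difference between the two parental groups). The function name, the record layout, and the `min_gap_bp` spacing rule are assumptions for illustration; the spacing threshold is only a crude stand-in for "different arms or ends" and should be set from your species' chromosome lengths.

```python
from collections import defaultdict


def thin_snps(snps, max_per_chrom=4, min_gap_bp=20_000_000):
    """Pick a few well-separated, highly differentiated SNPs per chromosome.

    snps: iterable of (chrom, pos, delta) tuples.
    Within each chromosome, SNPs are visited from highest to lowest delta,
    and a SNP is kept only if it lies at least min_gap_bp away from every
    SNP already kept on that chromosome.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, delta in snps:
        by_chrom[chrom].append((pos, delta))

    chosen = []
    for chrom, items in by_chrom.items():
        kept_positions = []
        for pos, delta in sorted(items, key=lambda item: -item[1]):
            if len(kept_positions) == max_per_chrom:
                break
            if all(abs(pos - kept) >= min_gap_bp for kept in kept_positions):
                kept_positions.append(pos)
                chosen.append((chrom, pos, delta))
    return sorted(chosen)


# Made-up example records, not real data.
example = [
    ("chr1", 1_000_000, 0.95),
    ("chr1", 1_500_000, 0.93),   # too close to the first SNP, skipped
    ("chr1", 60_000_000, 0.80),
    ("chr2", 5_000_000, 0.70),
]
selected = thin_snps(example, max_per_chrom=2)
```

The greedy rule favors the most differentiated SNP in each region over even spacing; if even coverage matters more, you could instead split each chromosome into a few windows and take the best SNP from each.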
How differentiated are the species you are dealing with?
Cheers,
eric
On Wed, May 29, 2024 at 1:08 AM Çisel Kemahlı wrote:
Hello, I wonder how many SNPs the NewHybrids program can handle. I installed it on the university server and it is running with ~180GB of memory. Do you have any suggestions for an optimal analysis? Thank you.
Dear Eric,
Thank you so much for the quick response. As you explained, I should reduce my SNP set to a few hundred, because linked SNPs deflate the uncertainty estimates.
I am trying to identify hybrid wolves using WGS data. I do not have dog genomes from the same region, so I have only wolf genomes to determine F1, F2 generations, and backcrosses. As you suggested, I will thin the SNPs to account for linkage. My problem is identifying hybrid individuals without prior information about one of the parental populations. That's why I tried to run this analysis with as many SNPs as possible. Thank you.
Best wishes.
Çisel
Hi Çisel,
I see. In that case, you might want to do a first run with ADMIXTURE and K=2 and the wolves as individuals known to be from one source population. Then use the estimated allele frequencies from the two subpopulations to identify a small subset of loci that you could then use to identify F1, F2, and backcrosses, etc.
NewHybrids does not mix terribly well with many markers. The Structure/ADMIXTURE model tends to produce better mixing.
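A rough Python sketch of the ranking step after such an ADMIXTURE run, assuming (as I understand the output format) that the `.P` file holds one row per SNP with one allele frequency column per inferred population, in the same SNP order as the PLINK `.bim` file. The function name is hypothetical, and the inline lines stand in for file contents; they are made up, not real output.

```python
def rank_by_frequency_difference(p_lines, bim_lines):
    """Rank SNPs by the allele frequency difference between the K=2 clusters.

    p_lines:   lines of the ADMIXTURE .P file (two frequencies per line)
    bim_lines: lines of the matching PLINK .bim file
               (chrom, SNP id, cM, bp position, allele 1, allele 2)
    Returns (chrom, pos, snp_id, delta) tuples, most differentiated first.
    """
    ranked = []
    for p_line, bim_line in zip(p_lines, bim_lines):
        p1, p2 = (float(x) for x in p_line.split()[:2])
        chrom, snp_id, _cm, pos = bim_line.split()[:4]
        ranked.append((chrom, int(pos), snp_id, abs(p1 - p2)))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked


# Made-up stand-ins for the contents of the .P and .bim files.
p_lines = ["0.98 0.05", "0.50 0.45", "0.10 0.90"]
bim_lines = ["1 rs1 0 1000 A G", "1 rs2 0 2000 C T", "2 rs3 0 500 G A"]
ranked = rank_by_frequency_difference(p_lines, bim_lines)
```

With real files you would pass the open file handles instead of the lists. The top of the ranked list is then the candidate pool from which to keep only a few SNPs per chromosome before running NewHybrids.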
I hope that is helpful. Cheers,
eric