jackzhong1995 closed this issue 2 months ago
Dear Jie
Thanks for pointing this out to me! This was an error during the liftover of the archaic genomes from hg19 to hg38. I have updated all the bcf files in the Zenodo repository, so the issue should now be solved! The new Zenodo repository is "https://zenodo.org/records/13368126".
I don't think it will matter much for your analysis, as less than 0.01% of sites were lifted over to the wrong chromosome!
From: LauritsSkov | Sent: Saturday, 24 August 2024, 06:09 | Subject: Re: [LauritsSkov/Introgression-detection] vcf file of chr2 but hava chr1 site (Issue #8)
Hi Jie
You are using hmmix to look for archaic introgression in an African sample? So you are removing all SNPs in this individual that are found in other African genomes? What are your trained parameters? It could be that the two states do not actually correspond to a Neanderthal state and a human state; instead, the model may be overfitting and simply splitting the human state into two states. If you want to find archaic introgressed segments in Africans, I think software like IBD-mix would be better suited for this!
Best Laurits
On Tue, 27 Aug 2024 at 01:47, jackzhong1995 < @.***> wrote:
Hi Skov! Thanks for your rapid reply. I am confused about using hmmix to analyze AFR samples. After decoding (my data is unphased), the first fragment is very large (~16 Mb, although its mean_prob is above 0.9), and in total I detected 1079.6 Mb of fragments in sample NA20412, which comes from Africa. This is clearly unusual. By contrast, the fragments I detected in samples from other continents are within the normal range (~80 Mb in total per sample). What do you think is causing this? Should there be a threshold on the length of each fragment, such as 500 kb, 1 Mb or 1.5 Mb (I noticed "SI Figure 2.6.1 Length distribution of all fragments" in your Nature paper)? Best wishes, Jie.
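The per-sample totals mentioned above can be checked with a small script. The following is only a sketch: it assumes the decoded output is a tab-separated table with the columns `chrom`, `start`, `end`, `length`, `state`, `mean_prob` and `snps`, and that the archaic state is named `Archaic`. The function name, the 0.9 probability cut-off and the 1 Mb "long fragment" limit are illustrative choices, not values recommended in this thread, and the rows in `example` are made up.

```python
import csv
import io


def summarize_archaic(decoded_text, min_prob=0.9, long_bp=1_000_000):
    """Sum the length of confident archaic fragments and list unusually long ones.

    decoded_text: content of a decoded table (assumed tab-separated, with header).
    min_prob:     keep fragments whose mean_prob is at least this value.
    long_bp:      fragments longer than this many bp are reported separately.
    """
    reader = csv.DictReader(io.StringIO(decoded_text), delimiter="\t")
    total_bp = 0
    long_fragments = []
    for row in reader:
        if row["state"] != "Archaic":
            continue
        if float(row["mean_prob"]) < min_prob:
            continue
        length = int(row["length"])
        total_bp += length
        if length > long_bp:
            long_fragments.append(
                (row["chrom"], int(row["start"]), int(row["end"]), length)
            )
    return total_bp, long_fragments


# Made-up rows in the assumed column layout, for illustration only.
example = "\n".join([
    "chrom\tstart\tend\tlength\tstate\tmean_prob\tsnps",
    "1\t0\t2988000\t2988000\tHuman\t0.98\t91",
    "1\t2988000\t2997000\t9000\tArchaic\t0.95\t6",
    "1\t2997000\t19000000\t16003000\tArchaic\t0.93\t400",
    "2\t500000\t560000\t60000\tArchaic\t0.60\t5",
])

total_bp, long_fragments = summarize_archaic(example)
print(f"confident archaic total: {total_bp / 1e6:.3f} Mb")
for chrom, start, end, length in long_fragments:
    print(f"long fragment: chr{chrom}:{start}-{end} ({length / 1e6:.3f} Mb)")
```

A summary like this only flags suspicious output. It does not tell you whether the trained model itself is sensible, so a length cut-off applied this way should be treated as a diagnostic rather than a fix.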
Hello Skov, thanks for your valuable suggestions. I have two more questions:
From: LauritsSkov | Sent: Wednesday, 28 August 2024, 04:35 | Subject: Re: [LauritsSkov/Introgression-detection] vcf file of chr2 but hava chr1 site (Issue #8)
Hi! Thanks for the helpful archaic VCF files. I found that the hg38 file individuals_highcov.2.bcf contains sites from chr1, like this:
Not only does the chr2 file have this problem; the chr1 file also contains sites from other contigs. I would like to know why. Will it influence the results?
Best wishes. Jie
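One way to quantify the problem described in this post is to count how many records in a per-chromosome file carry an unexpected CHROM value. The sketch below works on plain-text VCF lines (for example, the output of `bcftools view` on the bcf file). The function names are hypothetical, the contig naming (`chr2` rather than `2`) is an assumption that should be checked against the real file, and the records in `example` are made up.

```python
from collections import Counter
from typing import Iterable


def contig_counts(vcf_lines: Iterable[str]) -> Counter:
    """Count records per CHROM value, skipping header and empty lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # CHROM is the first tab-separated column of a VCF record.
        counts[line.split("\t", 1)[0]] += 1
    return counts


def off_target_fraction(counts: Counter, expected: str) -> float:
    """Fraction of records whose CHROM differs from the expected contig."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts[expected]) / total


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr2\t10500\t.\tA\tG\t.\t.\t.",
    "chr2\t10620\t.\tC\tT\t.\t.\t.",
    "chr1\t248900000\t.\tG\tA\t.\t.\t.",
    "chr2\t10700\t.\tT\tC\t.\t.\t.",
]

counts = contig_counts(example)
print(dict(counts))
print(f"off-target fraction: {off_target_fraction(counts, 'chr2'):.2%}")
```

To run this on a real file, the same functions can read `sys.stdin` line by line, with the text piped in from `bcftools view -H individuals_highcov.2.bcf`.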