anubhabkhan closed this issue 5 years ago
In that case, I would rather suggest writing your own small script that converts the VCF to a multihetsep file. The only issue is the masks: you need to assume something about the regions between the segregating sites. Assuming that they are all called homozygous reference may not be appropriate, depending on your coverage.
Stephan
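A minimal sketch of what such a conversion script could look like, in Python. It is not part of msmc-tools, and the function name `vcf_to_multihetsep` is made up for illustration. It assumes a single chromosome, biallelic SNPs, phased diploid genotypes, and, importantly, that every position since the previous segregating site counts as called, which is exactly the assumption cautioned against above when coverage is uneven.

```python
def vcf_to_multihetsep(vcf_lines):
    """Yield multihetsep rows (chrom, pos, n_called, alleles) from VCF lines.

    ASSUMPTIONS (illustrative only): one chromosome, sorted positions,
    biallelic SNPs, phased diploid genotypes, and all sites between two
    segregating sites treated as called.
    """
    last_pos = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if len(ref) != 1 or len(alt) != 1:
            continue  # skip indels and multi-allelic records
        gt_idx = fields[8].split(":").index("GT")
        haplotypes = []
        for sample in fields[9:]:
            gt = sample.split(":")[gt_idx]
            haplotypes.extend(gt.split("|"))
        if any(h not in ("0", "1") for h in haplotypes):
            continue  # unphased ("0/1") or missing (".") genotype
        alleles = "".join(ref if h == "0" else alt for h in haplotypes)
        if len(set(alleles)) < 2:
            continue  # not segregating among these samples
        # Third column: called sites since the last segregating site,
        # here simply the distance, because no mask is applied.
        yield chrom, pos, pos - last_pos, alleles
        last_pos = pos
```

Each row can then be written out tab-separated, for example with `print("\t".join(map(str, row)))`. With real masks, the third column would count only the positions covered by the masks instead of the raw distance.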
On 21 Dec 2017, at 05:08, anubhabkhan notifications@github.com wrote:
Hi,
I already have a multi-sample VCF. I filtered all the SNPs for genotype quality and base quality 30. Can I use this as input for MSMC? I am splitting the VCF to generate several files per chromosome and per individual. Can I use these directly for the generate_multihetsep step?
Thanks Anubhab
My average sequencing depths are mostly 24-30X. Is there a way to create just the mask files so that the process is quick?
With regards and yours sincerely
Anubhab Research Scholar, National Centre for Biological Sciences, Tata Institute of Fundamental Research, India
Yes, by using my bamCaller script on the bam files. But you could also generate masks using some different rule set, e.g. by saying: I include all sites at which 90% of individuals have a high quality genotype called, or something like that. But you would definitely need to go to the bam level. If you have a multi-sample VCF that only contains segregating sites, you have lost information on non-segregating sites.
Stephan
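The "90% of individuals" rule mentioned above could be sketched roughly as follows. This is only an illustration: the function name `callable_mask`, the input layout, and the default thresholds are assumptions, not anything bamCaller does. The input has to come from calls at every site (the bam level or an all-sites VCF), not from a VCF that contains only segregating sites.

```python
def callable_mask(sites, min_gq=30, min_fraction=0.9):
    """Return merged BED-style (start, end) intervals of callable sites.

    sites: iterable of (pos, gqs), with 1-based positions in increasing
    order and gqs a list holding one genotype quality per individual
    (None where no genotype was called).
    Intervals are 0-based and half-open, as in BED files.
    The thresholds are hypothetical defaults for illustration.
    """
    intervals = []
    for pos, gqs in sites:
        if not gqs:
            continue
        good = sum(1 for gq in gqs if gq is not None and gq >= min_gq)
        if good / len(gqs) < min_fraction:
            continue  # too few individuals with a high-quality genotype
        start = pos - 1
        if intervals and intervals[-1][1] == start:
            # Extend the previous interval when sites are adjacent.
            intervals[-1] = (intervals[-1][0], pos)
        else:
            intervals.append((start, pos))
    return intervals
```

Writing each interval as `chrom<TAB>start<TAB>end` gives a BED file for one chromosome. Whether a single shared mask like this suits a given data set is a judgment call.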