Open yaohaojiao opened 3 years ago
Supplement:
I tried again this way but got the same WARNING, and I could not find the 1264 positions
when I grepped my VCF file.
Chr1 126419
Chr1 126420
Chr1 126421
Chr1 126424
Chr1 126425
Chr1 126426
Chr1 126431
Chr1 126434
Chr1 126442
Chr1 126444
Chr1 126447
Chr1 126451
Chr1 126453
Chr1 126461
Chr1 126464
Chr1 1264021
Chr1 1264025
Chr1 1264042
Chr1 1264047
Chr1 1264050
Chr1 1264052
……
@terhorst Could you please help me with this issue? Many thanks!
I am having the same warning:
smcpp.commands.vcf2smc WARNING Multiple entries found at 1126 positions; skipped all but the first
I checked both my vcf file and my bed file for repeated positions like this:
zcat my.vcd.gz | grep -v "#" | grep "Chr01" | cut -f 2 | sort | uniq -D
zcat MY.bed.gz | grep "Chr01" | cut -f 3 | sort | uniq -D
And I didn't get any positions.
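The same duplicate check can also be written with an exact chromosome match, since grep "Chr01" matches any line that merely contains that text. Below is a minimal sketch, assuming a tab-delimited VCF on standard input. The function name dup_positions is hypothetical and not part of SMC++, and an empty result here does not by itself explain the warning.

```shell
# Hypothetical helper: print each position that occurs more than once
# on exactly one chromosome of an uncompressed VCF read from stdin.
dup_positions() {
  awk -F'\t' -v chrom="$1" '
    /^#/ { next }                                # skip header lines
    $1 == chrom && ++seen[$2] == 2 { print $2 }  # report on the second sighting
  '
}

# Example (file name taken from the commands above):
# zcat myvcf.vcf.gz | dup_positions Chr01
```

Each duplicated position is printed once, however many records share it.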
My bed file looks like this:
Chr01 0 398
Chr01 503 613
Chr01 710 753
Chr01 837 975
Chr01 1104 1488
Chr01 1623 2061
My VCF is a phased VCF file, phased with Beagle.
I am running the program like this:
smc++ vcf2smc -d ind1 ind2 -m my.bed.gz myvcf.vcf.gz chr1.smc.gz Chr01 ANN1:ind1,ind2,id3,ind4,ind5,ind6,ind7,ind8,ind9,ind10
If I run the program like this
smc++ vcf2smc -d ind1 ind2 myvcf.vcf.gz chr1.smc.gz Chr01 ANN1:ind1,ind2,id3,ind4,ind5,ind6,ind7,ind8,ind9,ind10
without the bed file, I don't get the warning.
If I do this on my bed file
bedtools intersect -a my.bed.gz -b my.bed.gz -c > overlap.txt
and this:
cut -f 4 overlap.txt | sort | uniq
I got only the value 1, which means there are no overlapping positions.
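As a cross-check that does not depend on bedtools, overlapping mask intervals can also be detected with a short awk pass. This is a minimal sketch, assuming the BED input is sorted by chromosome and then by start, and that intervals are half-open (so an interval starting exactly where the previous one ends is not counted as an overlap). The function name bed_overlaps is hypothetical.

```shell
# Hypothetical helper: print every BED interval that overlaps an earlier
# interval on the same chromosome. Input must be sorted by chromosome,
# then by start coordinate.
bed_overlaps() {
  awk '
    # Same chromosome and start before the furthest end seen so far: overlap.
    $1 == prev_chrom && $2 + 0 < prev_end { print $1 "\t" $2 "\t" $3 }
    # New chromosome: reset the tracked end coordinate.
    $1 != prev_chrom { prev_chrom = $1; prev_end = $3 + 0; next }
    # Same chromosome: extend the tracked end if this interval reaches further.
    $3 + 0 > prev_end { prev_end = $3 + 0 }
  '
}

# Example (file name taken from the commands above):
# zcat my.bed.gz | sort -k1,1 -k2,2n | bed_overlaps
```

No output means no overlaps were found, which would agree with the bedtools result above.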
Does anyone know if I am missing something?
The smc++ version I am running is:
SMC++ v1.15.5.dev14+g6779fae
I have received your email, thank you!
Hi, I am running vcf2smc and get a WARNING:
$ docker run --rm -v $PWD:/mnt terhorst/smcpp:latest vcf2smc -m Oc_genome.mask.bed.gz -d gaoqiao_14 gaoqiao_14 remove_multiple_POS-2.vcf.gz ./test/test.GQ14_chr1.smc.gz Chr1 GQ:gaoqiao_11,gaoqiao_13,gaoqiao_14,gaoqiao_17,gaoqiao_18,gaoqiao_20,gaoqiao_21,gaoqiao_23,gaoqiao_3,gaoqiao_4,gaoqiao_8 --core 48
Does that mean my VCF file contains multiple entries at the same POS? I don't know why, because I have already used
bcftools norm -d none
to deal with that. And if I remove the option -m Oc_genome.mask.bed.gz, no warning appears, whether I use bcftools or not! I have checked the mask file and I don't think there is a problem with it. How can I solve this problem?
Thank you!