Closed konopinski closed 10 months ago
Hi Maciek,
This happens because Stacks drops all genotype information for a sample when its GT is missing. I have uploaded an updated version that handles this case; please give it a try.
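For readers hitting the same error, here is a minimal Python sketch of the situation described above. It is not the code used in VCF_filter; the function name `sample_depths` and the example line are made up for illustration, and it assumes that a sample with a missing genotype is written as a bare `./.` with no further subfields. Such a sample is reported as having no depth instead of raising an error.

```python
def sample_depths(vcf_line):
    """Return one DP value per sample for a single VCF data line.

    Samples with a missing genotype, or with no DP subfield, yield None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")      # e.g. ["GT", "DP", "AD"]
    depths = []
    for sample in fields[9:]:
        # zip() stops at the shorter list, so a truncated sample such as
        # "./." simply produces an entry without a DP key.
        entry = dict(zip(format_keys, sample.split(":")))
        gt = entry.get("GT", ".")
        dp = entry.get("DP", ".")
        # A genotype made only of dots and separators counts as missing.
        gt_missing = gt.replace("/", "").replace("|", "").strip(".") == ""
        if gt_missing or dp == ".":
            depths.append(None)
        else:
            depths.append(int(dp))
    return depths


# Hypothetical data line: the second sample has a missing genotype.
line = "chr1\t100\t.\tA\tG\t.\tPASS\tNS=2\tGT:DP:AD\t0/1:12:7,5\t./.\t1/1:30:0,30"
print(sample_depths(line))
```

A depth filter built on top of this can then skip the `None` entries rather than aborting the whole thread.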
Best, Yulong
Thank you, Yulong. DP problem is fixed. But I have two new ones.
VCF_filter_multi.pl --i populations.snps.vcf --o multi.filtered.1.vcf --P ../../popmap2023.txt --m 10 --M 200 --Q 20 --g 0.01 --f --T 21
With the vcf produced by your script there's another problem:
../VCF_filter/VCF_filter_multi.pl --i multi.filtered.vcf --o multi.filtered.1.vcf --P ../../popmap2023.txt --m 10 --M 200 --Q 20 --g 0.01 --f --T 21
=================================================================================
Using 21 threads, 999 per batch...
CMD: /media/sf_Dane/RADSeq/Barbus/new/R/allDataStacks/../VCF_filter/VCF_filter_multi.pl --i multi.filtered.vcf --o multi.filtered.1.vcf --P ../../popmap2023.txt --m 10 --M 200 --Q 20 --g 0.01 --f --T 21
Parameters used:
File name : multi.filtered.vcf
Out name : multi.filtered.1.vcf
MinQ : 20
minDepth : 10
MaxDP : 200
Global MAF : 0.01
Only keep polymorphic sites
Only keep bi-allelic sites
Remove Indels
done.
Use of uninitialized value $tot_indiv in concatenation (.) or string at ../VCF_filter/VCF_filter_multi.pl line 255.
=================================================================================
Use of uninitialized value $tot_indiv in concatenation (.) or string at ../VCF_filter/VCF_filter_multi.pl line 258.
Final retained : 0 SNPs of individuals Total 0 SNPs
Sorry for troubling you - I think you did a great job writing this script. You might consider writing a short 'User manual', because not all options are intuitive. For example, it is not clear whether `--H`, `--c` or `--l` filters out the whole SNP if it exceeds the provided value in a single population, or how Fis filtering works. I guess you wrote the script because you needed it for a particular project, and it is great that you want to share it with the world, but such a brief explanation would be very helpful (consider also writing a short note somewhere so that the package is easier to cite). Thanks a lot.
Maciek
Hi Maciek,
Stacks does not output SNP quality in its VCF, so do not set a filter on MinQ. I guess your popmap file is not the correct one for VCF_filter. I have uploaded sample files; please have a look. I will add a User manual in the near future.
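Both points can be checked before running the filter. The Python sketch below is only an illustration and is not part of VCF_filter: all function names are hypothetical, and it assumes the popmap lists the sample name in the first whitespace-separated column, as in a Stacks population map. The uploaded sample files remain the reference for the layout VCF_filter actually expects.

```python
def qual_is_missing(vcf_lines):
    """True if the VCF has data lines and every one has '.' in the QUAL column."""
    data = [l for l in vcf_lines if l.strip() and not l.startswith("#")]
    return bool(data) and all(l.split("\t")[5] == "." for l in data)


def vcf_samples(vcf_lines):
    """Sample names taken from the #CHROM header line (columns 10 onwards)."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[9:]
    return []


def popmap_samples(popmap_lines):
    """Sample names from the first column of the popmap (assumed layout)."""
    names = set()
    for line in popmap_lines:
        parts = line.split()
        if parts and not line.startswith("#"):
            names.add(parts[0])
    return names


def unmatched_samples(vcf_lines, popmap_lines):
    """VCF samples that have no entry in the popmap."""
    known = popmap_samples(popmap_lines)
    return [s for s in vcf_samples(vcf_lines) if s not in known]


# Hypothetical miniature inputs, for illustration only.
vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\n",
    "chr1\t100\t.\tA\tG\t.\tPASS\tNS=2\tGT:DP\t0/1:12\t1/1:30\n",
]
popmap = ["ind1\tpopA\n", "ind3\tpopB\n"]

print(qual_is_missing(vcf))            # QUAL is '.' on every data line
print(unmatched_samples(vcf, popmap))  # samples absent from the popmap
```

If the first check reports that QUAL is missing everywhere, a quality threshold has nothing to work with. If the second returns a non-empty list, the popmap and the VCF do not describe the same individuals.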
Best, Yulong
Manual added.
Hi, a multithreaded VCF filter is a great idea! Thanks for your work. I have encountered a problem while trying it. I have a VCF file produced by the Stacks pipeline. I tried to filter it with `VCF_filter`, but I received an error saying: `Thread # terminated abnormally: DP field is requred`. I'm sure there is such a field in the VCF file. Do you have any idea what it could mean? Below you will find a header and a few lines of genotypes: issue.zip
Cheers, Maciek