Closed desmodus1984 closed 2 years ago
What version of bcftools are you using?
What is the output of htsfile Filt1_Adj_DP.vcf.gz? It should say something like "VCF version 4.2 BGZF-compressed variant calling data". Try using bcftools index for the indexing step instead of tabix. If none of this helps, a small test file will be required to reproduce the problem.
Note that you will not gain much with --threads 28; threading is used only for the compression and decompression steps.
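The checks suggested above could be scripted roughly as follows. This is only a sketch: the file name is the one from this thread, `report_tool` is a hypothetical helper, and every tool call is skipped when the tool or the file is not present.

```shell
#!/usr/bin/env bash
# Sketch only: file name taken from the thread; nothing runs if it is absent.
vcf="Filt1_Adj_DP.vcf.gz"

# Hypothetical helper: print "found" or "missing" for a command name.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

if [ "$(report_tool bcftools)" = "found" ] && [ -f "$vcf" ]; then
    # Report the bcftools and htslib versions in use.
    bcftools --version | head -n 2
    # Ask htslib what it thinks the file is.
    if [ "$(report_tool htsfile)" = "found" ]; then
        htsfile "$vcf"
    fi
    # Index with bcftools instead of tabix; -f overwrites an existing index.
    bcftools index -f "$vcf"
fi
```

If htsfile reports anything other than BGZF-compressed VCF, the file was probably not written the way the later steps expect.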
I installed bcftools with conda, and the version appears to be 1.8, while the latest one is 1.15.1. I tried again, this time doing the first filtering step with bcftools:
bcftools filter --threads 48 -S . -e 'FMT/DP<30 | FMT/GQ<30 | FMT/DP>120' PA113.vcf.gz -O z -o PA_113_f1.vcf.gz
bcftools filter --threads 48 -S . -e 'FMT/DP<5 | FMT/GQ<30 | FMT/DP>24' No_Pa90-16.vcf.gz -O z -o No_Pa90-16_f1.vcf.gz
Let me ask you another question. I am having trouble figuring out which operator is best for filtering: |, ||, &, or &&. From the files PA113.vcf.gz and No_Pa90-16.vcf.gz, I want to keep variants that have GQ >= 30 and 30 < DP < 120 (first file) or 5 < DP < 24 (second file), and I want the filtered files to remain mergeable.
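On the operator question, as far as I can tell from the bcftools expressions documentation, the single-character forms (| and &) are evaluated within each sample, while || and && are evaluated across the whole site. With -S ., that would mean | sets only the failing samples to missing, whereas || would affect every sample at a matching site. Please treat this as an assumption to verify against the manual for your version. A minimal sketch using the thresholds from this thread (`build_expr` is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an exclusion expression for "bcftools filter -e".
# Arguments: MIN_DP MAX_DP MIN_GQ. Uses the per-sample "|" form (assumption above).
build_expr() {
    printf 'FMT/DP<%s | FMT/DP>%s | FMT/GQ<%s' "$1" "$2" "$3"
}

ref_expr=$(build_expr 30 120 30)   # reference individual (~60X)
pop_expr=$(build_expr 5 24 30)     # remaining individuals (~11X)

echo "$ref_expr"
echo "$pop_expr"

# Run the filters only when bcftools and the input files are available.
if command -v bcftools >/dev/null 2>&1 \
    && [ -f PA113.vcf.gz ] && [ -f No_Pa90-16.vcf.gz ]; then
    # -S . sets failing genotypes to missing instead of dropping sites,
    # so both outputs keep their records and can still be merged.
    bcftools filter -S . -e "$ref_expr" -Oz -o PA_113_f1.vcf.gz PA113.vcf.gz
    bcftools filter -S . -e "$pop_expr" -Oz -o No_Pa90-16_f1.vcf.gz No_Pa90-16.vcf.gz
fi
```

Note that these expressions keep genotypes with DP exactly equal to the bounds; use <= and >= in the expression if the bounds should be excluded as well.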
I could not use the htsfile command because the merge step didn't work.
I tried updating conda, and got the following message after "conda install -c bioconda bcftools":
The following packages will be UPDATED:
bcftools: 1.8-h4da6232_3 bioconda --> 1.15.1-h0ea216a_0 bioconda
ca-certificates: 2021.10.8-ha878542_0 conda-forge --> 2022.4.26-h06a4308_0
certifi: 2021.10.8-py39hf3d152e_2 conda-forge --> 2022.5.18.1-py39h06a4308_0
htslib: 1.9-h4da6232_3 bioconda --> 1.15.1-h9753748_0 bioconda
python: 3.9.10-hc74c709_2_cpython conda-forge --> 3.9.10-h85951f9_2_cpython conda-forge
The following packages will be DOWNGRADED:
curl: 7.83.1-h2283fc2_0 conda-forge --> 7.82.0-h7f8727e_0
krb5: 1.19.3-h08a2579_0 conda-forge --> 1.19.2-hac12032_0
libcurl: 7.83.1-h2283fc2_0 conda-forge --> 7.82.0-h0b77cf5_0
libnghttp2: 1.47.0-he49606f_0 conda-forge --> 1.46.0-hce63b2e_0
libssh2: 1.10.0-ha35d2d1_2 conda-forge --> 1.10.0-h8f2d780_0
openssl: 3.0.3-h166bdaf_0 conda-forge --> 1.1.1o-h7f8727e_0
wget: 1.20.3-ha35d2d1_1 conda-forge --> 1.20.1-h20c2e04_0
Will the downgrades produce an error? I kind of feel like they will. Should I do the filtering again with the newer version of bcftools?
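One way to avoid touching the packages in an existing environment is to put bcftools in a dedicated one. The sketch below assumes conda with the conda-forge and bioconda channels reachable; the environment name is hypothetical, and the command is only printed unless RUN=1 is set.

```shell
#!/usr/bin/env bash
env_name="bcftools-1.15"      # hypothetical environment name
pkg_spec="bcftools=1.15.1"    # version offered in the update message above

# Assemble the command as a string so it can be reviewed before running.
create_cmd="conda create -y -n $env_name -c conda-forge -c bioconda $pkg_spec"
echo "$create_cmd"

# Execute only on request, and only if conda is actually installed.
if [ "${RUN:-0}" = "1" ] && command -v conda >/dev/null 2>&1; then
    $create_cmd
fi
```

After activating the new environment, bcftools --version should report 1.15.1 together with a matching htslib.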
Juan Pablo Aguilar Cabezas
Ecology and Evolutionary Biology Ph.D. Candidate
Department of Biological Sciences
Ohio University, Athens OH
From: Petr Danecek. Sent: Monday, June 13, 2022 9:14 AM. To: samtools/bcftools. Subject: [External] Re: [samtools/bcftools] Filter Doesn't work on merged Vcf file (Issue #1734)
I made a new merged file from files that were filtered with bcftools:
bcftools filter --threads 48 -S . -e 'FMT/DP<30 | FMT/GQ<30 | FMT/DP>120' PA113.vcf.gz -O z -o PA_113_f1.vcf.gz
bcftools filter --threads 48 -S . -e 'FMT/DP<5 | FMT/GQ<30 | FMT/DP>24' No_Pa90-16.vcf.gz -O z -o No_Pa90-16_f1.vcf.gz
bcftools merge --threads 48 PA_113_f1.vcf.gz No_Pa90-16_f1.vcf.gz -Oz -o Adj_DP.vcf.gz
The result with the new version is the same: a log file with weird characters.
[unreadable binary output omitted; it begins with "‹ ÿ BC" and continues in the same vein]
I was thinking of creating a brand-new conda environment for bcftools. Given that the Ohio Supercomputer Center has some unusual policies for conda and doesn't update conda itself, which version of conda do you recommend I use to install bcftools?
Is there any reason why I get that weird log file?
I ran the htsfile command on the new merged file, which was merged with bcftools 1.8, and the result was "fine":
htsfile DP_new.vcf.gz
DP_new.vcf.gz: VCF version 4.2 BGZF-compressed variant calling data
Is there any reason why I can't do further filtering with it?
Thanks;
It is not a log file but a compressed VCF, as requested by the -O z option. Try leaving it out or piping into zless, and it should start making more sense:
bcftools [any command] | less
bcftools [any command] -Oz | zless
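The effect can be reproduced without bcftools. BGZF is a gzip-compatible format, so plain gzip is used below as a stand-in; the one-line header is made-up test data. Every gzip stream starts with the bytes 1f 8b, and the 8b byte is what many terminals and editors display as "‹".

```shell
#!/usr/bin/env bash
# Stand-in demo: gzip instead of bcftools -Oz, with a one-line fake VCF header.
tmp=$(mktemp)
printf '##fileformat=VCFv4.2\n' | gzip -c > "$tmp"

# The first two bytes of a gzip (and therefore BGZF) stream, as hex.
magic=$(head -c 2 "$tmp" | od -An -tx1 | tr -d ' \n')
echo "magic bytes: $magic"

# Decompressing restores the readable text, which is what zless does on the fly.
first_line=$(gzip -dc "$tmp" | head -n 1)
echo "first line: $first_line"

rm -f "$tmp"
```

A "log" that starts with these bytes is therefore most likely the compressed VCF itself, written to standard output.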
Hi,
I am studying a population and genotyped 17 individuals. One of them is the reference and has more data (60X, versus ~11X for the others), so I decided to separate the reference gVCF from the others and do a first read-depth filtering separately. I want to follow the rule of min = 1/2 × average DP and max = 2 × average DP, and with all samples together the reference distorts the min/max DP. So I separated them with GATK SelectVariants, and then merged them with bcftools merge:
bcftools merge --threads 28 PA_113_DP.vcf.gz No_Pa90-16_DP.vcf.gz -Oz -o Adj_DP.vcf.gz
To verify that it worked, I used VCFtools to check the depth:
vcftools --gzvcf Adj_DP.vcf.gz --depth
and it worked well. Now I want to try to filter it with bcftools, and at first I got the error: "Failed to open Adj_DP.vcf.gz: could not load index". So I indexed it:
tabix -p vcf Adj_DP.vcf.gz
and now, every time I try running the filter on Adj_DP.vcf.gz, I get a weird, unreadable log file that begins with "‹ ÿ BC" followed by more unreadable characters.
Is there a reason VCFtools can generate a reasonably coherent depth analysis while bcftools fails? Previously, the same filtering code worked, but now I do not understand that weird log file.
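If the filter was run with -Oz but without -o (an assumption, since the exact command is not shown in the post), the compressed result goes to standard output and ends up in the job log. A sketch that writes it to a file instead; the output name and the expression are placeholders, not the ones used in the post:

```shell
#!/usr/bin/env bash
in_vcf="Adj_DP.vcf.gz"              # merged file from the post
out_vcf="Adj_DP_filtered.vcf.gz"    # hypothetical output name
expr='FMT/GQ<30'                    # placeholder expression only

# Run only when bcftools and the merged file are available.
if command -v bcftools >/dev/null 2>&1 && [ -f "$in_vcf" ]; then
    # Index the input with bcftools rather than tabix.
    bcftools index -f "$in_vcf"
    # -o sends the compressed output to a file instead of standard output.
    bcftools filter -S . -e "$expr" -Oz -o "$out_vcf" "$in_vcf"
    bcftools index -f "$out_vcf"
    # Show the column header line as a quick readability check.
    bcftools view -h "$out_vcf" | tail -n 1
fi
```

With the output redirected this way, the log should contain only messages and warnings, if any.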
Best;