Closed guosongwang closed 5 years ago
Hi Guosong,
Could you please send the full output printed before the segmentation fault?
Po-Ru
Hi Po-Ru,
Thanks for your help again! I have attached the script I used, including some preparation I made from Plink format to the input of eagle, hope you might spot some mistakes I didn't realize, and also the full output I got.
Thanks again for your precious time!
Eagle_gw.zip Guosong
Hi Guosong,
Apologies for the long delay. Unfortunately the output doesn't tell me much. Might you be able to share the input files? I'll be happy to track down the bug if so.
Po-Ru
On Mon, Jul 15, 2019 at 2:51 PM Guosong Wang notifications@github.com wrote:
Hi Po-Ru,
Thanks for your help again! I have attached the script I used, including some preparation I made from Plink format to the input of eagle, hope you might spot some mistakes I didn't realize, and also the full output I got.
Thanks again for your precious time!
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_poruloh_Eagle_issues_25-3Femail-5Fsource-3Dnotifications-26email-5Ftoken-3DACDGGQNAAKWJVPIFVQJFGOTP7TBKJA5CNFSM4IB7P7WKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZ6T7ZA-23issuecomment-2D511524836&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=pBlP9cuNT3BUDsRDLkVbgWxzX8j_R7x0nL_Q4UWFMPo&m=hRV0RaWvB3b5fSfgsrgziXd7gWjO1XY9VTukcWk6yvI&s=CKgB1Hsn879HnqHFwVhkAJJdpyA1M2A0uZj5SCr12CA&e=, or mute the thread https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ACDGGQN2YPMP43XYVNRHTH3P7TBKJANCNFSM4IB7P7WA&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=pBlP9cuNT3BUDsRDLkVbgWxzX8j_R7x0nL_Q4UWFMPo&m=hRV0RaWvB3b5fSfgsrgziXd7gWjO1XY9VTukcWk6yvI&s=fZ7U_YeR151UGleir8M6dApI5kHpL-tnlhF6j0VGZII&e= .
Hi Po-Ru,
Thanks for offering the help. I did eventually resolve the problem after I switched to use bgzip instead of the bcftools to convert the VCF file to BCF file. Maybe you can put this into the manual as a suggestion.
Again, I appreciate your time on this.
Guosong
I can confirm that this appears to be an issue with index files created by bcftools specifically. Reindexing a bcf.gz file (even one created by bcftools) using tabix (rather than bcftools index -t
) cures the segfault problem. It seems this is most likely a bug with bcftools, but it could be a problem with eagle.
@guosongwang The Eagle_test.sh
script suggests you tried to use bcftools index --tbi
to index a BCF file. This cannot work, only .csi
indexes are supported.
What version of bcftools are you using by the way? Recent versions refuse to do that.
@pd3 bcftools 1.8, which is the second most recent release, supports both csi and tbi and there's nothing in the changelog to suggest that the index command was changed in the most recent version. So I think you're mistaken, but I could be misunderstanding
I am using bcftools-1.9.
It's weird to see that after I switched to bgzip to convert the vcf file, I kept using bcftools to index it and it worked fine right now.
My commands go like this:
bgzip -c /data/guosong.wang/Founders_HD_chr1.phase.vcf > /data/guosong.wang/Founders_HD_chr1.phase.bcf
bcftools index --tbi -f /data/guosong.wang/Fouders_HD_chr1.phase.bcf
And I have done it the same way for the target vcf file needs to be phased.
@guosongwang that is strange, I suppose I was wrong about it being the index function specifically. So for both of us, we had problems when using bcftools to both convert to bcf and index, but no problems when using a different program to do one of those two tasks... I think it's safe to assume this probably isn't a bug in eagle, though what's going on exactly is a mystery
Although bcftools 1.8 may not complain about indexing a BCF using a TBI, it does not mean it's supported. For BCF you have to use CSI, for bgzipped VCF you can use TBI or CSI.
I am sure I am not mistaken, I am the main bcftools developer :-)
@rborder I agree. I don't recall Eagle has specifically recommended which software to use in the manual, but since it is recommending converting it to BCF for reference-based phasing, most people will think of going to bcftools immediately. I tried bgzip only because when I tried to phase it in VCF, the program notified me that my files aren't bgziped. So at least before we get to the reason why it is causing the problem, it is safe to suggest this possible situation. Maybe this is a problem @pd3 suggested, I did try to index BCF with TBI. I can have a try and post my update later.
@guosongwang your command is incorrect. The difference between VCF and BCF is not the bgzip compression, but that the former is text and the latter a binary represenation. You can have plain text (VCF), compressed text (VCF.gz), uncompressed binary and compressed binary (BCF). If you want to convert between these, use bcftools view -O
, for example as
bcftools view -Ob -o out.bcf in.vcf # output is compressed BCF
bcftools view -Oz -o out.vcf.gz in.bcf # compressed VCF
bcftools view -Ov -o out.vcf in.bcf # uncompressed VCF
bcftools view -Ou -o out.vcf in.bcf # uncompressed BCF
The documentation can be found here: short tutorials http://samtools.github.io/bcftools/howtos/index.html and the man page http://samtools.github.io/bcftools/bcftools.html
EDIT: above I see you are working with bgzip compressed VCF (identical to bcftools view -Oz
) and name it BCF, that's why the TBI indexing works.
@pd3 I ended up with the same problem as @guosongwang when converting an input vcf.gz file to compressed bcf using precisely bcftools view -Ob
, followed by bcftools index -t
applied to the output. using tabix to reindex resolved the seg fault problem. So where the problem lies is a total mystery to me!
Okay, I think that the problem is that bcftools index is willing to produce a tbi index for bcf file. So it's a user error on our parts, but would perhaps be easier to diagnose if bcftools produced a warning when generating an unusable index? @pd3 let me know if I'm still missing something :)
@pd3 Thanks for pointing it out. I gotta admit I was confused by the compressed/uncompressed vcf/bcf file format ;).
@rborder The latest version doesn't let you index BCF with TBI
bcftools index --tbi file.bcf
[E::bcf_index_build3] TBI indices for BCF files are not supported
Dear author,
Thanks for your fantastic work.
I am wondering if you could give any suggestions about an error I encountered when I tried to phase with reference panel.
I am working with cattle and my data had 38 founders that were genotyped at HD density (~770K SNPs) and more than 1500 animals genotyped at 50K density. My ultimate goal is to impute chromosome 1 to HD resolution. I am following a workflow using Eagle and Minimac for such purpose.
I phased the 38 founders without reference panel to make it as a reference panel for the 1500 animals who are their offsprings. I made bcf files for the founders and unphased bcf file for the 1500 animals and I called:
The program seems to be running at beginning, but ended up with a Segmentation fault (core dump) without telling me more details, the output looks like this:
Can you give me any suggestions about what went wrong?
Thanks for your time! Truly appreciate in advance!
Thanks, Guosong Wang