epi2me-labs / wf-human-variation


process sv:variantCall:sortVCF generating vcf with incomplete header #122

Closed Fazulur closed 6 months ago

Fazulur commented 7 months ago

Dear Team,

I attempted to run the nf-core v1.9.0 wf-human-variation pipeline with our dataset. However, the process 'phasing:phase_all' consistently fails with the following error:

[E::vcf_hdr_read] No sample line
Failed to read from SV/sv.vcf.gz: could not parse header

The root cause is an incomplete header in the VCF. The hg38 reference FASTA file contains additional contigs beyond chr1-22, X, Y, and M, so the VCF header runs to more than 1000 lines. The 'vcfsort' function copies only the first 1000 header lines and skips the rest, which leaves the sorted VCF without its '#CHROM' column header line.
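To illustrate the failure mode described above, here is a minimal sketch. It is not the workflow's actual 'vcfsort' code: `split_by_line_limit` is a hypothetical stand-in for a sorter that assumes the header fits in a fixed number of lines, and `split_by_prefix` shows the alternative of treating every line that starts with `#` as header. The synthetic contig names and lengths are made up for the example.

```python
from typing import Iterable, List, Tuple


def split_by_line_limit(lines: Iterable[str], limit: int = 1000) -> Tuple[List[str], List[str]]:
    """Hypothetical faulty splitter: treat only the first `limit` lines as header."""
    lines = list(lines)
    return lines[:limit], lines[limit:]


def split_by_prefix(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Treat every line starting with '#' as header, however many there are."""
    header, records = [], []
    for line in lines:
        (header if line.startswith("#") else records).append(line)
    return header, records


def has_column_header(header: List[str]) -> bool:
    """True if the header contains the mandatory '#CHROM' column line."""
    return any(line.startswith("#CHROM") for line in header)


# Synthetic VCF (made-up contigs): 1 fileformat line, 1200 contig lines,
# the '#CHROM' column line, and a single record.
vcf_lines = ["##fileformat=VCFv4.2"]
vcf_lines += [f"##contig=<ID=ctg{i},length=1000>" for i in range(1200)]
vcf_lines.append("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE")
vcf_lines.append("ctg0\t100\t.\tN\t<DEL>\t58\tPASS\tSVTYPE=DEL\tGT\t1/1")

limited_header, limited_records = split_by_line_limit(vcf_lines)
full_header, full_records = split_by_prefix(vcf_lines)

print("line-limit header keeps #CHROM:", has_column_header(limited_header))
print("prefix-based header keeps #CHROM:", has_column_header(full_header))
```

With more than 1000 header lines, the line-limit splitter cuts the header off before the '#CHROM' line, which matches the "No sample line" error that htslib reports.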

Here are a few lines of the VCF with the incomplete header. The '#CHROM' header line is missing; the first record follows directly after line 1000.

994 ##contig=
995 ##contig=
996 ##contig=
997 ##contig=
998 ##contig=
999 ##contig=
1000 ##contig=
1001 chr1 66239 Sniffles2.DEL.15F7S0 N 58 PASS PRECISE;SVTYPE=DEL;SVLEN=-37;END=66276;SUPPORT=29;RNAMES=133bffcd-9aa2-47f5-aefe-ca6d44bbd8ac,008f06ff-5f0c-43c2-b06a-ab14999c5be9,07da6777-45f3-4d75-ac77-2f7ed41cb674,c35b0780-6ada-4e86-b387-503e461e5768,71e18242-dcfa-49b2-8b56-d3f122990745,ed6d1b1a-708b-41c3-b7d2-913ed73d8caf,1510c14d-abae-434c-a1e7-eb0768e757c7,7226dd03-1eef-4335-a24e-15c832e32dbb,ed2eb663-98cc-4e6f-a1b7-8002a31153d0,8a4f161b-bf95-4e4e-acdb-886777c2e223,0d8f0147-e5b9-404e-ae67-895d2307be7b,29de1341-a278-48c9-9a0e-eceecee83aa1,e07e1837-c349-4e7c-8b89-d55497eb7497,73213a33-354c-49b1-bdbb-4dae3e7c30b8,cc5832e8-6d9c-4af2-9be6-152c3a5c4fc3,03b4452f-4bf3-4032-9d1e-84aff0b6c05b,1ad0e9a9-0309-4446-8a88-7d8acbaf8ea3,1aa78d1d-c91c-48ee-a54f-9c183f8520b7,0f78e1a9-83e0-46c4-9f32-f820abb85e27,6997cf2c-fb30-438e-9eed-68e0a90c59e3,d89c4eab-85bc-4ace-876f-9a1f351c4235,90f8d402-bdc5-4868-942a-eb0546ddff79,2b9880a4-a1ef-4525-90bc-12b466d336bb,45708f72-545f-4b32-8a91-298395cf309c,8df5582e-6a50-4636-b03a-324253752b11,9731ae25-51fa-5883-8e4b-9e2a19abb55f,e70c2339-961e-4266-9445-70b7d81fac27,1f5f9d5c-39e1-4bfa-add8-f7e2bce73438,4c66def7-9fb9-49b7-a8bc-2465615d5a8d;COVERAGE=29,31,33,34,31;STRAND=+-;AF=0.879;PHASE=2,61045,29,29,PASS,PASS;STDEV_LEN=0;STDEV_POS=0 GT:GQ:DR:DV 1/1:40:4:29
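For anyone wanting to confirm the same symptom on their own output, a quick check could look like the sketch below. It uses only the Python standard library; the function name `inspect_vcf_header` is made up for this example, and the path in the usage note is simply the one from the error message above.

```python
import gzip
from pathlib import Path


def inspect_vcf_header(path):
    """Count leading '#' lines and report whether a '#CHROM' line is among them."""
    path = Path(path)
    # bgzip output is gzip-compatible, so gzip.open can read .vcf.gz files.
    opener = gzip.open if path.suffix == ".gz" else open
    n_header = 0
    has_chrom = False
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first record reached; the header is over
            n_header += 1
            if line.startswith("#CHROM"):
                has_chrom = True
    return n_header, has_chrom
```

Calling `inspect_vcf_header("SV/sv.vcf.gz")` on an affected file would be expected to return a header count of 1000 and `False` for the '#CHROM' flag, assuming the truncation described in this issue.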

I sorted the filtered Sniffles VCF using bcftools sort, and it works without issues.
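The bcftools sort step mentioned above can be scripted roughly as follows. This is a sketch, not the workflow's own process: the helper names and the file names in the usage note are placeholders, and it assumes `bcftools` is available on `PATH`. The `-O z` option requests bgzip-compressed VCF output and `-o` sets the output file.

```python
import subprocess


def bcftools_sort_cmd(in_vcf, out_vcf):
    """Build a `bcftools sort` command that writes bgzip-compressed VCF."""
    return ["bcftools", "sort", "-O", "z", "-o", str(out_vcf), str(in_vcf)]


def sort_vcf(in_vcf, out_vcf):
    """Run the sort; raises CalledProcessError if bcftools exits non-zero."""
    subprocess.run(bcftools_sort_cmd(in_vcf, out_vcf), check=True)
```

For example, `sort_vcf("sniffles.filtered.vcf", "sv.sorted.vcf.gz")` would sort a hypothetical filtered Sniffles VCF into a compressed output file.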

Could you please assist in resolving this issue?

Thanks in advance,
Fazulur Rehaman

RenzoTale88 commented 7 months ago

@Fazulur this is probably the same bug as #118. We are currently looking into this.

Fazulur commented 7 months ago

Thank you. I will wait for the fix.

Thanks in advance,
Fazulur Rehaman

RenzoTale88 commented 6 months ago

@Fazulur could you please try using v1.9.1?

Fazulur commented 6 months ago

Dear @RenzoTale88,

Thanks for the update. I will try.

Thanks & regards,
Fazulur Rehaman

SamStudio8 commented 6 months ago

Closing, as this fix has been confirmed independently in https://github.com/epi2me-labs/wf-human-variation/issues/118