ExaScience / elprep

elPrep: a high-performance tool for analyzing sequence alignment/map files in sequencing pipelines.

invalid BGZF file: does not end in proper EOF marker #74

Open lucy-1n-the-ai opened 1 month ago

lucy-1n-the-ai commented 1 month ago

Howdy,

I am running the following command to genotype short reads against a reference, and I get the error "invalid BGZF file: does not end in proper EOF marker". I haven't edited any of the files since they were produced by aligning with vg giraffe, so I'm unsure how to fix this.

elprep sfm ERR11223848.bam ERR11223848_geno.bam \
  --nr-of-threads 48 \
  --mark-duplicates \
  --mark-optical-duplicates metrics/ERR11223848.metrics.txt \
  --sorting-order coordinate \
  --reference Olympia_test.elfasta \
  --haplotypecaller vcf/ERR11223848.gvcf.gz \
  --intermediate-files-output-type sam \
  --tmp-path ./

caherzee commented 1 month ago

Hi,

I am guessing that the input BAM file is corrupted. Can you try converting ERR11223848.bam to SAM format, for example with SAMtools, and see if that produces an error?

You can also try with elprep: elprep sfm ERR11223848.bam ERR11223848.sam and see if that produces the same error?
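
A quicker check, before converting anything: a complete BAM file ends with a fixed 28-byte empty BGZF block, which is the "EOF marker" the error message refers to. The sketch below compares the last 28 bytes of a file against that block as listed in the SAM/BAM specification. The function name has_bgzf_eof is made up for illustration and is not part of elprep or SAMtools. It only tells you whether the end of the file is missing, not whether the rest of the file is intact. If SAMtools is installed, samtools quickcheck should perform a similar check.

```shell
#!/bin/sh
# Hypothetical helper (not part of elprep or SAMtools): succeeds if the file
# ends with the 28-byte empty BGZF block from the SAM/BAM specification.
has_bgzf_eof() {
    # Expected EOF block as hex, split into 4-byte groups for readability.
    expected="1f8b0804""00000000""00ff0600""42430200""1b000300""00000000""00000000"
    # Dump the last 28 bytes of the file as hex and strip od's whitespace.
    actual=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$actual" = "$expected" ]
}

# Example use (file name taken from the command above):
#   has_bgzf_eof ERR11223848.bam && echo "EOF marker present" \
#       || echo "EOF marker missing, file is probably truncated"
```

If the marker is missing, the file was most likely cut off while it was being written or copied, and regenerating it from the alignment step is probably the safest fix.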

Thanks!