samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools reheader to support stdin? #2088

Closed by clintval 4 months ago

clintval commented 5 months ago

I need to add reference sequences and their lengths from an FAI file and came across bcftools reheader. It works when reading from a file but not from standard input. Would it be possible to add this support so I can use the command in a pipeline without having to write a temporary VCF to disk?

$ gzip -dc test.vcf.gz | bcftools reheader /dev/stdin --fai reference.fa.fai --output /dev/stdout
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Failed to read the header: /dev/stdin

Whereas the same command works fine when reading from a normal file:

$ bcftools reheader test.vcf.gz --fai reference.fa.fai --output /dev/stdout
... <success!>

daviesrob commented 5 months ago

If you use - instead of /dev/stdin as the input file, reheader actually admits that the --fai option doesn't work with stdin:

$ gzip -dc reheader_in.vcf.gz | ./bcftools reheader --fai reheader.fai  --output reheader_out.vcf -
Cannot use the --fai option when reading from standard input.

This is because the --fai option currently causes the file to be opened twice, which is not compatible with reading a pipe. I think it should be possible to make it keep the file handle opened by the --fai option, and pass it on for use by the reheader part. It doesn't look too hard for BCF files. VCF and VCF.gz might be a bit more tricky, but I hope I should be able to make them work.
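The general constraint can be illustrated without bcftools. The sketch below is not how reheader is implemented; it only shows why a "read the input twice" design works for a named file but not for a pipe. The function names `count_path_twice` and `count_stdin_twice` are made up for this illustration.

```shell
# Two passes over a named file: each pass reopens the path,
# so both passes start at byte 0 and see the whole file.
count_path_twice() {
    first=$(wc -c < "$1" | tr -d ' ')
    second=$(wc -c < "$1" | tr -d ' ')
    echo "first=$first second=$second"
}

# Two passes over standard input: the first pass drains the pipe,
# so the second pass hits end-of-file immediately.
count_stdin_twice() {
    first=$(wc -c | tr -d ' ')
    second=$(wc -c | tr -d ' ')
    echo "first=$first second=$second"
}

# Usage:
#   printf 'a\nb\n' > demo.txt
#   count_path_twice demo.txt             # prints: first=4 second=4
#   printf 'a\nb\n' | count_stdin_twice   # prints: first=4 second=0
```

Data read from a pipe is gone once consumed, which is why keeping the first file handle open and passing it on, as suggested above, would avoid the problem.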

clintval commented 5 months ago

I had a feeling it should be possible to stream this functionality, but I didn't know what would be involved! Thanks for checking! For now, I'm writing a temp file and cleaning it up afterwards, but I would like to avoid writing large intermediate VCFs to disk when it isn't needed. I appreciate your attention to this issue!
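For anyone needing the same stopgap, the temp-file workaround might look roughly like the sketch below. The helper name `with_spooled_stdin` is hypothetical, and it assumes the wrapped command accepts the input path as its last argument, as in the reheader invocation with the input file last shown earlier in this thread.

```shell
# Hypothetical helper: spool standard input to a temporary file, run the
# given command with that file's path appended as the last argument, then
# remove the temporary file and return the command's exit status.
with_spooled_stdin() {
    tmp=$(mktemp) || return 1
    # Copy all of stdin to disk so the command can open it more than once.
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    "$@" "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (assumes bcftools is installed and the file names exist):
#   cat test.vcf.gz | with_spooled_stdin \
#       bcftools reheader --fai reference.fa.fai --output /dev/stdout
```

This still writes the full intermediate file to disk; it only hides the bookkeeping until reheader can read from a pipe directly.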