MRCIEU / gwas2vcf

Convert GWAS summary statistics to VCF
MIT License

bgzip or gzip? #22

Closed explodecomputer closed 5 years ago

explodecomputer commented 5 years ago

I think at the moment it detects the output type based on the suffix of the filename provided, and automatically gzips the output? Is there any way to make it use bgzip instead? pysam seems to have problems reading a gzipped .vcf.gz file

e.g. /mnt/storage/private/mrcieu/research/mr-eve/gwas-files/IEU-a:2/

>>> bcf_in = VariantFile("IEU-a:2.vcf.gz")
[W::hts_idx_load2] The index file is older than the data file: IEU-a:2.vcf.gz.tbi
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pysam/libcbcf.pyx", line 3990, in pysam.libcbcf.VariantFile.__init__
  File "pysam/libcbcf.pyx", line 4238, in pysam.libcbcf.VariantFile.open
  File "pysam/libchtslib.pyx", line 517, in pysam.libchtslib.HTSFile.tell
NotImplementedError: seek not implemented in files compressed by method 1
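
One way to tell whether a .vcf.gz file was written with bgzip or plain gzip is to inspect its gzip header. The sketch below is a minimal check using only the Python standard library: a BGZF block is a gzip member whose header carries an extra subfield tagged "BC" with a 2-byte payload, while a plain gzip file normally has no such subfield. The function names `is_bgzf` and `file_is_bgzf` are made up for this illustration and are not part of pysam or gwas2vcf.

```python
import struct


def is_bgzf(header: bytes) -> bool:
    """Return True if `header` (the first bytes of a file) looks like a BGZF block."""
    if len(header) < 12:
        return False
    # gzip magic bytes (0x1f 0x8b) followed by compression method 8 (deflate)
    if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
        return False
    # The FEXTRA flag (bit 2 of the flag byte) must be set for BGZF
    if not header[3] & 0x04:
        return False
    # XLEN: little-endian length of the extra field, stored at offset 10
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for SI1='B' (66), SI2='C' (67), SLEN=2
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (66, 67, 2):
            return True
        pos += 4 + slen
    return False


def file_is_bgzf(path: str) -> bool:
    """Hypothetical helper: read enough of `path` to cover the gzip header."""
    with open(path, "rb") as handle:
        # 12 fixed header bytes plus the largest possible extra field
        return is_bgzf(handle.read(12 + 0xFFFF))
```

If `file_is_bgzf("IEU-a:2.vcf.gz")` returned False, the file would be plain gzip and would need to be recompressed with bgzip before random access through an index could work.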

explodecomputer commented 5 years ago

@mcgml

mcgml commented 5 years ago

Works OK for me. The error suggests the index was made before the VCF. Did you copy the files without preserving timestamps?
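
The timestamp condition behind the `hts_idx_load2` warning can be checked directly. This is a minimal sketch that compares file modification times with the standard library; the function name `index_is_stale` is an assumption for illustration, and the check only mirrors the wording of the warning rather than htslib's actual logic.

```python
import os


def index_is_stale(data_path: str, index_path: str) -> bool:
    """Return True if the index file was last modified before the data file."""
    return os.path.getmtime(index_path) < os.path.getmtime(data_path)
```

For the file in this thread the call would be `index_is_stale("IEU-a:2.vcf.gz", "IEU-a:2.vcf.gz.tbi")`. If it returns True, re-creating the index (for example with `tabix -p vcf`) or copying with timestamps preserved (for example `cp -p`) should clear the warning.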

mcgml commented 5 years ago

PS: colons in filenames may not be supported on Windows

explodecomputer commented 5 years ago

Ah, yeah, I must have done something to it, thanks. I was wondering if the colon would be a problem. As we are changing the MR-Base IDs, we could also change the rest of them at the same time, e.g. IEU-a-2
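
A rough sketch of that renaming, assuming the only change needed is swapping the colon for a hyphen. The function name `safe_dataset_id` is hypothetical and not part of any MR-Base tooling.

```python
def safe_dataset_id(dataset_id: str) -> str:
    """Replace colons, which may not be allowed in Windows filenames, with hyphens."""
    return dataset_id.replace(":", "-")


print(safe_dataset_id("IEU-a:2"))  # IEU-a-2
```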

mcgml commented 5 years ago

@explodecomputer Thanks, will do: https://github.com/MRCIEU/mr-base-api/issues/116