opencb / hpg-bigdata

This repository implements converters and tools for working with NGS data in HPC or Hadoop clusters
Apache License 2.0

Compress the output file (variant query command line) #111

Open jtarraga opened 8 years ago

jtarraga commented 8 years ago

Currently, the results of the variant query command line are stored in Avro, Parquet or JSON format using the default compression. The command line should let users select the compression method through the output filename extension, which would indicate both the output format and the compression: "gzip" for deflate/zip compression and "snz" for Snappy compression.

Some examples:

./build/bin/hpg-bigdata-local.sh variant query --i test.avro --o test.json.snz ....
./build/bin/hpg-bigdata-local.sh variant query --i test.avro --o test.parquet.gzip ...
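To make the proposed naming convention concrete, here is a minimal sketch of how an output filename could be split into a format and a compression method. It is written in Python for brevity and is not the project's code: the function name `parse_output_name`, the lookup tables, and the choice to return `None` when no compression suffix is present (meaning "keep the default compression") are all assumptions made for illustration.

```python
from pathlib import PurePath

# Assumed lookup tables, built only from the extensions named in this issue.
FORMATS = {"avro", "parquet", "json"}
COMPRESSIONS = {"gzip": "deflate", "snz": "snappy"}


def parse_output_name(filename):
    """Infer (format, compression) from an output filename (hypothetical helper).

    compression is None when the name carries no known compression suffix,
    which this sketch treats as "use the default compression".
    """
    # "test.json.snz" -> ["json", "snz"]
    suffixes = [s.lstrip(".").lower() for s in PurePath(filename).suffixes]

    compression = None
    if suffixes and suffixes[-1] in COMPRESSIONS:
        compression = COMPRESSIONS[suffixes.pop()]

    if not suffixes or suffixes[-1] not in FORMATS:
        raise ValueError(f"cannot infer output format from {filename!r}")

    return suffixes[-1], compression


if __name__ == "__main__":
    for name in ("test.json.snz", "test.parquet.gzip", "test.avro"):
        print(name, "->", parse_output_name(name))
```

Rejecting unknown formats with an error, rather than silently falling back to a default, is one possible design choice; whether a given format actually supports a given codec would still need to be checked separately.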