ornl-oxford / genben

Benchmarking of software frameworks and systems for storage and compute over large-scale genomic data.
MIT License

Multiple Data Set Concatenation #49

eauel closed 5 years ago

eauel commented 5 years ago

This PR adds the ability to concatenate multiple Zarr data sets when benchmarking and when using Dask arrays. This is useful for combining multiple chromosome data sets into a single data set, for example.
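As a rough illustration of the idea (not the code in this PR), the sketch below joins per-chromosome arrays lazily along the variants axis with `dask.array.concatenate`. The helper name `concatenate_variants`, the array shapes, and the in-memory stand-in arrays are assumptions made for the example; with real data each input could instead be opened with `dask.array.from_zarr`.

```python
import dask.array as da
import numpy as np


def concatenate_variants(arrays, axis=0):
    """Lazily join per-chromosome arrays along the variants axis.

    Hypothetical helper for illustration; not part of genben.
    """
    if not arrays:
        raise ValueError("at least one array is required")
    return da.concatenate(arrays, axis=axis)


# In-memory stand-ins for per-chromosome genotype arrays
# (variants x samples x ploidy). With real data, each one could be
# opened from a Zarr store with da.from_zarr instead.
chr1 = da.from_array(np.zeros((4, 3, 2), dtype="i1"), chunks=(2, 3, 2))
chr2 = da.from_array(np.ones((6, 3, 2), dtype="i1"), chunks=(2, 3, 2))

combined = concatenate_variants([chr1, chr2])
print(combined.shape)  # (10, 3, 2)
```

Because the concatenation is lazy, no data is loaded until a computation is requested on `combined`, which is what makes this approach practical for data sets larger than memory.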

To use this feature, set benchmark_dataset to (*) in the configuration file. The benchmark program then concatenates all data sets in the ./data/zarr directory.
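The wildcard behaviour described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the INI-style layout, the `[benchmark]` section name, and the `resolve_datasets` helper are all hypothetical, and the sketch accepts both `*` and `(*)` because the description writes the value as (*).

```python
import configparser
import tempfile
from pathlib import Path


def resolve_datasets(config_text, zarr_root):
    """Return the data set directories selected by benchmark_dataset.

    Hypothetical helper: the [benchmark] section name is an assumption.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    value = parser["benchmark"]["benchmark_dataset"].strip()
    root = Path(zarr_root)
    if value in ("*", "(*)"):
        # Wildcard: select every data set directory, in a stable order.
        return sorted(p for p in root.iterdir() if p.is_dir())
    # Otherwise treat the value as the name of a single data set.
    return [root / value]


# Demonstrate with a temporary directory standing in for ./data/zarr.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("chr1.zarr", "chr2.zarr"):
        (Path(tmp) / name).mkdir()
    selected = resolve_datasets("[benchmark]\nbenchmark_dataset = *\n", tmp)
    names = [p.name for p in selected]

print(names)  # ['chr1.zarr', 'chr2.zarr']
```

Sorting the directory listing keeps the concatenation order deterministic across runs, which matters when benchmark results are compared.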