ornl-oxford / genben

Benchmarking of software frameworks and systems for storage and compute over large-scale genomic data.
MIT License

Add Configuration Options to PSV/CSV Output and Add InfluxDB Output Option #47

eauel closed 5 years ago

eauel commented 5 years ago

This pull request adds configuration options for CSV-format file output. Previously, output was hard-coded to always produce pipe-separated-value files (*.psv). This PR adds a configuration option for the file delimiter so that CSV files can be created as well. Output to a CSV file can now also be enabled or disabled. Both settings are located under the [output.csv] section of the config file.
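As a rough illustration of how such settings might be consumed, here is a minimal sketch. It assumes an INI-style config file read with `configparser`. The key names `enabled` and `delimiter`, the fallback values, and the `write_results` helper are hypothetical and are not taken from this PR's code.

```python
import configparser
import csv
import io

# Hypothetical config contents: the key names "enabled" and "delimiter"
# are assumptions made for this sketch, not the PR's actual option names.
EXAMPLE_CONFIG = """
[output.csv]
enabled = true
delimiter = ,
"""


def write_results(config_text, rows):
    """Return delimited text for rows, or None when CSV output is disabled."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["output.csv"]

    # Skip file output entirely when the (assumed) "enabled" flag is false.
    if not section.getboolean("enabled", fallback=True):
        return None

    # Fall back to the pipe character, matching the earlier *.psv behavior.
    delimiter = section.get("delimiter", fallback="|")

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `write_results(EXAMPLE_CONFIG, rows)` with a list of row lists returns comma-separated text. Leaving `delimiter` out of the section falls back to pipe-separated output in this sketch.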

I have also added the ability to output benchmarking results to an InfluxDB server. I added this feature because I am considering using a tool such as telegraf to collect additional periodic metrics (memory usage, disk usage, etc.) during upcoming benchmark testing, and telegraf also stores its data using InfluxDB. Storing benchmark results in InfluxDB may make it easier to aggregate and analyze data later, so that I can profile characteristics of the system running the benchmark, such as memory usage. New configuration options for this module can be found under the [output.influxdb] configuration section, and configurable parameters include:

One other change uses the datetime.utcnow() function instead of time.time() to fetch the current wall-clock time during benchmarking. This was done to ensure consistency when running on different operating systems (e.g. some systems include leap seconds in the time since the epoch, while others may not). As a result, UTC time is now stored instead of local time.
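A minimal sketch of timing a benchmark step with UTC timestamps is shown below. The `timed` helper is hypothetical and is not the PR's code. The sketch also uses `datetime.now(timezone.utc)`, the timezone-aware counterpart of `datetime.utcnow()`, because `utcnow()` is deprecated as of Python 3.12.

```python
from datetime import datetime, timezone


def timed(func, *args, **kwargs):
    """Run func and return (result, start, end, elapsed_seconds).

    start and end are timezone-aware UTC datetimes, so the stored
    timestamps do not depend on the local time zone of the host.
    """
    start = datetime.now(timezone.utc)
    result = func(*args, **kwargs)
    end = datetime.now(timezone.utc)
    # Subtracting two datetimes gives a timedelta; convert it to seconds.
    elapsed = (end - start).total_seconds()
    return result, start, end, elapsed
```

For example, `timed(sum, [1, 2, 3])` returns the sum together with the UTC start and end timestamps and the elapsed time in seconds.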