dchud / ddbench

Benchmarking suite for dedupe
Apache License 2.0

Reconsider report output format/location/strategy #3

Open dchud opened 8 years ago

dchud commented 8 years ago

The current version writes JSON output to files with common prefixes. We also discussed generating multiple SQLite databases or one centralized database, and there are other options to consider, such as a store optimized for time-series / event data.

If we stick with the current strategy, we could also move all output from a single execution into its own folder, change the folder/file naming strategy, etc.
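A rough sketch of what the per-execution folder idea could look like, assuming each execution produces a dict of named result payloads. The function name `write_run_outputs`, the `output` base directory, and the `run-<UTC timestamp>` naming are all hypothetical and not taken from the current ddbench code:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_run_outputs(results, base_dir="output", started=None):
    """Write every result of one execution into its own timestamped folder.

    `results` maps a short name (hypothetical, e.g. "sequential") to any
    JSON-serializable payload. Returns the list of files written.
    """
    started = started or datetime.now(timezone.utc)
    # One folder per execution, e.g. output/run-20160102T030405Z/
    run_dir = Path(base_dir) / started.strftime("run-%Y%m%dT%H%M%SZ")
    run_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for name, payload in results.items():
        path = run_dir / f"{name}.json"
        path.write_text(
            json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8"
        )
        paths.append(path)
    return paths
```

With this layout the common prefix moves from the file name to the folder name, so cleaning up or archiving a single execution is a matter of handling one directory.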

bbengfort commented 8 years ago

Since you've involved Redis, we can use Redis for output storage when it's available; otherwise, for sequential/multiprocessing runs, I'm leaning toward the disk-based method. JSON is fine, or even TSV/CSV might be OK.
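One way the "Redis if available, otherwise disk" choice might be sketched, purely as an illustration. The class names, the `ddbench:results` key, and the `results.jsonl` path are assumptions; the Redis side only relies on the client exposing an `rpush(key, value)` method, as redis-py clients do:

```python
import json
from pathlib import Path


class DiskSink:
    """Append one JSON object per line to a file on disk."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def write(self, record):
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")


class RedisSink:
    """Push each record, JSON-encoded, onto a Redis list."""

    def __init__(self, client, key):
        self.client = client  # any object with rpush(key, value)
        self.key = key

    def write(self, record):
        self.client.rpush(self.key, json.dumps(record, sort_keys=True))


def make_sink(redis_client=None, key="ddbench:results", path="results.jsonl"):
    """Use Redis when a client is supplied, otherwise fall back to disk."""
    if redis_client is not None:
        return RedisSink(redis_client, key)
    return DiskSink(path)
```

The benchmark code would then call `sink.write(record)` without caring which backend is in use, and a TSV/CSV sink could be added later behind the same `write` method.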