Closed Nekel-Seyew closed 7 years ago
Yes, that's the right idea. Two small tweaks before you get too far into the evaluation.
1 - For clarity with the rest of the examples:
- rename dir makeflow_maker -> benchmark
- rename program make.py -> make_benchmark
- add a brief README.md to explain what's going on here.
2 - I suggest that the output file size not be the sum of the shared and unique input sizes; instead, each task should just mirror its unique input to its unique output. That reflects the common pattern of a single large cacheable reference file with relatively small inputs and outputs.
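To make the mirroring suggestion concrete, here is a rough sketch of what a single task might do. The function name `run_task` and its parameters are made up for illustration and are not taken from make.py: the shared file is read and discarded, and the output ends up exactly as large as the unique input.

```python
def run_task(shared_path, unique_in_path, unique_out_path, chunk=1 << 20):
    """Hypothetical task body: read the shared file, mirror the unique input."""
    # Read the large shared reference file in chunks and discard the data,
    # so it costs I/O but contributes nothing to the output size.
    with open(shared_path, "rb") as shared:
        while shared.read(chunk):
            pass

    # Copy the unique input to the unique output, chunk by chunk.
    written = 0
    with open(unique_in_path, "rb") as src, open(unique_out_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:
                break
            dst.write(block)
            written += len(block)
    return written  # number of bytes written to the output file
```

With this shape, the output size tracks only the unique input, so a large shared file affects caching and transfer behavior without inflating the outputs.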
These tools create a workflow DAG from the input parameters. You can choose whether tasks produce output, how long each task busy-runs, the sizes of the unique and common inputs, whether those inputs are actually read, and so on. Hopefully this will be a useful tool for testing workflow systems.
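As a rough illustration of the kind of workflow this produces, the sketch below emits Make-style rules (target, sources, then a tab-indented command), which is the general shape Makeflow accepts. The function `make_workflow`, the file names, and the `./task` program are assumptions for the example, not the tool's actual interface, and the layout is the simplest possible one: independent tasks that all share one reference file.

```python
def make_workflow(num_tasks, shared="shared.dat", program="./task"):
    """Hypothetical generator: one rule per task, all sharing one input file."""
    lines = []
    for i in range(num_tasks):
        unique_in = f"input.{i}.dat"    # small per-task input (assumed name)
        unique_out = f"output.{i}.dat"  # per-task output mirroring the input
        # Rule header: the output depends on the program and both inputs.
        lines.append(f"{unique_out}: {program} {shared} {unique_in}")
        # Tab-indented command that produces the output.
        lines.append(f"\t{program} {shared} {unique_in} {unique_out}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(make_workflow(3))
```

Every rule lists the same shared file as a source, so a workflow system that caches inputs only needs to move it once per worker.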