This PR introduces parallel computation for the Evaluator object in evaluator.py. The parallelization is handled by pathos, a Python multiprocessing package that uses 'dill' for serialization instead of Python's default 'pickle'.
Testing 'vector_of_counts-4096-ln3-sequential' on 'smoke_test' on a Pixelbook:

Without the #15 fix:
- Running serially on a single core takes ~170 seconds.
- Running in parallel on two cores (hyperthreaded to 4 virtual cores) takes ~130 seconds.

With the #15 fix:
- Running serially on a single core takes ~2.5 seconds.
- Running in parallel on two cores (hyperthreaded to 4 virtual cores) takes ~1.5 seconds.
dill is required because we pass locally defined functions into our simulations; pickle cannot serialize local (nested) functions.
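For context, a minimal stdlib-only sketch of why pickle falls short here (the function names are illustrative, not from this repo):

```python
import pickle

def make_simulation(scale):
    # A locally defined function (closure), the kind of object
    # we want to hand off to worker processes.
    def simulate(x):
        return x * scale
    return simulate

sim = make_simulation(2)
try:
    pickle.dumps(sim)  # stdlib pickle cannot serialize local functions
    serialized = True
except (AttributeError, pickle.PicklingError) as err:
    serialized = False
    print("pickle failed:", err)
```

dill serializes the same object without complaint, which is why pathos (built on dill) can ship these functions to worker processes.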
tqdm is introduced to provide dynamic progress bars and time estimates during parallelization.
tqdm estimates remaining wall time and reports the actual wall time spent. The 'evaluation_run_time' file inside each estimator's evaluation output folder should report approximately the same value whether the evaluation is run in parallel or serially. This is by design, to allow easy comparison while debugging estimators. In a parallel run, evaluation_run_time sums the time spent on all cores, so it measures CPU time rather than wall time.
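The per-core accounting can be sketched as follows. This is a stdlib-only illustration with made-up names, not the evaluator's actual code; each task times its own work, and summing those durations yields a CPU-time figure that stays comparable between serial and parallel runs, unlike wall time:

```python
import time
from concurrent.futures import ProcessPoolExecutor

def timed_evaluation(params):
    # Each worker times its own share of the work.
    start = time.perf_counter()
    result = sum(i * i for i in range(50_000 * params))  # stand-in workload
    return result, time.perf_counter() - start

if __name__ == "__main__":
    wall_start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=2) as pool:
        outcomes = list(pool.map(timed_evaluation, [1, 2, 3, 4]))
    wall_time = time.perf_counter() - wall_start
    # Summing per-task durations measures CPU time; with 2 workers it
    # can exceed wall_time, but it changes little between serial and
    # parallel runs of the same evaluation.
    run_time = sum(elapsed for _, elapsed in outcomes)
    print(f"wall: {wall_time:.3f}s, summed run time: {run_time:.3f}s")
```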
Added an optional parallel_cores argument to run_evaluation.py that specifies how many cores to use when running evaluations. If not specified, all available CPU cores are used.
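A hedged sketch of how such a flag might be wired up (the argument name comes from this PR; the parser function itself is illustrative, not the repo's actual run_evaluation.py):

```python
import argparse
import multiprocessing

def resolve_parallel_cores(argv=None):
    """Parse --parallel_cores, defaulting to every available CPU core."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--parallel_cores", type=int, default=None, required=False,
        help="Number of cores to use; defaults to all available cores.")
    args = parser.parse_args(argv)
    return args.parallel_cores or multiprocessing.cpu_count()

print(resolve_parallel_cores(["--parallel_cores", "2"]))  # 2
print(resolve_parallel_cores([]))  # cpu_count() on this machine
```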