world-federation-of-advertisers / cardinality_estimation_evaluation_framework

Evaluation framework and methods for estimating cardinalities of groups of sets
Apache License 2.0

Modify the modules in the evaluations folder to support more simulation parameters. #46

Closed (huangxichen1 closed 4 years ago)

huangxichen1 commented 4 years ago
  1. Modify the sketch estimator config so that it supports both a sketch epsilon and an estimate epsilon.
  2. Move SketchEstimatorConfig from simulator to configs.
  3. Modify the interoperability test to make sure all of run_evaluation.py is well tested.
huangxichen1 commented 4 years ago

Example bash command for running the cardinality estimator evaluation:

OUT_DIR="temp_output"
NUM_RUNS=2

python3 src/evaluations/run_evaluation.py \
--evaluation_out_dir="$OUT_DIR" \
--analysis_out_dir="$OUT_DIR" \
--report_out_dir="$OUT_DIR" \
--evaluation_config="complete_test_with_selected_parameters" \
--sketch_estimator_configs='log_bloom_filter-1e5-first_moment_log-1.0986-infty' \
--sketch_estimator_configs='log_bloom_filter-1e5-first_moment_log-0.2747-infty' \
--sketch_estimator_configs='log_bloom_filter-1e5-first_moment_log-0.1099-infty' \
--sketch_estimator_configs='log_bloom_filter-1e5-first_moment_log-infty-1.0986' \
--sketch_estimator_configs='exp_bloom_filter-1e5_10-first_moment_exp-1.0986-infty' \
--sketch_estimator_configs='exp_bloom_filter-1e5_10-first_moment_exp-0.2747-infty' \
--sketch_estimator_configs='exp_bloom_filter-1e5_10-first_moment_exp-0.1099-infty' \
--sketch_estimator_configs='exp_bloom_filter-1e5_10-first_moment_exp-infty-1.0986' \
--sketch_estimator_configs='vector_of_counts-4096-sequential-1.0986-infty' \
--sketch_estimator_configs='vector_of_counts-4096-sequential-0.2747-infty' \
--sketch_estimator_configs='vector_of_counts-4096-sequential-0.1099-infty' \
--sketch_estimator_configs='vector_of_counts-4096-sequential-infty-1.0986' \
--sketch_estimator_configs='geo_bloom_filter-1e4_0.0012-first_moment_geo-1.0986-infty' \
--sketch_estimator_configs='geo_bloom_filter-1e4_0.0012-first_moment_geo-0.2747-infty' \
--sketch_estimator_configs='geo_bloom_filter-1e4_0.0012-first_moment_geo-0.1099-infty' \
--sketch_estimator_configs='geo_bloom_filter-1e4_0.0012-first_moment_geo-infty-1.0986' \
--evaluation_run_name="results" \
--num_runs="$NUM_RUNS"
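
For readers unfamiliar with the config names passed to --sketch_estimator_configs above, here is a rough sketch of how such a name could be split into its parts. It assumes each name has five dash-separated fields and that the last two are the sketch epsilon followed by the estimate epsilon, with "infty" meaning no noise at that stage. That ordering is inferred from the examples in the command and is not confirmed by the thread. ParsedConfigName, parse_epsilon, and parse_config_name are hypothetical helpers for illustration, not part of the repository.

```python
import math
from typing import NamedTuple


class ParsedConfigName(NamedTuple):
    """Hypothetical container for illustration; not a class from the repository."""
    sketch: str
    sketch_params: str
    estimator: str
    sketch_epsilon: float    # assumption: fourth field of the name
    estimate_epsilon: float  # assumption: fifth field of the name


def parse_epsilon(text: str) -> float:
    """Map the literal 'infty' to math.inf; parse anything else as a float."""
    return math.inf if text == "infty" else float(text)


def parse_config_name(name: str) -> ParsedConfigName:
    """Split a dash-separated config name into its five assumed fields."""
    parts = name.split("-")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 dash-separated fields, got {len(parts)}: {name!r}"
        )
    sketch, sketch_params, estimator, sketch_eps, estimate_eps = parts
    return ParsedConfigName(
        sketch=sketch,
        sketch_params=sketch_params,
        estimator=estimator,
        sketch_epsilon=parse_epsilon(sketch_eps),
        estimate_epsilon=parse_epsilon(estimate_eps),
    )


if __name__ == "__main__":
    # Two names taken from the command above.
    for name in (
        "log_bloom_filter-1e5-first_moment_log-1.0986-infty",
        "geo_bloom_filter-1e4_0.0012-first_moment_geo-infty-1.0986",
    ):
        print(parse_config_name(name))
```

Splitting on "-" only works because none of the fields in the names above contains a dash (for example, a parameter written as 1e-5 would break it), so treat this as a reading aid, not as the parsing logic used by run_evaluation.py.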
huangxichen1 commented 4 years ago

I will close this PR, as it is too complicated to split into small PRs that each do a specific task.