
FedShop: The Federated Shop Benchmark


FedShop is a synthetic RDF federated benchmark designed for scalability. It evaluates the performance of SPARQL federated-query engines, such as FedX, CostFed, Semagrow, Splendid, and HeFQUIN, as the number of federation members grows. FedShop is built around an e-commerce scenario with online shops and rating Web sites, as in BSBM. Compared to BSBM, each shop and rating site in FedShop has its own SPARQL endpoint and shares a common catalogue of products. Following the BSBM idea, the FedShop queries simulate a user navigating the federation of shops as if it were a single virtual shop. The scale factor corresponds to the number of shops and rating sites in the federation. Hence, with the FedShop benchmark, we can observe the performance of federated queries as the number of federation members increases.

FedShop consists of three components:

- the FedShop data generator,
- the FedShop200 datasets and queries,
- the FedShop runner used to evaluate federated-query engines.

QuickStart and Documentation

FedShop was published at ISWC 2023 as a resource paper:

FedShop200 Datasets and Queries

FedShop200 is a basic set of datasets and queries generated with FedShop. It contains 120 SPARQL queries and datasets to populate a federation of up to 200 endpoints. It is available at DOI.

Instead of downloading the complete archive, you can also download only individual parts of FedShop:

FedShop200 Results

A first evaluation of existing SPARQL federation engines on FedShop200, computed by the FedShop runner, is available as a Jupyter Notebook:

FedShop Data Generator

The FedShop Data Generator is defined as three WatDiv template models in experiments/bsbm/model. These models follow the BSBM specification as closely as possible. Using WatDiv models allows changing the schema easily through the configuration file experiments/bsbm/config.yaml.

Most of the parameters of FedShop are set in experiments/bsbm/config.yaml, including the number of products to generate and the number of vendors and rating sites.

Basic statistics about the default configuration of FedShop are available in the Jupyter notebook.

Generate Datasets and Queries

Once config.yaml is properly set, you can launch the generation of the FedShop benchmark with the following command:

python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml  [OPTIONS]

OPTIONS:
--clean [benchmark|metrics|instances][+db]: clean the benchmark, metrics, or instances artifacts, and optionally (+db) destroy all database containers
--touch: mark a phase as "terminated" so that Snakemake does not rerun it.
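
For example, a full generation run invokes the command once for the data and once for the queries; the sketch below only combines options that are already documented above:

```bash
# Generate the datasets, then the queries, for the federation described in config.yaml
python rsfb/benchmark.py generate data experiments/bsbm/config.yaml
python rsfb/benchmark.py generate queries experiments/bsbm/config.yaml

# Start over from scratch: drop previously generated benchmark artifacts and the database containers
python rsfb/benchmark.py generate data experiments/bsbm/config.yaml --clean benchmark+db
```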

This process is long and complex. All the artifacts produced during generation are created under experiments/bsbm: datasets under experiments/bsbm/model/dataset, and queries under experiments/bsbm/benchmark/generation.

The overall workflow for FedShop generation is as follows:

We completed this process with a federation of 200 different federation members. This overall workflow can be changed through the parameters declared in experiments/bsbm/config.yaml.

Please note:

Evaluate federated-query engines using FedShop Runner

As the number of federation members can be high, running one SPARQL endpoint per federation member becomes impractical. We therefore ingested all shops and rating sites into a single Virtuoso server as virtual endpoints, i.e., each shop and rating site has its own virtual SPARQL endpoint. The different configurations relative to Batch(i) are available to configure a given federated-query engine. At this stage, it is possible to run the whole FedShop benchmark with Kobe. However, we also provide a benchmark runner based on Snakemake that is convenient for managing failures during the execution of the benchmark.

Federated-query engines must implement a template to be integrated into the evaluation workflow. Many templates are already available in rsfb/engines/. Once integrated, the engine to be tested must be declared in experiments/bsbm/config.yaml in order to run.

The following command launches the evaluation:

python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete [OPTIONS]

OPTIONS:
--clean [benchmark|metrics|instances][+db]: clean the benchmark, metrics, or instances artifacts, and optionally (+db) destroy all database containers
--touch: mark a phase as "terminated" so that Snakemake does not rerun it.
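
As with generation, these options can be combined on the evaluate command; the sketch below uses only the options documented above:

```bash
# Launch (or resume) the evaluation of every engine declared in config.yaml
python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete

# Discard only the computed metrics before rerunning, keeping the other artifacts
python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete --clean metrics
```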

This launches the evaluation of the FedShop workload over the different federations Batch(i) with the federated-query engines declared in experiments/bsbm/config.yaml. As for the generation, this process is long and complex and is managed by Snakemake. The evaluation rules are declared in experiments/bsbm/evaluate.smk. All the results are produced under experiments/bsbm/benchmark/evaluation.

Our Jupyter notebook is already set up to read the results and compute the various metrics.

Benchmark your engine:

```yaml
evaluation:
  n_attempts: 4
  timeout: 600
  engines:
    fedx:
      dir: "engines/FedX/target"
    ...
    <your_engine>:
      <keyN>: <valueN>
```

- Compare to other engines using our Jupyter Notebook.

## Most used commands:

```bash
# Remove Snakemake log directory
rm -rf .snakemake

# Continue the workflow if interrupted 
python rsfb/benchmark.py generate|evaluate experiments/bsbm/config.yaml --rerun-incomplete

# Delete everything and restart
python rsfb/benchmark.py generate|evaluate experiments/bsbm/config.yaml --rerun-incomplete --clean all

# Keep the data but remove the intermediary artefacts and db containers.
python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml --rerun-incomplete --clean benchmark+db

# Only remove the metrics files, applicable when you need to rerun some of the steps
python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml --rerun-incomplete --clean metrics

```

FedShop Contributors