
FedShop: The Federated Shop Benchmark


FedShop is a synthetic RDF federated benchmark designed for scalability. It evaluates the performance of SPARQL federated-query engines, such as FedX, CostFed, Semagrow, Splendid, and HeFQUIN, as the number of federation members grows. FedShop is built around an e-commerce scenario with online shops and rating Web sites, as in BSBM. Compared to BSBM, each shop and rating site in FedShop has its own SPARQL endpoint and shares a common catalogue of products. Following the BSBM idea, the FedShop queries simulate a user navigating the federation of shops as if it were a single virtual shop. The scale factor corresponds to the number of shops and rating sites in the federation. Hence, with the FedShop benchmark, we can observe the performance of federated queries as the number of federation members increases.

FedShop consists of three components:

- the FedShop data generator,
- the FedShop200 datasets and queries,
- the FedShop runner used to evaluate federated-query engines.

QuickStart and Documentation

FedShop was published at ISWC 2023 as a resource paper:

FedShop200 Datasets and Queries

FedShop200 is a basic set of datasets and queries generated with FedShop. It contains 120 SPARQL queries and datasets to populate a federation of up to 200 endpoints. It is available at DOI.

Instead of downloading the complete archive, you can also download only individual parts of FedShop:

FedShop200 Results

A first evaluation of existing SPARQL federation engines on FedShop200, computed by the FedShop runner, is available as a Jupyter Notebook:

FedShop Data Generator

The FedShop Data Generator is defined as three WatDiv template models in experiments/bsbm/model. These models follow the BSBM specification as closely as possible. Using WatDiv models allows changing the schema easily through the configuration file experiments/bsbm/config.yaml.

Most of the parameters of FedShop are set in experiments/bsbm/config.yaml, including the number of products to generate and the number of vendors and rating sites.

Basic statistics about the default configuration of FedShop are available in the Jupyter notebook.

Generate Datasets and Queries

Once config.yaml is properly set, you can launch the generation of the FedShop benchmark with the following command:

python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml  [OPTIONS]

OPTIONS:
--clean [benchmark|metrics|instances][+db]: clean the benchmark, metrics, or instances artifacts, and optionally (+db) destroy all database containers
--touch: mark a phase as "terminated" so that Snakemake does not rerun it.
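
For example, a full generation run invokes the command once for the data and once for the queries; the sketch below only combines options that are already documented above:

```bash
# Generate the datasets, then the queries, for the federation described in config.yaml
python rsfb/benchmark.py generate data experiments/bsbm/config.yaml
python rsfb/benchmark.py generate queries experiments/bsbm/config.yaml

# Start over from scratch: drop previously generated benchmark artifacts and the database containers
python rsfb/benchmark.py generate data experiments/bsbm/config.yaml --clean benchmark+db
```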

This process is long and complex. All the artifacts produced during generation are created under experiments/bsbm: datasets under experiments/bsbm/model/dataset, and queries under experiments/bsbm/benchmark/generation.

The overall workflow for FedShop generation is as follows:

We completed this process with a federation of 200 different federation members. This overall workflow can be changed through the parameters declared in experiments/bsbm/config.yaml.

Please note:

Evaluate federated-query engines using FedShop Runner

As the number of federation members can be high, running one SPARQL endpoint per federation member becomes impractical. We therefore ingested all shops and rating sites into a single Virtuoso server as virtual endpoints, i.e., each shop and rating site has its own virtual SPARQL endpoint. The different configurations relative to Batch(i) are available to configure a given federated-query engine. At this stage, it is possible to run the whole FedShop benchmark with Kobe. However, we also provide a benchmark runner based on Snakemake that is convenient for managing failures during the execution of the benchmark.

Federated-query engines must implement a template to be integrated into the evaluation workflow. Many templates are already available in rsfb/engines/. Once integrated, the engine to be tested must be declared in experiments/bsbm/config.yaml in order to run.

The following command launches the evaluation:

python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete [OPTIONS]

OPTIONS:
--clean [benchmark|metrics|instances][+db]: clean the benchmark, metrics, or instances artifacts, and optionally (+db) destroy all database containers
--touch: mark a phase as "terminated" so that Snakemake does not rerun it.
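
As with generation, these options can be combined on the evaluate command; the sketch below uses only the options documented above:

```bash
# Launch (or resume) the evaluation of every engine declared in config.yaml
python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete

# Discard only the computed metrics before rerunning, keeping the other artifacts
python rsfb/benchmark.py evaluate experiments/bsbm/config.yaml --rerun-incomplete --clean metrics
```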

This launches the evaluation of the FedShop workload over the different federations Batch(i) with the federated-query engines declared in experiments/bsbm/config.yaml. As for the generation, this process is long and complex and is managed by Snakemake. The evaluation rules are declared in experiments/bsbm/evaluate.smk. All the results are produced under experiments/bsbm/benchmark/evaluation.

Our Jupyter notebook is already set up to read the results and compute the various metrics.

Benchmark your engine:

```yaml
evaluation:
  n_attempts: 4
  timeout: 600
  engines:
    fedx:
      dir: "engines/FedX/target"
    ...
    <your_engine>:
      <keyN>: <valueN>
```

- Compare to other engines using our Jupyter Notebook.

## Most used commands:

```bash
# Remove Snakemake log directory
rm -rf .snakemake

# Continue the workflow if interrupted 
python rsfb/benchmark.py generate|evaluate experiments/bsbm/config.yaml --rerun-incomplete

# Delete everything and restart
python rsfb/benchmark.py generate|evaluate experiments/bsbm/config.yaml --rerun-incomplete --clean all

# Keep the data but remove the intermediary artefacts and db containers.
python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml --rerun-incomplete --clean benchmark+db

# Only remove the metrics files, applicable when you need to rerun some of the steps
python rsfb/benchmark.py generate data|queries experiments/bsbm/config.yaml --rerun-incomplete --clean metrics

```

FedShop Contributors