
Planter


Introducing Planter

Planter is a modular framework for realizing one-click in-network machine learning algorithms. All you need to provide are a configuration file (Name_data.py) and a dataset. Planter takes it from there and offloads your machine learning classification task onto a programmable data plane. This is the artifact for the paper "Planter: Rapid Prototyping of In-Network Machine Learning Inference" in SIGCOMM CCR (an early arXiv version was titled "Automating In-Network Machine Learning").

💡 Please check Planter's user manual PDF (strongly recommended).

Setting up the Planter environment

Planter requires Python 3, with the packages listed in requirements_pip3.txt. To install these packages, start your working environment and execute the following command:

pip3 install -r ./src/configs/requirements_pip3.txt

Some packages need to be installed using sudo python3:

sudo pip3 install -r ./src/configs/requirements_sudo_pip3.txt

Getting started with Planter

First, prepare a working environment as described in the previous section.

Run the following command to start Planter in manual configuration mode.

python3 Planter.py -m

Use the help flag (-h) to see additional command options, e.g., -t and -d.

💡 A detailed getting started tutorial is available in the wiki (strongly recommended).

Planter Supports

The Planter Workflow

Planter Framework

Model Trainer & Converter:

Model training, model conversion (table generation), and Python-based checking of the generated table entries are all done by a table_generator.py file. The file is located under the src/models folder, within each <model_name>/<type_n> folder, e.g., /src/models/RF/Type_EB. All the table entries, registers, and constants needed in the pipeline are generated by this file, which is unique to each machine learning model mapping.
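As a rough illustration of the idea behind a table generator, the sketch below converts the split thresholds of a (hand-written, one-feature) decision stump into range-match table entries. The function name and entry format are invented for this example; Planter's real generators handle full trained models and per-target entry formats.

```python
# Illustrative only: turn 1-D decision-tree thresholds into range-match
# table entries (lo, hi, class_label), the kind of output a
# table_generator.py produces for the data plane.

def tree_to_entries(thresholds, labels):
    """Map feature ranges split at `thresholds` to class labels.

    Assumes an 8-bit feature, so values span 0..255.
    """
    bounds = [0] + list(thresholds) + [256]
    return [(bounds[i], bounds[i + 1] - 1, labels[i])
            for i in range(len(labels))]

# A stump that splits the feature at 64 and 128 into three classes.
entries = tree_to_entries(thresholds=[64, 128], labels=[0, 1, 2])
for lo, hi, label in entries:
    print(f"range {lo}-{hi} -> class {label}")
```

Each tuple corresponds to one match/action table entry the control plane would install.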

P4 Generator:

Planter supports multiple target architectures, located under src/architectures/. In each architecture's folder, there is a file called p4_generator.py that manages and calls all the files required to generate the P4 code (MODEL_usecase_dataset.p4) for a switch model.

Specifically, two essential files are called. The first is common_p4.py (located under the use-case folder src/use_cases/<use_case_name>). This file includes the common P4 code (What is P4?) used by the use case. The second is dedicate_p4.py under the model folder ./src/models/<model_name>/<type_n>, which stores the P4 code required by the chosen model.
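Conceptually, the generator stitches together the common and model-specific fragments in pipeline order. The sketch below shows that idea only; the function name and section keys are invented for illustration and are not Planter's actual API.

```python
# Illustrative only: assemble a P4 program from common (use-case) and
# dedicated (model) code fragments, section by section.

def generate_p4(common_blocks, dedicated_blocks):
    """Concatenate P4 code fragments in pipeline order."""
    program = []
    for section in ("headers", "parser", "ingress", "deparser"):
        program.append(common_blocks.get(section, ""))
        program.append(dedicated_blocks.get(section, ""))
    # Drop empty sections, keep the rest in order.
    return "\n".join(block for block in program if block)

common = {"headers": "header ethernet_t { /* ... */ }"}
dedicated = {"ingress": "table feature_table { /* model tables */ }"}
print(generate_p4(common, dedicated))
```

The real p4_generator.py does considerably more (per-target templates, feature tables, registers), but the division of labor between common and dedicated code is the same.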

Model Compiler & Tester:

A generated P4 file is compiled and loaded onto the target using dedicated target scripts under /src/targets/<target_name>. The compiler further loads the generated M/A table entries (or registers) onto the target device. The model tester (test_model.py, in the same folder) sends packets to the target to verify its functionality.

Simple Test Guide:

Getting Started Tutorial (💡 strongly recommended):

A detailed tutorial wiki provides a step-by-step guide to running a sample RF model on BMv2.

Output Metrics:

Generally, each test produces three classification reports. Each report has a similar structure, as shown below:

               precision    recall  f1-score   support

           0     1.0000    1.0000    1.0000        13
           1     0.9500    0.9500    0.9500        20
           2     0.9167    0.9167    0.9167        12

    accuracy                         0.9556        45
   macro avg     0.9556    0.9556    0.9556        45
weighted avg     0.9556    0.9556    0.9556        45
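For reference, the per-class numbers in such a report are standard precision/recall/F1 scores. The short sketch below derives them from scratch for one class, using invented predictions (it is not Planter code):

```python
# Illustrative only: compute precision, recall, and F1 for one class
# from true and predicted labels, as reported per class above.

def prf(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]  # one sample of class 1 mislabeled as 2
print(prf(y_true, y_pred, 2))
```

Macro avg is the unweighted mean of these per-class scores; weighted avg weights each class by its support.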

A more detailed report is shown in the sample tutorial wiki.

Performance Mode:

To test Planter in performance mode, configure the model using the following table. The table shows which mapping (EB, LB, or DM) can be used with each model type when the performance use case is chosen. Because manual optimization is applied, the mappings supported by performance mode consume fewer stages, but they work only with the performance use case.

| Use case | DT | RF | XGB | IF | SVM | NB | KM | KNN | PCA | AE | NN |
|----------|----|----|-----|----|-----|----|----|-----|-----|----|----|
| Mapping | EB & DM | EB & DM | EB & DM | EB | LB | LB | LB & EB | EB | LB | LB | DM |
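The table can also be read as a simple lookup, sketched below. The dictionary mirrors the README table; the helper name is invented for illustration and is not an actual Planter data structure.

```python
# Illustrative only: performance-mode mapping support per model,
# transcribed from the table above.

PERFORMANCE_MAPPINGS = {
    "DT": {"EB", "DM"}, "RF": {"EB", "DM"}, "XGB": {"EB", "DM"},
    "IF": {"EB"}, "SVM": {"LB"}, "NB": {"LB"}, "KM": {"LB", "EB"},
    "KNN": {"EB"}, "PCA": {"LB"}, "AE": {"LB"}, "NN": {"DM"},
}

def supports(model, mapping):
    """Return True if `mapping` is available for `model` in performance mode."""
    return mapping in PERFORMANCE_MAPPINGS.get(model, set())

print(supports("RF", "EB"))   # RF supports EB in performance mode
print(supports("SVM", "EB"))  # SVM supports only LB
```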

Throughput Test:

To test Planter's mapped models' throughput on a P4Pi-enabled BMv2, follow the throughput test wiki.

Adding Your Own Design

Reporting a Bug

Please submit an issue with the appropriate label on GitHub.

License

The files are licensed under the Apache License 2.0. The text of the license can be found in the LICENSE file.

Applications

Please access the Planter project's history and recent applications through the link. If your work uses Planter, please email us; we will include your latest publication or project in the application list.

Citation

If you use this code, please cite our paper:

@article{zheng2024automating,
  title={{Planter: Rapid Prototyping of In-Network Machine Learning Inference}},
  author={Zheng, Changgang and Zang, Mingyuan and Hong, Xinpeng and Perreault, Liam and Bensoussane, Riyad and Vargaftik, Shay and Ben-Itzhak, Yaniv and Zilberman, Noa},
  journal={ACM SIGCOMM Computer Communication Review},
  year={2024}
}

Acknowledgments

The following people contributed to this project: Changgang Zheng, Mingyuan Zang, Xinpeng Hong, Liam Perreault, Riyad Bensoussane, Shay Vargaftik, Yaniv Ben-Itzhak, and Noa Zilberman. In addition, Peng Qian contributed to this repository. This work was partly funded by VMware and the EU Horizon SMARTEDGE (101092908, UKRI 10056403). We acknowledge support from Intel and NVIDIA.