rick-heig / nvme_csd

A Software Framework for Hardware Agnostic NVMe Computational Storage Devices

NVMe Computational Storage Devices based on Open-Source Software and readily available Hardware

This project aims to allow anyone to create their own Non-Volatile Memory Express (NVMe) Computational Storage Devices with readily available hardware platforms and open-source software.

View our paper (open-access) for background, technical details, benchmarks, and applications.

What is computational storage?

Computational storage is a new processing paradigm and architecture that employs storage devices capable of performing computations.

Computational Storage

Computational Storage Devices (CSDs) help reduce the data bottleneck to the main processor. They also allow processing capabilities to scale with the amount of data.

Computational Storage Scaling

CSDs can even perform computations during the main processor's downtime. They can be used for lengthy tasks such as video transcoding or AI model training.

Our approach to computational storage

Our computational storage devices are based on the Non-Volatile Memory Express (NVMe) open standard and present themselves as such to a computer. They can be used as a traditional NVMe SSD connected over PCIe.

We build them around hardware that is capable of running Linux and implement our NVMe firmware within Linux. This allows for a very wide range of hardware: we already support multiple platforms with different capabilities and hardware accelerators (e.g., GPU, FPGA, NPU, video codecs).

CSD Anatomy

Computational functions can be activated through NVMe custom commands as well as TCP/IP tunneled over NVMe. This allows for easy integration and use of CSDs in already existing architectures.
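As an illustration of what issuing a custom NVMe command from the host can look like, the sketch below builds an nvme-cli io-passthru command line. It is not taken from this project: the helper name csd_passthru_cmd, the opcode, the data length, and the read direction are placeholders, and the only assumption about the firmware is that custom I/O commands use the vendor-specific opcode range (0x80 to 0xff) of the NVMe specification. The commands the firmware actually implements are documented in the host directory.

```shell
#!/bin/sh
# Build (but do not run) an nvme-cli passthrough command for a custom I/O opcode.
# Hypothetical helper: opcode and payload layout are placeholders, not the
# project's real command set.
# Arguments: device node, opcode (decimal or 0x-prefixed hex), data length in bytes.
csd_passthru_cmd() {
    dev=$1
    opcode=$(( $2 ))   # arithmetic expansion accepts decimal or hex input
    len=$3
    # Vendor-specific I/O opcodes live in the range 0x80-0xff.
    if [ "$opcode" -lt 128 ] || [ "$opcode" -gt 255 ]; then
        echo "error: opcode outside vendor-specific range 0x80-0xff" >&2
        return 1
    fi
    printf 'sudo nvme io-passthru %s --opcode=0x%02x --namespace-id=1 --data-len=%s --read\n' \
        "$dev" "$opcode" "$len"
}
```

For example, csd_passthru_cmd /dev/nvme1n1 0x80 4096 prints a command that would request 4096 bytes from the device with opcode 0x80. Printing instead of executing makes it easy to inspect the command before sending anything to real hardware.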

List of supported hardware platforms

Our goal is to support a wide variety of readily available hardware in order to lower the barrier to entry for CSD development as much as possible. We have made it possible to run our CSD firmware on platforms ranging from expensive FPGA SoCs to cheap (under $100) off-the-shelf single-board computers.

Our vision

See the list of supported hardware in the platforms README.

NVMe CSD Firmware

A Portable Linux-based Firmware for NVMe Computational Storage Devices - Paper

This project provides an open-source firmware to build hardware agnostic NVMe computational storage devices.

The firmware is implemented as a Linux PCI endpoint function driver (see https://docs.kernel.org/PCI/endpoint/pci-endpoint.html).

This allows the firmware to run on any target hardware that supports Linux and provides a PCI endpoint controller driver.
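A quick way to check whether a board meets this requirement is to look for registered endpoint controllers in sysfs. The sketch below assumes that endpoint controllers appear under /sys/class/pci_epc, as described in the kernel PCI endpoint documentation. The function name list_epc_controllers and its optional sysfs root argument are hypothetical helpers, not part of this project.

```shell
#!/bin/sh
# List the PCI endpoint controllers (EPCs) registered with the running kernel.
# Assumption: EPC devices show up under <sysfs root>/class/pci_epc.
# The optional argument overrides the sysfs root (default: /sys).
list_epc_controllers() {
    epc_dir="${1:-/sys}/class/pci_epc"
    if [ ! -d "$epc_dir" ]; then
        echo "no pci_epc class found: endpoint support is likely not enabled" >&2
        return 1
    fi
    status=1   # stays 1 (failure) unless at least one controller is found
    for entry in "$epc_dir"/*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$entry" ] || continue
        basename "$entry"
        status=0
    done
    return "$status"
}
```

Run on the target platform (not the host), list_epc_controllers prints one controller name per line and returns a non-zero status when none is present, which suggests the kernel lacks an endpoint controller driver for that hardware.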

The NVMe CSD firmware is based on a Linux NVMe PCI endpoint function that is still under development, which is itself based on an initial RFC by Alan Mikhak (https://lwn.net/Articles/804369/).

Citation

@article{wertenbroek2024portable,
  title={A Portable Linux-based Firmware for NVMe Computational Storage Devices},
  author={Wertenbroek, Rick and Thoma, Yann and Dassatti, Alberto},
  journal={ACM Transactions on Storage},
  year={2024},
  publisher={ACM New York, NY}
}

Directory structure

Running the CSD on a platform

1) Follow the instructions in the chosen platform directory to build the kernel and RootFS and prepare the platform.
2) Connect the platform to the host computer via PCIe.
3) Start the CSD driver on the given platform with the following command:

sudo nvme-epf-script -q <number of queues> -l <backend storage block device> --threads <number of transfer threads> start

For example, with a SATA SSD/HDD or USB flash drive attached as /dev/sda (use /dev/ram0 instead for a Linux RAM disk block device):

sudo nvme-epf-script -q 4 -l /dev/sda --threads 4 start
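Because a typo in the backend path or a non-numeric count is easy to make, a small pre-flight check can help before starting the driver. The sketch below uses only the nvme-epf-script flags shown above; the helper names (is_positive_int, csd_start_cmd, csd_preflight) are hypothetical and not part of this project.

```shell
#!/bin/sh
# Pre-flight checks for starting the CSD (hypothetical helpers).

# Succeed only for a strictly positive decimal integer.
is_positive_int() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ]
}

# Print the start command for a queue count, backend device, and thread count.
csd_start_cmd() {
    printf 'sudo nvme-epf-script -q %s -l %s --threads %s start\n' "$1" "$2" "$3"
}

# Validate the three arguments, then print the command to run.
csd_preflight() {
    is_positive_int "$1" || { echo "error: invalid queue count: $1" >&2; return 1; }
    [ -b "$2" ] || { echo "error: $2 is not a block device" >&2; return 1; }
    is_positive_int "$3" || { echo "error: invalid thread count: $3" >&2; return 1; }
    csd_start_cmd "$1" "$2" "$3"
}
```

On a platform where /dev/sda exists, csd_preflight 4 /dev/sda 4 prints the example command; otherwise it reports which argument failed and returns a non-zero status.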

4) Turn on the host computer.
5) Verify that the CSD is recognized by the host computer. For this we recommend the NVMe command line utility nvme-cli (https://github.com/linux-nvme/nvme-cli), which can be installed through most package managers, e.g., on Ubuntu with sudo apt install nvme-cli.

sudo nvme list

This lists all NVMe drives, e.g., a Samsung 970 and the CSD, which appears as "Linux pci_epf":

Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev  
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          S4EWNM0TA08899W      Samsung SSD 970 EVO Plus 1TB             1         580.54  GB /   1.00  TB    512   B +  0 B   2B2QEXM7
/dev/nvme1n1          0df8d0659e3ecf8c2a94 Linux pci_epf                            1         500.11  GB / 500.11  GB    512   B +  0 B   6.5.0-rc
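In scripts it can be handy to locate the CSD automatically. The sketch below filters nvme list output for the "Linux pci_epf" model string shown above and prints the matching device nodes. The helper name find_csd_nodes is hypothetical, and the approach assumes the model string stays the same across firmware versions.

```shell
#!/bin/sh
# Read `nvme list` output on stdin and print the device node (first column)
# of every drive whose model string is "Linux pci_epf".
# Hypothetical helper; assumes the Node column comes first, as in the table above.
find_csd_nodes() {
    awk '/Linux pci_epf/ { print $1 }'
}

# Typical use on the host:
#   sudo nvme list | find_csd_nodes
```

Reading from standard input keeps the helper independent of how nvme list is invoked, so it can also be tried on saved output.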

6) The CSD can be used as a normal disk.
7) For the computational capabilities and demos check the README in the host directory.
8) For SSH and adding the CSD to the host network check the host/socket_relay directory.
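Before putting data on the CSD, a non-destructive read is a simple way to confirm that it behaves like a regular block device. The sketch below reads the first few MiB with dd and discards them. The helper name read_check is hypothetical, /dev/nvme1n1 is taken from the example listing above, and the caller is assumed to have read permission on the device node (typically root).

```shell
#!/bin/sh
# Read the first N MiB of a device (or file) and discard the data.
# Non-destructive: nothing is written to the device.
# Arguments: path, number of MiB to read (default: 64).
read_check() {
    # dd reports its statistics on stderr; silence them and rely on the exit status.
    dd if="$1" of=/dev/null bs=1048576 count="${2:-64}" 2>/dev/null
}

# Typical use on the host, as a user allowed to read the device:
#   read_check /dev/nvme1n1 && echo "CSD is readable"
```

A non-zero exit status means the path could not be opened or read, which usually points to a missing device node or insufficient permissions.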

Information for developers

The global architecture is presented in the diagram below:

NVMe CSD diagram

We will follow the diagram from top to bottom.

This serves as a very light overview of the code. If you are interested in development and have questions, please open an issue on GitHub to start a discussion.