facebook / CacheLib

Pluggable in-process caching engine to build and scale high performance services
https://www.cachelib.org
Apache License 2.0

(PR2) Adds the support for Flexible Data Placement(FDP) over NVMe into the Cachelib #277

Closed arungeorge83 closed 7 months ago

arungeorge83 commented 9 months ago

This adds device-layer support for NVMe FDP semantics and makes the upper layers of Navy aware of FDP Reclaim Unit Handles (RUHs). This allows Navy's BlockCache (large items) and BigHash (small items) to segregate their data streams on the physical NAND media by using FDP placement identifiers.

With these changes, CacheLib can significantly reduce the device write amplification factor (WAF) for most CacheLib workloads, even at high SSD utilization ("nvmCacheSizeMB" above 50% of the SSD capacity).
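For readers unfamiliar with the metric, device WAF is the ratio of bytes physically written to NAND to bytes the host asked the device to write. A minimal sketch of that arithmetic follows; the function name and the numbers in the usage note are illustrative assumptions, not measurements from this PR.

```cpp
#include <cstdint>
#include <stdexcept>

// Device write amplification factor (WAF):
//   bytes physically written to NAND / bytes written by the host.
// A value of 1.0 means the device performed no extra writes of its own.
// Hypothetical helper for illustration; not part of CacheLib.
double deviceWaf(std::uint64_t nandBytesWritten,
                 std::uint64_t hostBytesWritten) {
  if (hostBytesWritten == 0) {
    throw std::invalid_argument("hostBytesWritten must be non-zero");
  }
  return static_cast<double>(nandBytesWritten) /
         static_cast<double>(hostBytesWritten);
}
```

For example, if the host wrote 100 GB and the device wrote 300 GB to NAND over the same interval, `deviceWaf` returns 3.0; segregating BlockCache and BigHash streams aims to push that ratio toward 1.0.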

This commit introduces a 'placementHandle' concept for data placement, which both BlockCache (BC) and BigHash (BH) can pass on device write() calls, in particular for FDP placement. A 'placementHandle' has to be allocated from the device.
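The handle idea can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in this PR: `SketchDevice`, `PlacementHandle`, `kDefaultPlacement`, and `placementIdFor` are hypothetical names, and the real device layer may allocate and resolve handles differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the CacheLib API.
using PlacementHandle = int;
constexpr PlacementHandle kDefaultPlacement = -1;  // no placement preference

class SketchDevice {
 public:
  // placementIds: the placement identifiers the device reports as usable.
  explicit SketchDevice(std::vector<std::uint16_t> placementIds)
      : placementIds_(std::move(placementIds)) {}

  // Hand out the next unused placement identifier as an opaque handle.
  // Once all identifiers are taken, callers get the default handle.
  PlacementHandle allocatePlacementHandle() {
    if (next_ >= placementIds_.size()) {
      return kDefaultPlacement;
    }
    return static_cast<PlacementHandle>(next_++);
  }

  // Resolve a handle to the identifier that would accompany a write.
  // Returns nullopt for the default handle or an unallocated handle.
  std::optional<std::uint16_t> placementIdFor(PlacementHandle handle) const {
    if (handle < 0 || static_cast<std::size_t>(handle) >= next_) {
      return std::nullopt;
    }
    return placementIds_[static_cast<std::size_t>(handle)];
  }

 private:
  std::vector<std::uint16_t> placementIds_;
  std::size_t next_ = 0;
};
```

In this sketch, BC and BH would each call `allocatePlacementHandle()` once at startup and pass their own handle on every write, so the two engines never share a placement identifier unless the device runs out of them.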

The io_uring_cmd interface (through the NVMe character device) is used to send FDP directives to the Linux kernel, because sending them through the conventional block interfaces is not supported yet. The user selects the NVMe block device (namespace or partition) as usual (e.g. "nvmCachePaths": ["/dev/nvme0n1p1"]), and CacheLib picks the corresponding NVMe character device internally.
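One way the block-to-character device lookup could work is a simple path rewrite. The sketch below assumes the Linux generic NVMe character device naming `/dev/ngXnY` for namespace `/dev/nvmeXnY`; the helper name `nvmeCharDeviceFor` is hypothetical, and the PR's actual lookup may differ (for example on multipath setups).

```cpp
#include <optional>
#include <regex>
#include <string>

// Map an NVMe block device or partition path to the generic NVMe
// character device of the same namespace, e.g.
//   /dev/nvme0n1p1 -> /dev/ng0n1
// Assumption: the kernel exposes the char device as /dev/ng<ctrl>n<ns>.
// Returns nullopt if the path does not look like an NVMe block device.
std::optional<std::string> nvmeCharDeviceFor(const std::string& blockPath) {
  static const std::regex kPattern(R"(^/dev/nvme(\d+)n(\d+)(p\d+)?$)");
  std::smatch match;
  if (!std::regex_match(blockPath, match, kPattern)) {
    return std::nullopt;
  }
  // Keep controller and namespace indices; drop the partition suffix.
  return "/dev/ng" + match[1].str() + "n" + match[2].str();
}
```

Note that the partition suffix is dropped in this mapping, so a caller working on a partition would still need to account for the partition's starting offset when addressing the namespace through the character device.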

This commit adds a new config option, 'fdpMode', to enable FDP. The user needs to set fdpMode together with the io_uring I/O engine options ("navyEnableIoUring": true, "navyQDepth": 1, "fdpMode": true).
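The dependency between these options can be expressed as a small consistency check. The sketch below is illustrative only: `NavyOptionsSketch` and `checkFdpOptions` are hypothetical names that mirror the config keys quoted above, and the rule that the queue depth must be non-zero is an assumption drawn from the example values, not a documented requirement.

```cpp
#include <string>
#include <vector>

// Hypothetical mirror of the config keys mentioned in the description.
struct NavyOptionsSketch {
  std::vector<std::string> nvmCachePaths;
  bool navyEnableIoUring = false;
  unsigned navyQDepth = 0;
  bool fdpMode = false;
};

// Returns an empty string when the options are consistent, otherwise a
// short explanation of the first problem found.
std::string checkFdpOptions(const NavyOptionsSketch& opts) {
  if (!opts.fdpMode) {
    return "";  // nothing to check when FDP is off
  }
  if (!opts.navyEnableIoUring) {
    return "fdpMode requires navyEnableIoUring to be true";
  }
  if (opts.navyQDepth == 0) {
    // Assumption: the io_uring engine needs a non-zero queue depth.
    return "fdpMode requires a non-zero navyQDepth";
  }
  return "";
}
```

With the values from the description (`navyEnableIoUring = true`, `navyQDepth = 1`, `fdpMode = true`) the check returns an empty string; enabling `fdpMode` alone produces an error message.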

This second PR consists of the code that can be integrated with the 'iouring async-io' support in Navy.

arungeorge83 commented 9 months ago

Can you check whether the build is OK with the liburing package removed?

FdpNvme.cpp is not guarded for this case. I guess you want to bring it under -DCACHELIB_IOURING_DISABLE, right?
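The kind of guard being asked for might look like the sketch below. Only the macro name `CACHELIB_IOURING_DISABLE` comes from the comment above; the constant and function names are hypothetical, and the liburing include is left as a comment so the sketch compiles without the package installed.

```cpp
// When CACHELIB_IOURING_DISABLE is defined (e.g. liburing is not
// installed), io_uring-dependent code is compiled out and callers can
// query whether FDP support is available in this build.
#ifndef CACHELIB_IOURING_DISABLE
// Real code would include <liburing.h> here and define the
// io_uring_cmd-based FDP write path.
constexpr bool kIoUringCompiledIn = true;
#else
constexpr bool kIoUringCompiledIn = false;
#endif

// Hypothetical helper: lets higher layers reject "fdpMode": true at
// startup instead of failing later at the first write.
bool fdpSupportCompiledIn() { return kIoUringCompiledIn; }
```

Building with `-DCACHELIB_IOURING_DISABLE` makes `fdpSupportCompiledIn()` return false, so nothing that references liburing symbols needs to be compiled or linked.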

arungeorge83 commented 9 months ago

Can you check whether this compiles on the supported platforms here: https://github.com/facebook/CacheLib/actions?

As per the status, 'This workflow is awaiting approval from a maintainer in https://github.com/facebook/CacheLib/pull/277' Could you check it and approve? Or should I verify it locally?

jaesoo-fb commented 9 months ago

As per the status, 'This workflow is awaiting approval from a maintainer in #277' Could you check it and approve? Or should I verify it locally?

Oh, my bad. I just started the workflows. I expect some of them to fail, so please fix the failures as they appear.

facebook-github-bot commented 9 months ago

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 8 months ago

@arungeorge83 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 8 months ago

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

arungeorge83 commented 8 months ago

Overall looks good. Thanks for the work.

Could you apply the patch below?

diff --git a/fbcode/cachelib/navy/common/TARGETS b/fbcode/cachelib/navy/common/TARGETS
--- a/fbcode/cachelib/navy/common/TARGETS
+++ b/fbcode/cachelib/navy/common/TARGETS
@@ -7,6 +7,7 @@
     srcs = [
         "Buffer.cpp",
         "Device.cpp",
+        "FdpNvme.cpp",
         "Hash.cpp",
         "SizeDistribution.cpp",
         "Types.cpp",
@@ -15,6 +16,7 @@
         "Buffer.h",
         "CompilerUtils.h",
         "Device.h",
+        "FdpNvme.h",
         "Hash.h",
         "NavyThread.h",
         "SizeDistribution.h",
@@ -27,7 +29,6 @@
         "//folly:function",
         "//folly:thread_local",
         "//folly/experimental/io:async_io",
-        "//folly/experimental/io:io_uring",
         "//folly/hash:checksum",
         "//folly/hash:hash",
         "//folly/io/async:event_base_manager",
@@ -41,6 +42,8 @@
         "//folly:file",
         "//folly:portability",
         "//folly:range",
+        "//folly/experimental/io:async_base",
+        "//folly/experimental/io:io_uring",
         "//folly/fibers:core_manager",
         "//folly/fibers:fiber_manager_map",
         "//folly/fibers:timed_mutex",
@@ -50,4 +53,7 @@
         "//folly/lang:bits",
         "//folly/logging:logging",
     ],
+    exported_external_deps = [
+        ("liburing", None, "uring"),
+    ],
 )

I could not find the file TARGETS. Is it not present in the open-source version?

jaesoo-fb commented 8 months ago

I could not find the file TARGETS. Is it not present in the open-source version?

Yeah, this is for the internal build. Please ignore.

facebook-github-bot commented 8 months ago

@arungeorge83 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 8 months ago

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 7 months ago

@jaesoo-fb merged this pull request in facebook/CacheLib@009e89ba2b49b1fbbc48d03c3f81046de28bd6ed.

jaesoo-fb commented 7 months ago

Submitted with the fixup https://github.com/facebook/CacheLib/commit/b5d70a5f372f8b2950905e27c4adddf85c4bece7