openbmc / dbus-sensors

D-Bus configurable sensor scanning applications
Apache License 2.0
23 stars 44 forks source link

NVMe sensors reading issue #18

Closed operezmuena closed 2 years ago

operezmuena commented 2 years ago

I've recently begun testing NVMe sensors and I've been having temperature reading issues with my Intel SSD P4510 devices. Debugging the NVMe code, I noticed that the "i2c_smbus_read_block_data()" function is returning a size of 6 but, no sensor data is found in the "resp" vector.

size = i2c_smbus_read_block_data(fileHandle.handle(), cmd, resp.data());

Only after I changed 'resp' from being a std::vector type to a std::array type, I began seeing valid temperature readings from my SSD drives. Here is my patch for your reference:

From c72120055b73f0a328464f971c60e6ad0760e449 Mon Sep 17 00:00:00 2001
From: Oscar A Perez <oscar.perez@ztsystems.com>
Date: Wed, 30 Mar 2022 19:40:23 +0000
Subject: [PATCH] NVMe temperature readings not working

---
 src/NVMeBasicContext.cpp | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/NVMeBasicContext.cpp b/src/NVMeBasicContext.cpp
index 08333c59..ef264f8e 100644
--- a/src/NVMeBasicContext.cpp
+++ b/src/NVMeBasicContext.cpp
@@ -16,6 +16,7 @@
 #include <cstring>
 #include <system_error>
 #include <thread>
+#include <span>

 extern "C"
 {
@@ -59,8 +60,9 @@ static void decodeBasicQuery(const std::array<uint8_t, 6>& req, int& bus,
     offset = req[sizeof(busle) + 1];
 }

+template<std::size_t SIZE>
 static ssize_t execBasicQuery(int bus, uint8_t addr, uint8_t cmd,
-                              std::vector<uint8_t>& resp)
+                              std::array<uint8_t, SIZE>& resp)
 {
     int32_t size = 0;
     std::filesystem::path devpath = "/dev/i2c-" + std::to_string(bus);
@@ -77,8 +79,6 @@ static ssize_t execBasicQuery(int bus, uint8_t addr, uint8_t cmd,
         return -errno;
     }

-    resp.reserve(UINT8_MAX + 1);
-
     /* Issue the NVMe MI basic command */
     size = i2c_smbus_read_block_data(fileHandle.handle(), cmd, resp.data());
     if (size < 0)
@@ -101,7 +101,7 @@ static ssize_t execBasicQuery(int bus, uint8_t addr, uint8_t cmd,

 static ssize_t processBasicQueryStream(FileHandle& in, FileHandle& out)
 {
-    std::vector<uint8_t> resp{};
+    std::array<uint8_t, UINT8_MAX + 1> resp{};
     ssize_t rc = 0;

     while (true)
@@ -164,11 +164,11 @@ static ssize_t processBasicQueryStream(FileHandle& in, FileHandle& out)
         }

         /* Write out the response data */
-        std::vector<uint8_t>::iterator cursor = resp.begin();
-        while (cursor != resp.end())
+        std::span<uint8_t> mySpan{resp.data(), len};
+        for (auto it = mySpan.begin(); it != mySpan.end(); )
         {
-            size_t lenRemaining = std::distance(cursor, resp.end());
-            ssize_t egress = ::write(out.handle(), &(*cursor), lenRemaining);
+            size_t lenRemaining = std::distance(it, mySpan.end());
+            ssize_t egress = ::write(out.handle(), &(*it), lenRemaining);
             if (egress == -1)
             {
                 std::cerr << "Failed to write block data of length " << std::dec
@@ -180,8 +180,7 @@ static ssize_t processBasicQueryStream(FileHandle& in, FileHandle& out)
                 }
                 return -EIO;
             }
-
-            cursor += egress;
+            it += egress;
         }
     }

@@ -398,6 +397,16 @@ void NVMeBasicContext::processResponse(std::shared_ptr<NVMeSensor>& sensor,

     uint8_t* messageData = static_cast<uint8_t*>(msg);

+#if DEBUG
+    std::cerr << "\nmessageData: ";
+    for (size_t i=0; i < len; i++)
+    {
+        std::cerr << std::hex << std::setfill('0') << std::setw(2)
+                  << static_cast<int>(messageData[i]) << " ";
+    }
+    std::cerr << "\n";
+#endif
+
     uint8_t status = messageData[0];
     if ((status & NVME_MI_BASIC_SFLGS_DRIVE_NOT_READY) ||
         !(status & NVME_MI_BASIC_SFLGS_DRIVE_FUNCTIONAL))
--
2.17.1
amboar commented 2 years ago

Ah, I think there's an issue with using resp.reserve() here:

https://github.com/openbmc/dbus-sensors/blob/master/src/NVMeBasicContext.cpp#L80

The size of the response is returned but the size of the vector is never changed. It's not invalid from a memory-safety standpoint, just the vector never changes to a size that's appropriate to fetch the values out (I've run it under valgrind and poked around but haven't caused any issues). Further, just adding a resize() before the return will overwrite the values just written by the call to i2c_smbus_read_block_data() so we'll have to do something more. It's odd that I'm not seeing it fail but you are, because it seems there's definitely a bug there.

I'd prefer we stick with a vector rather than use a templated array as this allows execBasicQuery() to abstract the response sizes away for the caller. @operezmuena can you try this patch?

diff --git a/src/NVMeBasicContext.cpp b/src/NVMeBasicContext.cpp
index 08333c591081..5708ad843c00 100644
--- a/src/NVMeBasicContext.cpp
+++ b/src/NVMeBasicContext.cpp
@@ -77,7 +77,7 @@ static ssize_t execBasicQuery(int bus, uint8_t addr, uint8_t cmd,
         return -errno;
     }

-    resp.reserve(UINT8_MAX + 1);
+    resp.resize(UINT8_MAX + 1);

     /* Issue the NVMe MI basic command */
     size = i2c_smbus_read_block_data(fileHandle.handle(), cmd, resp.data());
@@ -96,6 +96,8 @@ static ssize_t execBasicQuery(int bus, uint8_t addr, uint8_t cmd,
         return -EBADMSG;
     }

+    resp.resize(size);
+
     return size;
 }
operezmuena commented 2 years ago

@amboar I just tested your patch and it does solve my issues. Thank you for the prompt response.

amboar commented 2 years ago

@operezmuena Do you mind marking it as verified here: https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/52531

amboar commented 2 years ago

Actually, this was broken by 73030639a5ba8 at https://github.com/openbmc/dbus-sensors/commit/73030639a5ba8#diff-5abbed76e66786eec2a94b186f608b090c253722a2b37635b278e0740e3a5e4aL185-L187 where we switch from accessing the backing buffer directly to using an iterator over the vector.