OpenChannelSSD / qemu-nvme

The LightNVM qemu implementation, based on NVMe
http://openchannelssd.readthedocs.org/en/latest/
Other
131 stars 67 forks source link

Latest patches trigger a null-ptr deref on nvme bringup #26

Closed javigon closed 6 years ago

javigon commented 6 years ago

The latest patches on ocssd2 break on 4.17 with 2.0 LightNVM core support.

The error is between fa0ea0736a6a small fixes d24c46406674 More cleanup and fix up 2.0 support Cannot bisect since fa0ea0736a6a fails compilation too.

Seems to have something to do with the 2.0 geometry identification in the kernel: 6b304ba63148 lightnvm: add 2.0 geometry identification

Though, without the new qemu patches, 4.17/core with 2.0 support (also pblk) works fine.

Also, there was a necessary patch fixing a bad dword formatting on get log page that has been removed (probably by force push)

lnvm: fix get log dword formatting

hw/block/nvme.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 53ca966cdd33..427211e24c98 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1900,8 +1900,8 @@ static uint16_t nvme_get_log(NvmeCtrl n, NvmeNamespace ns, NvmeCmd cmd) / NVMe R1.3 */ numdl = (dw10 >> 16) << 2; numdu = (dw11 & 0xffff) << 2;

MatiasBjorling commented 6 years ago

I may have branched off an earlier version, it was not my intention.

So looking at this code... I'm confused, the more I look the implementation, it might be broken with the upper bit parts.

Ok if I commit the following instead?

diff --git i/hw/block/nvme.c w/hw/block/nvme.c
index 53ca966cdd..dc6fe47ef7 100644
--- i/hw/block/nvme.c
+++ w/hw/block/nvme.c
@@ -1866,7 +1866,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len)
 }

 static uint16_t lnvm_report_chunk(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
-                                  uint32_t buf_len, uint32_t off)
+                                  uint32_t buf_len, uint64_t off)
 {
     LnvmCtrl *ln = &n->lnvm_ctrl;
     uint8_t *log_page;
@@ -1895,16 +1895,17 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd)
     uint32_t dw12 = le32_to_cpu(cmd->cdw12);
     uint32_t dw13 = le32_to_cpu(cmd->cdw13);
     uint16_t lid = dw10 & 0xff;
-    uint32_t numdl, numdu, lpol, lpou, len, off;
+    uint32_t numdl, numdu, len;
+    uint64_t off, lpol, lpou;

     /* NVMe R1.3 */
-    numdl = (dw10 >> 16) << 2;
-    numdu = (dw11 & 0xffff) << 2;
-    lpol = dw12 << 2;
-    lpou = dw13 << 2;
+    numdl = (dw10 >> 16);
+    numdu = (dw11 & 0xffff);
+    lpol = dw12;
+    lpou = dw13;

-    len = numdl + numdu;
-    off = lpol + lpou;
+    len = ((numdu << 16) | numdl) << 2;
+    off = (lpou << 32ULL) | lpol;

     switch (lid) {
     case NVME_LOG_ERROR_INFO:
javigon commented 6 years ago

No problem, just wanted to mention that without proper dword shifting, we will be returning bad metadata.

I think it looks good for get log, yes.

However, I'm not sure it will fix the nvme error on boot up.