intel / ipmctl

[NVM API] nvm_get_regions() doesn't return the correct region 'freecapacity' when namespaces exist in the region #188

Open sscargal opened 2 years ago

sscargal commented 2 years ago

This issue was originally reported on the #pmem Slack channel by Tom Nabarro:

Currently seeing a difference in region free_capacity when using the C API nvm_get_regions() (major version 2) compared to the CLI tool (same major version). Any ideas?

@wolf-157:~/projects/daos> sudo ipmctl show -region
SocketID | ISetID             | PersistentMemoryType | Capacity     | FreeCapacity | HealthState
==================================================================================================
0x0000   | 0x1d427f4835f32ccc | AppDirect            | 3012.000 GiB | 0.000 GiB    | Healthy
0x0001   | 0xef5a7f48cef32ccc | AppDirect            | 3012.000 GiB | 0.000 GiB    | Healthy
But
DEBUG 14:32:44.167888 ipmctl.go:323: discovered pmem regions: [{IsetId:2108387523682315468 Type:1 Capacity:3234110373888 Free_capacity:3234110373888 Socket_id:0 Dimm_count:6 Dimms:[1 17 33 257 273 289 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] Health:1 Reserved:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]} {IsetId:17247237673655151820 Type:1 Capacity:3234110373888 Free_capacity:3234110373888 Socket_id:1 Dimm_count:6 Dimms:[4097 4113 4129 4353 4369 4385 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] Health:1 Reserved:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]}]

I responded with:

We can see from Region.c#L1165 that NamespaceCapacityUsed is subtracted from pRegion->Size to get the pFreeCapacity value:

*pFreeCapacity = pRegion->Size - NamespaceCapacityUsed;

It would be best to open an issue on https://github.com/intel/ipmctl/issues to get feedback from the developers, but it looks like the call to nvm_get_regions() isn't subtracting the namespace capacity.

After investigating further, the following C++ code reproduces Tom's issue:

#include <cstdlib>   // malloc, free, EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
#include <nvm_management.h>

using std::cout;
using std::endl;

/*
 * Build using:
 *   $ g++ -o ipmctlAPItest ipmctlAPItest.cpp -lipmctl
 */

int main()
{
  NVM_UINT8 num_of_regions;
  int retval;

  // Get the number of regions
  if ((retval = nvm_get_number_of_regions(&num_of_regions)) != NVM_SUCCESS)
  {
    cout << "Error: nvm_get_number_of_regions returned " << retval << endl;
    return EXIT_FAILURE;  // num_of_regions is not valid, so stop here
  }
  cout << "Number of regions: " << unsigned(num_of_regions) << endl;

  // Get the region information
  region *p_region = static_cast<region *>(malloc(sizeof(region) * num_of_regions));
  if (p_region == NULL) {
    cout << "Error: Cannot allocate memory." << endl;
    return EXIT_FAILURE;
  }
  if ((retval = nvm_get_regions(p_region, &num_of_regions)) != NVM_SUCCESS) {
    cout << "Error: nvm_get_regions returned " << retval << endl;
    free(p_region);  // do not print uninitialized region data
    return EXIT_FAILURE;
  }

  // Iterate over the region structs
  region *p_region_iter = p_region;
  for (int i = 0; i < num_of_regions; i++) {
    cout << "p_region[" << i << "]->isetId: " << p_region_iter->isetId << endl
         << "p_region[" << i << "]->capacity: " << p_region_iter->capacity << endl
         << "p_region[" << i << "]->free_capacity: " << p_region_iter->free_capacity << endl;
    p_region_iter++;
  }

  // Free the memory
  free(p_region);

  // Exit
  return EXIT_SUCCESS;
}

Returns:

# ./ipmctlAPItest
Number of regions: 2
p_region[0]->isetId: 3259620181632232652
p_region[0]->capacity: 1623497637888
p_region[0]->free_capacity: 1623497637888
p_region[1]->isetId: 15940630830656007372
p_region[1]->capacity: 1623497637888
p_region[1]->free_capacity: 1623497637888

One would expect free_capacity to have been calculated. It should be zero (0) on my system, as ipmctl show -region demonstrates:

# ipmctl show -region
 SocketID | ISetID             | PersistentMemoryType | Capacity     | FreeCapacity | HealthState
==================================================================================================
 0x0000   | 0x2d3c7f48f4e22ccc | AppDirect            | 1512.000 GiB | 0.000 GiB    | Healthy
 0x0001   | 0xdd387f488ce42ccc | AppDirect            | 1512.000 GiB | 0.000 GiB    | Healthy

nvm_get_regions() calls nvm_get_regions_ex(), which populates the region struct with the results from gNvmDimmDriverNvmDimmConfig.GetRegions().

From nvm_management.c:

NVM_API int nvm_get_regions_ex(const NVM_BOOL use_nfit, struct region *p_regions, NVM_UINT8 *count)
{
  COMMAND_STATUS *pCommandStatus = NULL;
  NVM_UINT8 RegionCount, Index, DimmIndex;
  REGION_INFO *pRegions = NULL;

  ...

  erc = gNvmDimmDriverNvmDimmConfig.GetRegions(&gNvmDimmDriverNvmDimmConfig, RegionCount, use_nfit, pRegions, pCommandStatus);

  ...

  for (Index = 0; Index < RegionCount; Index++) {
    memset(&p_regions[Index], 0, sizeof(struct region));
    p_regions[Index].socket_id = pRegions[Index].SocketId;
    p_regions[Index].isetId = pRegions[Index].CookieId;
    p_regions[Index].capacity = pRegions[Index].Capacity;
    p_regions[Index].free_capacity = pRegions[Index].FreeCapacity;
    p_regions[Index].health = pRegions[Index].Health;
    p_regions[Index].type = pRegions[Index].RegionType;
    p_regions[Index].dimm_count = pRegions[Index].DimmIdCount;

    for (DimmIndex = 0; DimmIndex < pRegions[Index].DimmIdCount; DimmIndex++)
      p_regions[Index].dimms[DimmIndex] = pRegions[Index].DimmId[DimmIndex];
  }

  ...
}  

Where

/* Region Information provides details about a PMEM region (interleave set).*/
typedef struct _REGION_INFO {
  UINT16 RegionId;                  ///< Region identifier
  UINT16 SocketId;                  ///< Socket identifier
  UINT8 RegionType;                 ///< Region type
  UINT64 Capacity;                  ///< Region total raw capacity
  UINT64 FreeCapacity;              ///< Region total free capacity. Raw less capacity used by namespaces
  UINT64 AppDirNamespaceMaxSize;    ///< Maximum size of an AppDirect namespace
  UINT64 AppDirNamespaceMinSize;    ///< Minimum size of an AppDirect namespace
  UINT16 Health;                    ///< Health state of region
  UINT16 DimmId[12];                ///< PMem module IDs associated with this region
  UINT16 DimmIdCount;               ///< Number of PMem modules found in DimmId
  UINT64 CookieId;                  ///< Interleave set ID
  HII_POINTER PtrInterlaveFormats;  ///< Pointer to array of Interleave Formats
  UINT32 InterleaveFormatsNum;      ///< Number of Interleave Formats
} REGION_INFO;

Note that the comment for FreeCapacity says "Region total free capacity. Raw less capacity used by namespaces", so the API isn't working as documented.

StevenPontsler commented 2 years ago

We will take a look at it.

tanabarr commented 2 years ago

I have reverted to retrieving free capacity by scraping ipmctl CLI output (mainly because nvm_get_regions_ex(use_nfit=false, ...) takes a minute to return), but it would be preferable to use the libipmctl API. I found that setting use_nfit=false and calling nvm_uninit() between calls results in the FreeCapacity stat being updated as expected. Unfortunately, the latency of that call is not acceptable, hence the switch.