microsoft / Network-Adapter-Class-Extension

Network Adapter Class Extension to WDF (NetAdapter Cx) makes it easy to write high quality and high speed drivers for Network Interface Controllers
MIT License

Wrong NUMA node initialization causes failure on devices attached to non-0 NUMA node #19

Closed: dimaruinskiy-intel closed this issue 1 year ago

dimaruinskiy-intel commented 1 year ago

The problem is with the code here: https://github.com/microsoft/Network-Adapter-Class-Extension/blob/de9490339b44f888aa2b302dc971f76789876950/netcx/adapter/nxadapter.cpp#L3081-L3083

  1. preferredNumaNode is a ULONG initialized to MM_ANY_NODE_OK (0x80000000).
  2. The address of preferredNumaNode is passed to IoGetDeviceNumaNode as a pointer to USHORT (PUSHORT).
  3. IoGetDeviceNumaNode overwrites only the low 16 bits with the actual NUMA node (for example, 1); the high 16 bits keep their initial value.
  4. preferredNumaNode now holds the invalid value 0x80000001, which is stored in the Tx/Rx memory constraints structures.
  5. Later, this causes a failure in: https://github.com/microsoft/Network-Adapter-Class-Extension/blob/de9490339b44f888aa2b302dc971f76789876950/netcx/bm/dmaallocator.cpp#L62
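To make the steps above concrete, here is a minimal user-mode sketch of the same pattern. It is not the NetAdapterCx code: `FakeGetDeviceNumaNode`, `BuggyPreferredNode`, `FixedPreferredNode`, and `kAnyNodeOk` are hypothetical names made up for illustration, the fake routine only mimics the 16-bit write of IoGetDeviceNumaNode's PUSHORT out-parameter, and a little-endian host (as on Windows targets) is assumed. The "fixed" variant is one possible way to avoid the problem, not necessarily the fix that shipped.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for MM_ANY_NODE_OK (0x80000000), the value quoted in the report.
constexpr std::uint32_t kAnyNodeOk = 0x80000000u;

// Hypothetical stand-in for IoGetDeviceNumaNode: it writes exactly 16 bits
// through the pointer it receives and reports success. The real routine
// returns NTSTATUS; a bool is used here to keep the sketch portable.
bool FakeGetDeviceNumaNode(void* nodeNumber, std::uint16_t deviceNode) {
    std::memcpy(nodeNumber, &deviceNode, sizeof(deviceNode));
    return true;
}

// Buggy pattern: a 32-bit variable is handed to a routine that fills in
// only 16 bits, so the upper half keeps the bits of the initial value.
std::uint32_t BuggyPreferredNode(std::uint16_t deviceNode) {
    std::uint32_t preferredNumaNode = kAnyNodeOk;
    FakeGetDeviceNumaNode(&preferredNumaNode, deviceNode);
    return preferredNumaNode;
}

// One possible fix (an assumption, not the shipped change): read the node
// into a correctly sized local and widen it only when the call succeeds.
std::uint32_t FixedPreferredNode(std::uint16_t deviceNode) {
    std::uint32_t preferredNumaNode = kAnyNodeOk;
    std::uint16_t node = 0;
    if (FakeGetDeviceNumaNode(&node, deviceNode)) {
        preferredNumaNode = node;
    }
    return preferredNumaNode;
}
```

Under these assumptions, `BuggyPreferredNode(1)` yields 0x80000001, while `BuggyPreferredNode(0)` still yields 0x80000000, which is why the problem stays hidden on node 0. `FixedPreferredNode(1)` yields 1.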

Confirmed by looking at NetAdapterCx traces:

    _BufferManager::AllocateBufferVector - ERROR: Returning STATUS_INSUFFICIENT_RESOURCES. (bufferVector == nullptr is true).
    NetClientCreateBufferPool - [status=0xc000009a(STATUS_INSUFFICIENT_RESOURCES)]
    NxTxXlat::Create - [status=0xc000009a(STATUS_INSUFFICIENT_RESOURCES)]
    NxTxXlat::Create - [status=0xc000009a(STATUS_INSUFFICIENT_RESOURCES)]
    QueueControl::CreateQueues - [status=0xc000009a(STATUS_INSUFFICIENT_RESOURCES)]
    NxTranslationApp::CreateDatapath - [status=0xc000009a(STATUS_INSUFFICIENT_RESOURCES)]
    NetClientAdapterSetDeviceFailed - Translator initiated WdfDeviceSetFailed: 0xc000009a(STATUS_INSUFFICIENT_RESOURCES)

Note: this bug is only visible if the I/O device's NUMA node is non-zero. If it is zero, writing 0 into the low 16 bits leaves the value as MM_ANY_NODE_OK, which does not cause problems with the internal memory allocator routines.

amrutha-chandramohan commented 1 year ago

Thanks for reporting this issue - the fix will be in Windows Insider builds 26003 and later.