opiproject / opi-prov-life

Provisioning, Lifecycle and Platform Management Group
Apache License 2.0

mandatory smbios tables and fields #178

Open glimchb opened 1 year ago

glimchb commented 1 year ago

https://www.dmtf.org/standards/smbios

Marvell supports the following tables in smbios (copy from slack):

BIOS Information (Type 0)
System Information (Type 1)
Baseboard Information (Type 2)
System Enclosure (Type 3)
Processor Information (Type 4)
Cache Information (Type 7)
Port Connector Information (Type 8)
System Slots (Type 9)
OEM Strings (Type 11)
System Configuration Options (Type 12)
Physical Memory Array (Type 16)
Memory Device (Type 17)
Memory Array Mapped Address (Type 19)
Memory Device Mapped Address (Type 20)
System Reset (Type 23)
System Boot Information (Type 32)

Let's gather feedback from Nvidia and Intel (we know AMD doesn't support it yet) and start building mandatory fields

We can use https://github.com/opiproject/smbios-validation-tool to see whether we can create a compliance tool.
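A compliance check along these lines could start by listing which structure types a device exposes and diffing that against a required set. The following is a minimal sketch, assuming `dmidecode` text output as input (structure headers of the form `Handle 0x0001, DMI type 1, 27 bytes`). `REQUIRED_TYPES` is only a placeholder that reuses the Marvell list above, since the mandatory set has not been agreed yet, and `present_types` / `missing_types` are hypothetical helper names, not part of any existing tool.

```python
import re

# Placeholder only: the Marvell list from this issue, NOT an agreed OPI requirement.
REQUIRED_TYPES = {0, 1, 2, 3, 4, 7, 8, 9, 11, 12, 16, 17, 19, 20, 23, 32}

# dmidecode prints one header per structure, e.g.
# "Handle 0x0001, DMI type 1, 27 bytes".
HEADER_RE = re.compile(
    r"^Handle 0x[0-9A-Fa-f]+, DMI type (\d+), \d+ bytes", re.MULTILINE
)


def present_types(dmidecode_text):
    """Return the set of SMBIOS structure types found in dmidecode output."""
    return {int(m.group(1)) for m in HEADER_RE.finditer(dmidecode_text)}


def missing_types(dmidecode_text, required=REQUIRED_TYPES):
    """Return the required types that are absent, sorted ascending."""
    return sorted(set(required) - present_types(dmidecode_text))


sample = """\
Handle 0x0000, DMI type 0, 26 bytes
BIOS Information

Handle 0x0001, DMI type 1, 27 bytes
System Information
"""

print(missing_types(sample, required={0, 1, 2}))  # → [2]
```

In practice the text would come from running `sudo dmidecode` on the DPU; the sketch keeps parsing separate from execution so it can be tested on captured output.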

ballle98 commented 1 year ago

@glimchb I'm looking at modifying the validation tool. Do we want to rename it, e.g. to opi-smbios-validation-tool? Do we also want to enable remote execution so we don't have to install Python on the DPU?

glimchb commented 1 year ago

@ballle98 I don't think we should rename it; it is already under the opiproject org, so we know where it belongs. Python is expected to run on the DPU; it is pretty basic.

glimchb commented 1 year ago

example output from validation tool:

$ sudo python3 ./smbios_validation
*********************************************************************
SMBIOS LESS Compliance Validation Tool

Copyright 2019 Google LLC
Licensed under the Apache License, Version 2.0 (the "License")
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*********************************************************************

Handle ID: 0x0000
ERROR: Invalid Vendor field in Type 0 (BIOS Information) record.
ACTION: BIOS Vendor string should contain "Google".
Without that our software will ignore the OEM structures.
ERROR: Invalid Release Date field in Type 0 (BIOS Information) record.
ACTION: Please populate BIOS Release Date field with correct date (format is MM/DD/YYYY).

Handle ID: 0x0002
ERROR: Invalid Chassis Handle in Type 2 (Board Information) record.
ACTION: Please populate Chassis Handle field with valid handle.

Handle ID: 0x0003
ERROR: Invalid Type field in Type 3 (Chassis) record.
ACTION: Please populate Type field with valid string.
Valid Type(s): Main Server Chassis, Rack Mount Chassis
ERROR: Invalid OEM Information field in Type 3 (Chassis) record.
ACTION: Please populate OEM Information field with valid hex value.
OEM Byte 0 must be 0x67, which is the identification for Google OEM Info.

Handle ID: 0x0004
ERROR: Invalid Socket Designation field in Type 4 (Processor Information) record.
ACTION: Please populate Socket Designation field with valid string.
Processor silkscreen tag usually looks like CPU0, CPU1, etc.
ERROR: Invalid Thread Count field in Type 4 (Processor Information) record.
ACTION: Please populate Thread Count field with valid number.

Handle ID: 0x000E
ERROR: Invalid Array Handle field in Type 17 (Memory Device) record.
ACTION: Please populate Array Handle field with valid handle.
This field should be the handle associated with the Physical Memory Array to which this device belongs.
ERROR: Invalid Form Factor field in Type 17 (Memory Device) record.
ACTION: Please populate Form Factor field with valid string.
Valid Form Factor(s): Unknown, DIMM
ERROR: Invalid Bank Locator field in Type 17 (Memory Device) record.
ACTION: Please populate Bank Locator field with valid string.
This is the string that identifies the physically labeled bank where the memory device is located.
ERROR: Invalid Memory Type field in Type 17 (Memory Device) record.
ACTION: Please populate Memory Type field with valid string.
Valid Type(s): Unknown, DDR4, LPDDR4, DDR5, Logical non-volatile device

**The SMBIOS implementation in this host lacks LESS compliance.**

ballle98 commented 1 year ago

@glimchb got it. I started working on https://github.com/opiproject/smbios-validation-tool/blob/master/smbios_validation_tool/rules.py, removing the Google-specific parts. Google is very focused on DIMM inventory, which is not so important to OPI.

ballle98 commented 1 year ago

@glimchb can you add issues section to https://github.com/opiproject/smbios-validation-tool ?

seroyer commented 1 year ago

> @glimchb can you add issues section to https://github.com/opiproject/smbios-validation-tool ?

Done

ballle98 commented 1 year ago

@glimchb created pull request for smbios-validation-tool

glimchb commented 1 year ago

thanks, looking at https://github.com/opiproject/smbios-validation-tool/pull/2

samerhaj commented 1 year ago

Feedback from Arm: Arm worked with our partners and ecosystem in defining standard requirements for firmware interfaces to support OSes and Hypervisors for servers (and other segments). The SMBIOS requirements for Arm-based servers are defined in the BBR 2.0 specification (https://developer.arm.com/documentation/den0044/latest), chapter 9. Summary is below:

BIOS Information (Type 0)  // required
System Information (Type 1)  // required
Baseboard Information (Type 2)  // recommended
System Enclosure (Type 3)  // required
Processor Information (Type 4)  // required
Cache Information (Type 7)  // required
Port Connector Information (Type 8)  // recommended for platforms with physical ports
System Slots (Type 9)  // conditionally required for platforms with expansion slots
OEM Strings (Type 11)  // recommended
System Configuration Options (Type 12)  // not required
Physical Memory Array (Type 16)  // required
Memory Device (Type 17)  // required
Memory Array Mapped Address (Type 19)  // required
Memory Device Mapped Address (Type 20)  // not required
System Reset (Type 23)  // not required
System Boot Information (Type 32)  // required
System Power Supplies (Type 39)  // recommended for servers
Onboard Devices Extended Information (Type 41)  // recommended
Redfish Host Interface (Type 42)  // required for platforms supporting Redfish Host Interface
TPM Device (Type 43)  // required for platforms with a TPM
Firmware Inventory Information (Type 45)  // recommended
String Property (Type 46)  // recommended
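
The summary above distinguishes required, conditionally required, recommended, and not-required structures, which a checker would need to treat differently. The following is a rough sketch of one way to encode that, transcribed from the list above rather than from the BBR specification itself. The `Level` enum, the `BBR_SUMMARY` table, the feature keys (`"slots"`, `"redfish"`, `"tpm"`), and `evaluate` are all hypothetical names for illustration. Conditions attached to recommendations (Types 8 and 39) are ignored here and treated as plain recommendations.

```python
from enum import Enum


class Level(Enum):
    REQUIRED = "required"
    CONDITIONAL = "conditionally required"
    RECOMMENDED = "recommended"
    NOT_REQUIRED = "not required"


# SMBIOS type -> (level, feature key that triggers a conditional requirement).
# Feature keys are made up for this sketch.
BBR_SUMMARY = {
    0: (Level.REQUIRED, None),
    1: (Level.REQUIRED, None),
    2: (Level.RECOMMENDED, None),
    3: (Level.REQUIRED, None),
    4: (Level.REQUIRED, None),
    7: (Level.REQUIRED, None),
    8: (Level.RECOMMENDED, None),
    9: (Level.CONDITIONAL, "slots"),
    11: (Level.RECOMMENDED, None),
    12: (Level.NOT_REQUIRED, None),
    16: (Level.REQUIRED, None),
    17: (Level.REQUIRED, None),
    19: (Level.REQUIRED, None),
    20: (Level.NOT_REQUIRED, None),
    23: (Level.NOT_REQUIRED, None),
    32: (Level.REQUIRED, None),
    39: (Level.RECOMMENDED, None),
    41: (Level.RECOMMENDED, None),
    42: (Level.CONDITIONAL, "redfish"),
    43: (Level.CONDITIONAL, "tpm"),
    45: (Level.RECOMMENDED, None),
    46: (Level.RECOMMENDED, None),
}


def evaluate(present, features=frozenset()):
    """Return (missing_required, missing_recommended) as sorted lists of types."""
    missing_required, missing_recommended = [], []
    for smbios_type, (level, condition) in sorted(BBR_SUMMARY.items()):
        if smbios_type in present:
            continue
        if level is Level.REQUIRED:
            missing_required.append(smbios_type)
        elif level is Level.CONDITIONAL and condition in features:
            missing_required.append(smbios_type)
        elif level is Level.RECOMMENDED:
            missing_recommended.append(smbios_type)
    return missing_required, missing_recommended


# Hypothetical platform: has a TPM, no expansion slots, no Redfish Host Interface.
present = {0, 1, 2, 3, 4, 7, 8, 11, 12, 16, 17, 19, 32}
print(evaluate(present, features={"tpm"}))  # → ([43], [39, 41, 45, 46])
```

Keeping the platform features as an explicit input means the same table can serve devices with and without slots, a TPM, or Redfish support.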

samerhaj commented 1 year ago

As for the validation tool, have we considered FirmwareTestSuite (FWTS) https://wiki.ubuntu.com/FirmwareTestSuite ? It has a complete and up-to-date SMBIOS validator that is already being used by the industry (including integrated in Arm SystemReady validation suites): https://git.launchpad.net/fwts/tree/src/dmi/dmicheck

glimchb commented 1 year ago

@samerhaj great info, thanks, reading...

samerhaj commented 1 year ago

@glimchb Please let me know if you have any questions.

ballle98 commented 1 year ago

@samerhaj not sure if our group is more adept at modifying C or Python. I see 4 checks in dmicheck. Not sure how this compares to smbios-validation-tool.

samerhaj commented 1 year ago

@ballle98 why not adopt both?

Some advantages of dmicheck:

glimchb commented 1 year ago

@samerhaj can you try running dmicheck on one of the cards you have to see if it works as expected ? I'm fine with either approach as long as we have a validation tool to verify compliance...

samerhaj commented 1 year ago

Works like a charm. Tested on NVIDIA and Marvell xPUs.

FWTS itself is available on Ubuntu, for example: sudo apt-get install fwts

Here is one sample output:

ubuntu@localhost:~$ sudo fwts -r stdout -q dmicheck

Results generated by fwts: Version V20.03.00 (2020-03-23 18:42:42).

Some of this work - Copyright (c) 1999 - 2020, Intel Corp. All rights reserved.
Some of this work - Copyright (c) 2010 - 2020, Canonical.
Some of this work - Copyright (c) 2016 - 2020, IBM.
Some of this work - Copyright (c) 2017 - 2020, ARM Ltd.

This test run on 28/03/23 at 22:10:28 on host Linux localhost
5.4.0-1049-bluefield #55-Ubuntu SMP PREEMPT Mon Oct 17 20:09:22 UTC 2022
aarch64.

Command: "fwts -r stdout -q dmicheck".
Running tests: dmicheck.

dmicheck: DMI/SMBIOS table tests.
--------------------------------------------------------------------------------
Test 1 of 4: Find and test SMBIOS Table Entry Points.
This test tries to find and sanity check the SMBIOS data structures.
Cannot mmap SMBIOS entry at 0x0xf58f0000
SMBIOS30 entry loaded from /sys/firmware/dmi/tables/smbios_entry_point
PASSED: Test 1, Found SMBIOS30 Table Entry Point at 0xf58d0000
SMBIOS30 Entry Point Structure:
  Anchor String          : _SM3_
  Checksum               : 0x7a
  Entry Point Length     : 0x18
  Major Version          : 0x03
  Minor Version          : 0x01
  Docrev                 : 0x01
  Entry Point Revision   : 0x01
  Reserved               : 0x00
  Table maximum size     : 0x00000452
  Table address          : 0x00000000f58c0000

PASSED: Test 1, SMBIOS30 Table Entry Point Checksum is valid.
PASSED: Test 1, SMBIOS30 Table Entry Point Length is valid.
SMBIOS30 table loaded from /sys/firmware/dmi/tables/DMI
PASSED: Test 1, SMBIOS 3.0 Table Entry Structure Table Address and Length looks
valid.

Test 2 of 4: Test DMI/SMBIOS tables for errors.
SKIPPED: Test 2, Cannot find SMBIOS or DMI table entry, skip the test.

Test 3 of 4: Test DMI/SMBIOS3 tables for errors.
SMBIOS30 entry loaded from /sys/firmware/dmi/tables/smbios_entry_point
SMBIOS30 table loaded from /sys/firmware/dmi/tables/DMI
PASSED: Test 3, Entry @ 0xf58c0000 'BIOS Information (Type 0)'
PASSED: Test 3, Entry @ 0xf58c005b 'System Information (Type 1)'
PASSED: Test 3, Entry @ 0xf58c00cd 'Base Board Information (Type 2)'
PASSED: Test 3, Entry @ 0xf58c014b 'Chassis Information (Type 3)'
PASSED: Test 3, Entry @ 0xf58c01cf 'Processor Information (Type 4)'
PASSED: Test 3, Entry @ 0xf58c028e 'Cache Information (Type 7)'
PASSED: Test 3, Entry @ 0xf58c02b4 'Cache Information (Type 7)'
PASSED: Test 3, Entry @ 0xf58c02da 'Cache Information (Type 7)'
PASSED: Test 3, Entry @ 0xf58c0300 'Cache Information (Type 7)'
PASSED: Test 3, Entry @ 0xf58c0326 'Port Connector Information (Type 8)'
PASSED: Test 3, Entry @ 0xf58c033d 'Port Connector Information (Type 8)'
PASSED: Test 3, Entry @ 0xf58c034c 'OEM Strings (Type 11)'
PASSED: Test 3, Entry @ 0xf58c0361 'System Configuration Options (Type 12)'
PASSED: Test 3, Entry @ 0xf58c0384 'Physical Memory Array (Type 16)'
PASSED: Test 3, Entry @ 0xf58c039d 'Memory Device (Type 17)'
PASSED: Test 3, Entry @ 0xf58c03e2 'System Boot Information (Type 32)'
PASSED: Test 3, Entry @ 0xf58c03f1 'System Boot Information (Type 32)'
PASSED: Test 3, Entry @ 0xf58c03fe 'Unknown (Type 233)'
PASSED: Test 3, Entry @ 0xf58c040a 'Memory Array Mapped Address (Type 19)'
PASSED: Test 3, Entry @ 0xf58c042b 'Memory Array Mapped Address (Type 19)'

Test 4 of 4: Test ARM SBBR SMBIOS structure requirements.

================================================================================
24 passed, 0 failed, 0 warning, 0 aborted, 1 skipped, 0 info only.
================================================================================

24 passed, 0 failed, 0 warning, 0 aborted, 1 skipped, 0 info only.

Test Failure Summary
================================================================================

Critical failures: NONE

High failures: NONE

Medium failures: NONE

Low failures: NONE

Other failures: NONE

Test           |Pass |Fail |Abort|Warn |Skip |Info |
---------------+-----+-----+-----+-----+-----+-----+
dmicheck       |   24|     |     |     |    1|     |
---------------+-----+-----+-----+-----+-----+-----+
Total:         |   24|    0|    0|    0|    1|    0|
---------------+-----+-----+-----+-----+-----+-----+
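
For use in automation, the totals line in the output above can be turned into a pass/fail decision. The following is a small sketch that assumes the summary line keeps the exact wording shown in this run (`N passed, N failed, N warning, N aborted, N skipped, N info only.`); `parse_summary` and `is_clean` are hypothetical helper names, and other fwts versions may format the line differently.

```python
import re

# Matches the totals line as it appears in the sample run above.
SUMMARY_RE = re.compile(
    r"(\d+) passed, (\d+) failed, (\d+) warning, "
    r"(\d+) aborted, (\d+) skipped, (\d+) info only\."
)
KEYS = ("passed", "failed", "warning", "aborted", "skipped", "info")


def parse_summary(fwts_output):
    """Extract the first totals line from fwts text output as a dict of ints."""
    match = SUMMARY_RE.search(fwts_output)
    if match is None:
        raise ValueError("no fwts summary line found")
    return dict(zip(KEYS, map(int, match.groups())))


def is_clean(summary):
    """Treat a run as clean when nothing failed and nothing aborted."""
    return summary["failed"] == 0 and summary["aborted"] == 0


line = "24 passed, 0 failed, 0 warning, 0 aborted, 1 skipped, 0 info only."
summary = parse_summary(line)
print(summary["passed"], summary["skipped"], is_clean(summary))  # → 24 1 True
```

Whether skipped tests or warnings should also count against compliance is a policy choice left open here.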

glimchb commented 1 year ago

@samerhaj so this dmicheck is not enough... see

$ sudo dmidecode -t 1
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: https://www.mellanox.com
        Product Name: BlueField SoC
        Version: 1.0.0
        Serial Number: Unspecified System Serial Number
        UUID: 2e3bc1d1-e205-4830-a817-968ed1978bac
        Wake-up Type: Power Switch
        SKU Number: Unspecified System SKU
        Family: BlueField

For example, we want to raise an alert on placeholder values such as "Unspecified System Serial Number".
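
The kind of alert described here can be sketched as a scan of `dmidecode` field values against a list of known placeholder strings. This is only an illustration: the pattern list is a made-up starting point, and `parse_fields` / `find_placeholders` are hypothetical helper names that do not belong to any existing tool.

```python
import re

# Hypothetical starter list of placeholder values; extend as needed.
PLACEHOLDER_PATTERNS = [
    re.compile(r"^unspecified\b", re.IGNORECASE),
    re.compile(r"^unknown$", re.IGNORECASE),
    re.compile(r"^not specified$", re.IGNORECASE),
    re.compile(r"^0+$"),
]


def parse_fields(dmidecode_text):
    """Collect indented 'Name: value' lines from dmidecode output into a dict."""
    fields = {}
    for line in dmidecode_text.splitlines():
        # Field lines are indented; header and banner lines are not.
        if line[:1] in (" ", "\t") and ": " in line:
            name, value = line.strip().split(": ", 1)
            fields[name] = value.strip()
    return fields


def find_placeholders(fields):
    """Return (name, value) pairs whose value looks like an unset default."""
    return [
        (name, value)
        for name, value in fields.items()
        if any(p.search(value) for p in PLACEHOLDER_PATTERNS)
    ]


sample = """\
Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: https://www.mellanox.com
        Product Name: BlueField SoC
        Version: 1.0.0
        Serial Number: Unspecified System Serial Number
        SKU Number: Unspecified System SKU
        Family: BlueField
"""

for name, value in find_placeholders(parse_fields(sample)):
    print(f"ALERT: {name} = {value!r}")
```

On the sample above this flags the Serial Number and SKU Number fields and leaves the others alone.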

samerhaj commented 1 year ago

@glimchb This is already done in fwts. The code tests for an extensive list of "bad string" patterns that were found in implementations across the industry: https://git.launchpad.net/fwts/tree/src/dmi/dmicheck/dmicheck.c#n109

You need to make sure to run sudo fwts -r stdout -q dmicheck, which is maintained and updated, and not the old dmicheck.

If you see anything missing from that list, we can easily suggest it or submit a patch upstream, and the maintainers are quick to review and act on it.

On this particular system, the implementation is correct. That's why you do not see failures:

ubuntu@localhost:~$ sudo dmidecode -t 1
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: https://www.mellanox.com
        Product Name: BlueField SoC
        Version: 1.0.0
        Serial Number: MT2230X21044.
        UUID: d608faef-c310-ed11-8000-e8ebd3e9dbee
        Wake-up Type: Power Switch
        SKU Number: MBF2H332A-AECOT.
        Family: BlueField

samerhaj commented 1 year ago

@glimchb Here are examples of failures (from other systems), showing both the verbose output and the summary at the end:

Test 3 of 4: Test DMI/SMBIOS3 tables for errors.
SMBIOS3 entry loaded from /sys/firmware/dmi/tables/smbios_entry_point 
SMBIOS30 table loaded from /sys/firmware/dmi/tables/DMI
PASSED: Test 3, Entry @ 0xae7e0000 'BIOS Information (Type 0)'
FAILED [MEDIUM] DMISerialNumber: Test 3, String index 0x04 in table entry
'System Information (Type 1)' @ 0xae7e0048, field 'Serial Number', offset 0x07
has a default value '000000000' and probably has not been updated by the BIOS
vendor.

ADVICE: The DMI table contains data which is clearly been left in a default
setting and not been configured for this machine. Somebody has probably
forgotten to define this field and it basically means this field is effectively
useless. Note that the kernel uses this field so it probably should be corrected
to ensure the kernel is using sane values.

ADVICE: It may be worth checking against section 7.2 of the System Management
BIOS (SMBIOS) Reference Specification (see http://www.dmtf.org/standards
/smbios).

PASSED: Test 3, Entry @ 0xae7e0093 'Base Board Information (Type 2)'
FAILED [LOW] DMISerialNumber: Test 3, String index 0x05 in table entry 'Chassis
Information (Type 3)' @ 0xae7e00e0, field 'SKU Number', offset 0x15 has a
default value 'Unknown' and probably has not been updated by the BIOS vendor.

ADVICE: The DMI table contains data which is clearly been left in a default
setting and not been configured for this machine. Somebody has probably
forgotten to define this field and it basically means this field is effectively
useless, however the kernel does not use this data so the issue is fairly low.

ADVICE: It may be worth checking against section 7.4 of the System Management
BIOS (SMBIOS) Reference Specification (see http://www.dmtf.org/standards
/smbios).

Test Failure Summary
================================================================================

Critical failures: NONE

...
Medium failures: 1
 dmicheck: String index 0x04 in table entry 'System Information (Type 1)' @ 0xae7e0048, field 'Serial Number', offset 0x07 has a default value '000000000' and probably has not been updated by the BIOS vendor.

Low failures: 1
 dmicheck: String index 0x05 in table entry 'Chassis Information (Type 3)' @ 0xae7e00e0, field 'SKU Number', offset 0x15 has a default value 'Unknown' and probably has not been updated by the BIOS vendor.

Other failures: NONE
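
To act on failures like the ones shown above, the verbose output can be reduced to a count per severity. The following is a sketch that assumes each failure starts with a line of the form `FAILED [SEVERITY] Label: Test N,` as in this sample; `failures` and `count_by_severity` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches the first line of a failure record in fwts verbose output.
FAILED_RE = re.compile(
    r"^FAILED \[(?P<severity>[A-Z]+)\] (?P<label>\w+): Test (?P<test>\d+),",
    re.MULTILINE,
)


def failures(fwts_output):
    """Return (severity, label, test number) for every FAILED record."""
    return [
        (m["severity"], m["label"], int(m["test"]))
        for m in FAILED_RE.finditer(fwts_output)
    ]


def count_by_severity(fwts_output):
    """Count FAILED records per severity level."""
    return Counter(severity for severity, _, _ in failures(fwts_output))


sample = """\
PASSED: Test 3, Entry @ 0xae7e0000 'BIOS Information (Type 0)'
FAILED [MEDIUM] DMISerialNumber: Test 3, String index 0x04 in table entry
'System Information (Type 1)' @ 0xae7e0048, field 'Serial Number', offset 0x07
PASSED: Test 3, Entry @ 0xae7e0093 'Base Board Information (Type 2)'
FAILED [LOW] DMISerialNumber: Test 3, String index 0x05 in table entry 'Chassis
Information (Type 3)' @ 0xae7e00e0, field 'SKU Number', offset 0x15 has a
"""

counts = count_by_severity(sample)
print(counts["MEDIUM"], counts["LOW"], counts["HIGH"])  # → 1 1 0
```

A compliance gate could then, for instance, reject any MEDIUM or higher failure while only reporting LOW ones; that threshold is a policy decision not settled in this thread.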

glimchb commented 1 year ago

@samerhaj then we need to add "Unspecified System Serial Number" to the upstream list of illegal serial numbers.

samerhaj commented 1 year ago

@glimchb done: https://bugs.launchpad.net/fwts/+bug/2013208

samerhaj commented 1 year ago

patch already sent, and should be fixed in next FWTS release: https://lists.ubuntu.com/archives/fwts-devel/2023-March/013614.html

daniileg commented 1 year ago

I compared the SMBIOS table coverage of https://github.com/opiproject/smbios-validation-tool and https://git.launchpad.net/fwts/tree/src/dmi/dmicheck so that we can evaluate the two tools side by side. The spreadsheet can be downloaded from https://git.launchpad.net/smbiosverification/tree .

samerhaj commented 1 year ago

A related discussion is the SMBIOS requirements for Arm-based DPUs/IPUs, which are based on the Arm BBR specification (and the SystemReady certification). The related proposal is here: https://github.com/opiproject/opi-prov-life/blob/main/DpuSystemReady.md

ballle98 commented 1 year ago

@samerhaj Is this what you are referring to? https://github.com/ARM-software/arm-enterprise-acs/blob/8b168ecc35c48488928a6e1f42b8bfac6ea2b6e4/sbbr/docs/testcase-checklist.md?plain=1#L177

| SMBIOS | Type00: BIOS Information (REQUIRED) | FWTS | 5.2.1 |
| SMBIOS | Type01: System Information (REQUIRED) | FWTS | 5.2.2 |
| SMBIOS | Type02: Baseboard (or Module) Information (RECOMMENDED) | FWTS | 5.2.3 |
| SMBIOS | Type03: System Enclosure or Chassis (REQUIRED) | FWTS | 5.2.4 |
| SMBIOS | Type07: Cache Information (REQUIRED) | FWTS | 5.2.6 |
| SMBIOS | Type08: Port Connector Information (RECOMMENDED for platforms with physical ports) | FWTS | 5.2.7 |
| SMBIOS | Type09: System Slots (REQUIRED for platforms with expansion slots) | FWTS | 5.2.8 |
| SMBIOS | Type11: OEM Strings (RECOMMENDED) | FWTS | 5.2.9 |
| SMBIOS | Type13: BIOS Language Information (RECOMMENDED) | FWTS | 5.2.10 |
| SMBIOS | Type15: System Event Log (RECOMMENDED) | FWTS | 5.2.11 |
| SMBIOS | Type16: Physical Memory Array (REQUIRED) | FWTS | 5.2.12 |
| SMBIOS | Type17: Memory Device (REQUIRED) | FWTS | 5.2.13 |
| SMBIOS | Type19: Memory Array Mapped Address (REQUIRED) | FWTS | 5.2.14 |
| SMBIOS | Type32: System Boot Information (REQUIRED) | FWTS | 5.2.15 |
| SMBIOS | Type38: IPMI Device Information (REQUIRED for platforms with IPMI BMC Host Interface) | FWTS | 5.2.16 |
| SMBIOS | Type41: Onboard Devices Extended Information (RECOMMENDED) | FWTS | 5.2.17 |
| SMBIOS | Redfish Host interface support (RECOMMENDED) | FWTS | 5.2.18 |
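
If the checklist rows above were to feed a validation script, they could be parsed directly from the markdown. The following is a sketch that assumes rows keep the `| SMBIOS | TypeNN: Name (LEVEL ...) | suite | section |` shape shown here; `parse_checklist` is a hypothetical helper, and rows without a `TypeNN:` prefix (such as the Redfish one) yield `type` set to `None`.

```python
import re

# One checklist row; the TypeNN prefix and the condition text are optional.
ROW_RE = re.compile(
    r"^\|\s*SMBIOS\s*\|\s*(?:Type(?P<type>\d+):\s*)?(?P<name>.+?)\s*"
    r"\((?P<level>REQUIRED|RECOMMENDED)(?P<condition>[^)]*)\)\s*"
    r"\|\s*(?P<suite>\S+)\s*\|\s*(?P<section>\S+)\s*\|\s*$",
    re.MULTILINE,
)


def parse_checklist(markdown):
    """Return one dict per SMBIOS checklist row found in the markdown text."""
    rows = []
    for m in ROW_RE.finditer(markdown):
        rows.append({
            "type": int(m["type"]) if m["type"] else None,
            "name": m["name"],
            "level": m["level"],
            "condition": m["condition"].strip() or None,
            "section": m["section"],
        })
    return rows


sample = """\
| SMBIOS | Type02: Baseboard (or Module) Information (RECOMMENDED) | FWTS | 5.2.3 |
| SMBIOS | Type09: System Slots (REQUIRED for platforms with expansion slots) | FWTS | 5.2.8 |
| SMBIOS | Redfish Host interface support (RECOMMENDED) | FWTS | 5.2.18 |
"""

for row in parse_checklist(sample):
    print(row["type"], row["level"], row["condition"])
```

The parenthesised "(or Module)" in the Type 2 name is kept as part of the name because only a parenthesis that starts with REQUIRED or RECOMMENDED is treated as the level marker.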

samerhaj commented 1 year ago

@ballle98 Essentially yes, although this is a high-level checklist of the test cases. The actual requirements are in this spec: https://developer.arm.com/documentation/den0044/latest (section 9). Arm is open to feedback on how those can be enhanced to address any gaps in OPI-specific requirements.