ansible-collections / community.general

Ansible Community General Collection
https://galaxy.ansible.com/ui/repo/published/community/general/
GNU General Public License v3.0

redfish_command: UEFI one time boot is set but module errors with HTTP 400 on Dell 13th gen #5091

Open bluikko opened 2 years ago

bluikko commented 2 years ago

Summary

Working with Dell 13th gen servers such as R730, the redfish_command module cannot set one time boot setting correctly in UEFI mode. In BIOS mode everything works fine but in UEFI mode there are 2 issues:

  1. The module always fails with HTTP error 400: Unable to set the attribute value because it depends on other attribute(s). but the one time boot setting is indeed set, the "Next Boot" setting in iDRAC console is indeed changed and the server does follow the commanded one time boot on the next server boot.
  2. Setting one time boot seems to run as a BIOS Configuration Job in the Lifecycle Controller, contrary to BIOS mode where the change does not cause a BIOS Configuration Job to run. I do not know if this is fixable or if it is a "feature" of the UEFI mode on these Dells. When setting one time boot in BIOS mode, the console just shows on the next boot, for example, IPMI: boot to PXE requested and continues to boot from PXE. In UEFI mode the boot shows Lifecycle Controller: System Configuration Requested and runs a BIOS Configuration job, similarly to changing any BIOS setting; after the configuration job completes, the server reboots once again and displays IPMI: Boot to configured UEFI device path. This is much slower.
    • Why is there a BIOS configuration job running if the "Next Boot" setting can be seen in iDRAC to change immediately after Ansible task is run? The one time boot setting seems to take effect immediately and not require a BIOS configuration job so what is this configuration job actually doing?
    • If one time boot is set manually in iDRAC then the server will boot similarly to BIOS mode: without a BIOS configuration job and simply printing IPMI: boot to PXE requested during the bootup and continue to do a PXE boot.

Additionally, the example in documentation for module redfish_command using UEFI one time boot is not very clear, currently having parameter:

    uefi_target: "/0x31/0x33/0x01/0x01"

Where does this hex string come from? What does it mean?

Issue Type

Bug Report

Component Name

redfish_command

Ansible Version

$ ansible --version
ansible [core 2.11.12]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/x/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/ansible/lib64/python3.6/site-packages/ansible
  ansible collection location = /home/x/.ansible/collections:/usr/share/ansible/collections
  executable location = /opt/ansible/bin/ansible
  python version = 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
  jinja version = 3.0.2
  libyaml = True

Community.general Version

$ ansible-galaxy collection list community.general
# /usr/share/ansible/collections/ansible_collections
Collection        Version
----------------- -------
community.general 5.0.1

Configuration

$ ansible-config dump --only-changed
I do not think this is relevant.

OS / Environment

EL7, Dell R730xd BIOS 2.13.0, Dell Lifecycle Controller 2.81.81.81

Steps to Reproduce

- name: Configure one time boot
  community.general.redfish_command:
    baseuri: "{{ redfish_uri }}"
    category: Systems
    command: SetOneTimeBoot
    bootdevice: UefiTarget
    uefi_target: MAC(010203040506)
    username: "{{ redfish_user }}"
    password: "{{ redfish_password }}"
    timeout: "{{ redfish_timeout }}"
  become: false
  delegate_to: localhost

Expected Results

The module should not error, especially if the boot setting is set correctly in the system.

Actual Results

fatal: [host.example.com -> localhost]: FAILED! => changed=false
  msg: 'HTTP Error 400 on PATCH request to ''https://192.0.2.1/redfish/v1/Systems/System.Embedded.1'', extended message: ''Unable to set the attribute value because it depends on other attribute(s).'''

It does not matter if boot_override_mode is included (set to UEFI) or not included.

ansibullbot commented 2 years ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

ansibullbot commented 2 years ago

cc @bhavya06 @mraineri @rajeevkallur @renxulei @tomasg2012 @xmadsen

bluikko commented 2 years ago

Since in iDRAC a "PXE" boot can be requested the same way in both BIOS/UEFI modes, for example in iDRAC console Next Boot -> PXE, I tested to use the same task for commanding PXE boot that I use in BIOS mode:

- name: Configure one time boot
  community.general.redfish_command:
    baseuri: "{{ redfish_uri }}"
    category: Systems
    command: SetOneTimeBoot
    bootdevice: PXE
    username: "{{ redfish_user }}"
    password: "{{ redfish_password }}"
    timeout: "{{ redfish_timeout }}"
  become: false
  delegate_to: localhost

Earlier when I tested this task on a server set to UEFI boot, the server booted to the Lifecycle Controller, where the user can do various management tasks; it did not boot to PXE. But now when I tested it again it did correctly do a PXE boot...

So it seems the problem is happening only with bootdevice: UefiTarget which I do not seem to then need in the end and will just use a bootdevice: PXE same as in BIOS mode.
But anyways it looks like something is wrong with bootdevice: UefiTarget with the 400 HTTP error.

mraineri commented 2 years ago

Are you able to add boot_override_mode: UEFI to your request? If the boot override mode is currently configured as Legacy on the system, then that would explain why it's rejecting your request; it won't allow you to specify a UEFI-only type of boot if it's in legacy mode.

However, the fact that iDRAC returned a 400 and it did modify some things internally is absolutely a bug on the iDRAC-side of things; any error response to a modification request is expected to leave the service unmodified.

The example provided for UEFI target is a poor example and that should be fixed. The value is supposed to be a representation of a UEFI device path as a string. There is documentation about this structure from the UEFI community here (with examples): https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/3_foundation/readme.9

The thing to keep in mind with using UEFI target is it's to allow a user to specify any device enumerated in UEFI as a boot target. Using other options like PXE or USB will defer decisions to BIOS to "pick what makes the most sense" for the boot device given the specified parameter.

PXE can be invoked in either legacy or UEFI boot modes, so there's no dependency like there is with specifying a UEFI target.

bluikko commented 2 years ago

Are you able to add boot_override_mode: UEFI to your request?

As listed in OP, adding boot_override_mode made no difference.

If the boot override mode is currently configured as Legacy on the system, then that would explain why it's rejecting your request; it won't allow you to specify a UEFI-only type of boot if it's in legacy mode.

The system is currently already configured to be in UEFI mode. It is booting with Secure Boot successfully.
Manually changing the "Next Boot" to PXE boots correctly the Secure Boot image and the installed system has Secure Boot enabled. I believe this indicates that the system is currently configured correctly in UEFI mode.

However, the fact that iDRAC returned a 400 and it did modify some things internally is absolutely a bug on the iDRAC-side of things; any error response to a modification request is expected to leave the service unmodified.

Perhaps if there is only a single request to the redfish server. I cannot say whether that is the case or not.
Is it possible that two or more requests are being sent to the server and only one of them fails with 400?

The example provided for UEFI target is a poor example and that should be fixed. The value is supposed to be a representation of a UEFI device path as a string. There is documentation about this structure from the UEFI community here (with examples): https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/3_foundation/readme.9

I think this could be perhaps referred to in the redfish_command documentation block and/or update the example target string to be more understandable.
Even more critical (IMHO) than the structure of the byte values is to understand what are the listed bytes/IDs. A competent operator can figure out the structure given a list of the byte values, I assume.

The link did not provide the IDs 0x31 or 0x33 either. 0x31 is not listed as a value in the "Device path header" so I guess it is not that one. So I assume it is a "PCI device path", but where to find what byte values to use (I assume 0x01 and 0x01 if 0x33 0x31 is the header - not sure where to find that one either) ? Earlier I even looked at the very lengthy UEFI 2.6 spec https://uefi.org/sites/default/files/resources/UEFI%20Spec%202_6.pdf and could not decode the example but maybe I just did not find the right place in the lengthy doc...
Edit: I just understood the provided link. The header referred to in the Example 4 is the Example 3. So how can the example in redfish_command be so short, is it really a valid UEFI boot target path? All the listings I saw in the UEFI spec 2.6 are much, much longer as well.

It is all very confusing to the uninitiated.
For example what is the byte ordering, does that come into play. Or are those 4 byte values actually a path separated by a slash / and not just 4 byte values...

The thing to keep in mind with using UEFI target is it's to allow a user to specify any device enumerated in UEFI as a boot target. Using other options like PXE or USB will defer decisions to BIOS to "pick what makes the most sense" for the boot device given the specified parameter.

PXE can be invoked in either legacy or UEFI boot modes, so there's no dependency like there is with specifying a UEFI target.

Yes, luckily I found this out so the bug is not a blocker for me. It could be useful if I could just boot to a target like MAC(010203040506) but I could work around it.
However looking at the link you provided, the target such as MAC(010203040506) seems to not be a valid boot target. Perhaps it is just something the server displays to the operator when doing a network boot without being a valid UEFI boot target?
It seems to be incredibly difficult (to the uninitiated) to find out how to boot from a specific NIC.

mraineri commented 2 years ago

Are you able to add boot_override_mode: UEFI to your request?

As listed in OP, adding boot_override_mode made no difference.

I missed that statement; this is definitely a bug on the iDRAC-side of things. Do you have the firmware version available? I'd like to see if this has been addressed in newer releases.

If the boot override mode is currently configured as Legacy on the system, then that would explain why it's rejecting your request; it won't allow you to specify a UEFI-only type of boot if it's in legacy mode.

The system is currently already configured to be in UEFI mode. It is booting with Secure Boot successfully. Manually changing the "Next Boot" to PXE boots correctly the Secure Boot image and the installed system has Secure Boot enabled. I believe this indicates that the system is currently configured correctly in UEFI mode.

However, the fact that iDRAC returned a 400 and it did modify some things internally is absolutely a bug on the iDRAC-side of things; any error response to a modification request is expected to leave the service unmodified.

Perhaps if there is only a single request to the redfish server. I cannot say whether that is the case or not. Is it possible that two or more requests are being sent to the server and only one of them fails with 400?

There are GET operations performed to find the appropriate system, but only a single PATCH request is made to set the one-time boot target.
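As a rough illustration of the "GETs to find the system, then one PATCH" flow described above, here is a minimal Python sketch. It is not the module's actual code: first_system_uri and fake_fetch are hypothetical names, and the walk (service root, then the Systems collection, then its first member) is an assumption based on the usual Redfish layout. The fake service stands in for real HTTP GETs so the sketch runs without a BMC.

```python
def first_system_uri(fetch):
    """Walk service root -> Systems collection -> first member.

    `fetch` is any callable that takes a path and returns the decoded JSON
    body as a dict (hypothetical helper, stands in for an HTTP GET).
    """
    root = fetch("/redfish/v1/")
    collection = fetch(root["Systems"]["@odata.id"])
    return collection["Members"][0]["@odata.id"]


# Stand-in for a real service; only the fields used above are present.
fake_service = {
    "/redfish/v1/": {"Systems": {"@odata.id": "/redfish/v1/Systems"}},
    "/redfish/v1/Systems": {
        "Members": [{"@odata.id": "/redfish/v1/Systems/System.Embedded.1"}]
    },
}

calls = []


def fake_fetch(path):
    # Record each GET so the number of read requests is visible.
    calls.append(path)
    return fake_service[path]


system_uri = first_system_uri(fake_fetch)
print(system_uri)
print(calls)
```

Only read requests happen during discovery in this sketch; the single write would be the PATCH sent to the URI it returns.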

The example provided for UEFI target is a poor example and that should be fixed. The value is supposed to be a representation of a UEFI device path as a string. There is documentation about this structure from the UEFI community here (with examples): https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/3_foundation/readme.9

I think this could be perhaps referred to in the redfish_command documentation block and/or update the example target string to be more understandable. Even more critical (IMHO) than the structure of the byte values is to understand what are the listed bytes/IDs. A competent operator can figure out the structure given a list of the byte values, I assume.

The link did not provide the IDs 0x31 or 0x33 either. 0x31 is not listed as a value in the "Device path header" so I guess it is not that one. So I assume it is a "PCI device path", but where to find what byte values to use (I assume 0x01 and 0x01 if 0x33 0x31 is the header - not sure where to find that one either) ? Earlier I even looked at the very lengthy UEFI 2.6 spec https://uefi.org/sites/default/files/resources/UEFI%20Spec%202_6.pdf and could not decode the example but maybe I just did not find the right place in the lengthy doc...

It is all very confusing to the uninitiated. For example what is the byte ordering, does that come into play. Or are those 4 byte values actually a path separated by a slash / and not just 4 byte values...

I would not expect a simple sequence of bytes to be specified like we have in our example today. I would expect something like one of the examples in the linked document like Acpi(PNP0A03,0)/Pci(1F|1)/Ata(Secondary,Master), which specifies a path from a PCIe root port to an IDE controller.
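To make the difference between the two styles concrete, the following Python sketch splits a textual device path into its nodes and checks whether each node has the Name(arguments) shape used in the linked examples. It is a rough heuristic under stated assumptions, not a UEFI parser: split_device_path and looks_like_node are hypothetical helpers, and the shape check will not cover every node type the spec allows.

```python
import re

# Heuristic: a node name followed by a parenthesised argument list.
NODE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]*\(.*\)$")


def split_device_path(text):
    """Split on '/' separators that sit outside parentheses."""
    nodes, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "/" and depth == 0:
            if current:
                nodes.append(current)
            current = ""
        else:
            current += ch
    if current:
        nodes.append(current)
    return nodes


def looks_like_node(node):
    """True if the node has the Name(arguments) shape."""
    return bool(NODE_RE.match(node))


for path in (
    "Acpi(PNP0A03,0)/Pci(1F|1)/Ata(Secondary,Master)",
    "MAC(010203040506)",
    "/0x31/0x33/0x01/0x01",
):
    nodes = split_device_path(path)
    print(path, nodes, all(looks_like_node(n) for n in nodes))
```

With this check, the documentation example /0x31/0x33/0x01/0x01 splits into four bare hex values, none of which look like device path nodes, while the Acpi(...)/Pci(...)/Ata(...) and MAC(...) strings do.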

The thing to keep in mind with using UEFI target is it's to allow a user to specify any device enumerated in UEFI as a boot target. Using other options like PXE or USB will defer decisions to BIOS to "pick what makes the most sense" for the boot device given the specified parameter. PXE can be invoked in either legacy or UEFI boot modes, so there's no dependency like there is with specifying a UEFI target.

Yes, luckily I found this out so the bug is not a blocker for me. It could be useful if I could just boot to a target like MAC(010203040506) but I could work around it. However looking at the link you provided, the target such as MAC(010203040506) seems to not be a valid boot target. Perhaps it is just something the server displays to the operator when doing a network boot without being a valid UEFI boot target? It seems to be incredibly difficult (to the uninitiated) to find out how to boot from a specific NIC.

While MAC(010203040506) might be valid per the UEFI spec, the given system might not enumerate all device paths in that manner. Some UEFI implementations might support that, but others might only enumerate the device from the PCIe path style of enumeration. The unfortunate thing when working with UEFI data like this is it generally requires intimate knowledge about the system in question to get it right.

bluikko commented 2 years ago

Are you able to add boot_override_mode: UEFI to your request?

As listed in OP, adding boot_override_mode made no difference.

I missed that statement; this is definitely a bug on the iDRAC-side of things. Do you have the firmware version available? I'd like to see if this has been addressed in newer releases.

The versions:

OS / Environment EL7, Dell R730xd BIOS 2.13.0, Dell Lifecycle Controller 2.81.81.81

There is a newer LCC/DRAC version 2.83.83.83 but the changelog seems not relevant:

Fixes:

  • Fixed an issue to allow TLS protocol to be configurable on port 5900.
  • Fixed an issue on SupportAssit QR code to open the correct website.
  • Fixed an issue on "downloads.dell.com" is failure with HTTPS network after failing with HTTP network.
  • Fixed an issue on setting Max DNS IP name for IPv6 through ManualDNSEntry attribute.
  • Fixed an issue for IPV6 address should be case sensitive for Racadm/Redfish commands.
  • IPS: Keyboard interactive authentication failure with iDRAC8 using Teraterm.

If the boot override mode is currently configured as Legacy on the system, then that would explain why it's rejecting your request; it won't allow you to specify a UEFI-only type of boot if it's in legacy mode.

The system is currently already configured to be in UEFI mode. It is booting with Secure Boot successfully. Manually changing the "Next Boot" to PXE boots correctly the Secure Boot image and the installed system has Secure Boot enabled. I believe this indicates that the system is currently configured correctly in UEFI mode.

However, the fact that iDRAC returned a 400 and it did modify some things internally is absolutely a bug on the iDRAC-side of things; any error response to a modification request is expected to leave the service unmodified.

Perhaps if there is only a single request to the redfish server. I cannot say whether that is the case or not. Is it possible that two or more requests are being sent to the server and only one of them fails with 400?

There are GET operations performed to find the appropriate system, but only a single PATCH request is made to set the one-time boot target.

Then it sounds like a possible DRAC bug...

The example provided for UEFI target is a poor example and that should be fixed. The value is supposed to be a representation of a UEFI device path as a string. There is documentation about this structure from the UEFI community here (with examples): https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/3_foundation/readme.9

I think this could be perhaps referred to in the redfish_command documentation block and/or update the example target string to be more understandable. Even more critical (IMHO) than the structure of the byte values is to understand what are the listed bytes/IDs. A competent operator can figure out the structure given a list of the byte values, I assume. The link did not provide the IDs 0x31 or 0x33 either. 0x31 is not listed as a value in the "Device path header" so I guess it is not that one. So I assume it is a "PCI device path", but where to find what byte values to use (I assume 0x01 and 0x01 if 0x33 0x31 is the header - not sure where to find that one either) ? Earlier I even looked at the very lengthy UEFI 2.6 spec https://uefi.org/sites/default/files/resources/UEFI%20Spec%202_6.pdf and could not decode the example but maybe I just did not find the right place in the lengthy doc... It is all very confusing to the uninitiated. For example what is the byte ordering, does that come into play. Or are those 4 byte values actually a path separated by a slash / and not just 4 byte values...

I would not expect a simple sequence of bytes to be specified like we have in our example today. I would expect something like one of the examples in the linked document like Acpi(PNP0A03,0)/Pci(1F|1)/Ata(Secondary,Master), which specifies a path from a PCIe root port to an IDE controller.

This looks like the targets in the UEFI spec. However this makes it very difficult to boot from a specific NIC since instead of identifying a NIC I would need to find out PCI addresses from the system...

The thing to keep in mind with using UEFI target is it's to allow a user to specify any device enumerated in UEFI as a boot target. Using other options like PXE or USB will defer decisions to BIOS to "pick what makes the most sense" for the boot device given the specified parameter. PXE can be invoked in either legacy or UEFI boot modes, so there's no dependency like there is with specifying a UEFI target.

Yes, luckily I found this out so the bug is not a blocker for me. It could be useful if I could just boot to a target like MAC(010203040506) but I could work around it. However looking at the link you provided, the target such as MAC(010203040506) seems to not be a valid boot target. Perhaps it is just something the server displays to the operator when doing a network boot without being a valid UEFI boot target? It seems to be incredibly difficult (to the uninitiated) to find out how to boot from a specific NIC.

While MAC(010203040506) might be valid per the UEFI spec, the given system might not enumerate all device paths in that manner. Some UEFI implementations might support that, but others might only enumerate the device from the PCIe path style of enumeration. The unfortunate thing when working with UEFI data like this is it generally requires intimate knowledge about the system in question to get it right.

I could not say if it is a valid target or not but the server does list such a string when selecting manually PXE boot.
Indeed it seems to be very difficult, I am pretty sure I have spent a two-digit number of hours just trying to find out how to choose a NIC to boot from.

bluikko commented 2 years ago

There are GET operations performed to find the appropriate system, but only a single PATCH request is made to set the one-time boot target.

I just thought that maybe this could be confirmed by using curl to make a PATCH request to the DRAC.

mraineri commented 2 years ago

This should be the equivalent curl command if you want to experiment on your system:

curl -k -u <USERNAME:PASSWORD> -H "Content-Type: application/json" -X PATCH 'https://<BMCIP>/redfish/v1/Systems/System.Embedded.1' -d '{"Boot": {"BootSourceOverrideTarget": "UefiTarget", "BootSourceOverrideEnabled": "Once", "UefiTargetBootSourceOverride": "<Some UEFI target>", "BootSourceOverrideMode": "UEFI"}}'

mraineri commented 2 years ago

I'm trying a few different systems I have available to see if things are better on newer versions of firmware. Unfortunately I do not have anything that goes back as far as 2.XX.

mraineri commented 2 years ago

Well, nothing has really improved except for the fact that it returns with 200 with the same error message in the response body instead of 400, which is just going to hide the error completely. I'll pass this info along to others, but if you have support contacts, it would be worth providing this info to them as well.
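Since a 200 status can apparently still carry the error text, a client that wants to notice this has to look inside the body as well as at the status code. The Python sketch below shows one possible check. It rests on assumptions: has_embedded_error is a hypothetical helper, the messages are assumed to appear in an @Message.ExtendedInfo array either at the top level or under "error", and any Severity other than "OK" is treated as a problem. The exact body shape newer firmware returns was not shown in this thread.

```python
def has_embedded_error(status, body):
    """Return True if the status code or the message severities signal failure.

    `body` is the decoded JSON response as a dict. Where the messages live in
    a 200 response is an assumption; both likely locations are checked.
    """
    if status >= 400:
        return True
    messages = list(body.get("@Message.ExtendedInfo", []))
    messages += body.get("error", {}).get("@Message.ExtendedInfo", [])
    # Entries without a Severity field are treated as informational.
    return any(m.get("Severity", "OK") != "OK" for m in messages)


print(has_embedded_error(400, {}))
print(has_embedded_error(200, {}))
print(has_embedded_error(
    200,
    {"error": {"@Message.ExtendedInfo": [{"Severity": "Warning"}]}},
))
```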

bluikko commented 2 years ago

This should be the equivalent curl command if you want to experiment on your system:

curl -k -u <USERNAME:PASSWORD> -H "Content-Type: application/json" -X PATCH 'https://<BMCIP>/redfish/v1/Systems/System.Embedded.1' -d '{"Boot": {"BootSourceOverrideTarget": "UefiTarget", "BootSourceOverrideEnabled": "Once", "UefiTargetBootSourceOverride": "<Some UEFI target>", "BootSourceOverrideMode": "UEFI"}}'

Testing with this there is indeed an error - or what seems like two errors:

{"error":{"@Message.ExtendedInfo":[{"Message":"Unable to set the attribute value because it depends on other attribute(s).","MessageArgs":[],"MessageArgs@odata.count":0,"MessageId":"IDRAC.1.6.SYS444","RelatedProperties":["BootSourceOverrideEnabled"],"RelatedProperties@odata.count":1,"Resolution":"Change the value on the other attribute and retry the operation. For information about attribute dependency, see the Redfish User's Guide available on the support site.","Severity":"Warning"},{"Message":"Unable to modify the attribute because the attribute is read-only and depends on other attributes.","MessageArgs":[],"MessageArgs@odata.count":0,"MessageId":"IDRAC.1.6.SYS410","RelatedProperties":["BootSourceOverrideMode"],"RelatedProperties@odata.count":1,"Resolution":"Verify if the attribute has dependency on other attributes and retry the operation. To verify, view the attribute registry based on the type of resource.","Severity":"Warning"}],"code":"Base.1.2.GeneralError","message":"A general error has occurred. See ExtendedInfo for more information"}}
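The response is easier to read when each extended-info entry is listed with its message ID and related properties. The Python sketch below does that for a trimmed copy of the body above; summarize_extended_info is a hypothetical helper name, and only the fields it reads are kept in the sample.

```python
import json


def summarize_extended_info(body_text):
    """Return (MessageId, Message, RelatedProperties) for every entry."""
    body = json.loads(body_text)
    entries = body.get("error", {}).get("@Message.ExtendedInfo", [])
    return [
        (e.get("MessageId"), e.get("Message"), e.get("RelatedProperties", []))
        for e in entries
    ]


# Trimmed copy of the iDRAC response shown above.
sample = json.dumps({
    "error": {
        "@Message.ExtendedInfo": [
            {
                "Message": "Unable to set the attribute value because it depends on other attribute(s).",
                "MessageId": "IDRAC.1.6.SYS444",
                "RelatedProperties": ["BootSourceOverrideEnabled"],
            },
            {
                "Message": "Unable to modify the attribute because the attribute is read-only and depends on other attributes.",
                "MessageId": "IDRAC.1.6.SYS410",
                "RelatedProperties": ["BootSourceOverrideMode"],
            },
        ],
        "code": "Base.1.2.GeneralError",
    }
})

for message_id, message, related in summarize_extended_info(sample):
    print(f"{message_id} {related}: {message}")
```

Listed this way, the body shows two separate complaints: one tied to BootSourceOverrideEnabled and one tied to BootSourceOverrideMode.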

Similarly to the redfish_command module the "Next Boot" setting has been changed to UEFI Device Path. So the behavior seems same both manually and with redfish_command module.

However the above error message is much more useful than what the module returned, which was only the text of the first extended message without the related properties.

In the past I had tried to check the "Redfish User's Guide" but could not find out more about this dependency. I will try to check it again.

I'm trying a few different systems I have available to see if things are better on newer versions of firmware. Unfortunately I do not have anything that goes back as far as 2.XX.

Only the 14th and 15th generation servers have something newer than 2.x and the newest I have available is 13th gen...

Well, nothing has really improved except for the fact that it returns with 200 with the same error message in the response body instead of 400, which is just going to hide the error completely. I'll pass this info along to others, but if you have support contacts, it would be worth providing this info to them as well.

Unfortunately I do not have any contacts whatsoever in Dell.

bluikko commented 2 years ago

While MAC(010203040506) might be valid per the UEFI spec, the given system might not enumerate all device paths in that manner. Some UEFI implementations might support that, but others might only enumerate the device from the PCIe path style of enumeration. The unfortunate thing when working with UEFI data like this is it generally requires intimate knowledge about the system in question to get it right.

I had another look at the UEFI spec, found chapter 9 and totally understood it. My thinking was that perhaps the device path I am trying to give is not valid and that could cause errors.
Unfortunately it was not useful: MAC(010203040506) (or MAC(010203040506,0)) should indeed be a valid device path as per the spec; next I was thinking about representation of that device path. It seems that this textual representation is part of the spec and may indeed be the expected format since the raw structure byte values could (apparently) not be transmitted in JSON to redfish.

In the past I had tried to check the "Redfish User's Guide" but could not find out more about this dependency. I will try to check it again.

Had another look at this also. The "Redfish User's Guide" referred to in the error does not seem to exist; but "iDRAC7/8 Redfish API Reference Guide 1.0" does not list BootSourceOverrideTarget as a property of a ComputerSystem. It only lists BootSource.
Additionally, BootSourceOverrideMode is listed first time as an attribute in the version 2.60 of iDRAC document, and the installed firmware is of later version: https://www.dell.com/support/manuals/en-sr/idrac7-8-lifecycle-controller-v2.60.60.60/idrac_2.60.60.60_redfishapiguide/computersystem?guid=guid-8790c616-fe2d-4829-8c49-92e7122d66f5&lang=en-us
But changing BootSourceOverrideEnabled to BootSource made no difference.

Based on the available documentation from Dell there seems to be no explanation for this error.
I agree with your suggestion that this is an issue on Dell side - it could be a documentation issue as well if something critical is missing from the documentation.
Having looked only at the Dell documentation I have no explanation why BootSourceOverrideEnabled would ever work as it is not listed at all...

mraineri commented 2 years ago

I would fully expect the textual representation to be supported in Redfish. The mockups from the DMTF that have sample UEFI device paths show the textual format rather than the raw bytes. I suspect there may be limitations specific to iDRAC in terms of what types of UEFI device paths you can convey (perhaps the MAC(XXXXXXXX) format simply isn't supported by iDRAC).

There wouldn't be much on this specific subject in the Redfish User Guide. Generally speaking, if you're providing the full desired configuration with correct values, it should be accepted. The error you're encountering is a very iDRAC-specific limitation. I suspect this type of boot override is not well used, hence why it's fairly broken at the moment.

mraineri commented 2 years ago

I've been pointed to the following script as an example: https://github.com/dell/iDRAC-Redfish-Scripting/blob/7d1e39ab6277bb09da7de6d0e842acbbd25a2cce/Redfish%20Python/SetNextOneTimeBootDeviceREDFISH.py

One thing I noticed is the BootSourceOverrideEnabled is not specified in the request. Removing that property from the curl request I built allows the PATCH to go through. This is rather unfortunate since my expectation is if a user wants to configure a one-time boot, they'd set BootSourceOverrideEnabled to Once... This restriction looks to be specific for UEFI-specific boot override options (such as UefiTarget).
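The difference between the two request bodies can be sketched in Python as follows. This is an illustration only, not how redfish_command builds its request: build_boot_patch is a hypothetical helper, and the MAC(010203040506) value is the device path from the original report, reused here as a placeholder.

```python
import json


def build_boot_patch(target, uefi_target=None, enabled="Once"):
    """Build a PATCH body for the Boot object.

    Passing enabled=None leaves BootSourceOverrideEnabled out entirely,
    which is the variant that iDRAC accepted in the test described above.
    """
    boot = {"BootSourceOverrideTarget": target}
    if uefi_target is not None:
        boot["UefiTargetBootSourceOverride"] = uefi_target
    if enabled is not None:
        boot["BootSourceOverrideEnabled"] = enabled
    return {"Boot": boot}


# Body with all properties, as a client would normally send it.
standard = build_boot_patch("UefiTarget", "MAC(010203040506)")

# Body without BootSourceOverrideEnabled (the iDRAC-specific variant).
idrac_variant = build_boot_patch("UefiTarget", "MAC(010203040506)", enabled=None)

print(json.dumps(standard))
print(json.dumps(idrac_variant))
```

Keeping the property behind a parameter rather than dropping it outright reflects the concern that other vendors may still expect BootSourceOverrideEnabled in the request.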

bluikko commented 2 years ago

One thing I noticed is the BootSourceOverrideEnabled is not specified in the request. Removing that property from the curl request I built allows the PATCH to go through.

Yes, this works.

curl -k -u USER:PASS -H "Content-Type: application/json" -X PATCH 'https://192.0.2.1/redfish/v1/Systems/System.Embedded.1' -d '{"Boot": {"UefiTargetBootSourceOverride": "VenHw(01020304-0102-0102-0102-010203040506)", "BootSourceOverrideTarget": "UefiTarget"}}'

Unfortunately it results in a Lifecycle Controller Configuration Job (as listed in the OP), which is time-consuming as the server takes quite a while to boot even 1 time. It seems to be unavoidable in UEFI mode.

This is rather unfortunate since my expectation is if a user wants to configure a one-time boot, they'd set BootSourceOverrideEnabled to Once... This restriction looks to be specific for UEFI-specific boot override options (such as UefiTarget).

Even without BootSourceOverrideEnabled: Once the boot override affects only the next boot (at least on this 13th gen Dell), meaning a one-time boot is configured.

So the redfish_command module should not set the BootSourceOverrideEnabled parameter.
But I have looked at both Lenovo and HP Redfish references and they do support this parameter if I am not mistaken - so how to keep this module working across different manufacturers.
For example HP has a "Note" at https://support.hpe.com/hpesc/public/docDisplay?docId=a00118967en_us&docLocale=en_US&page=GUID-DE9B5289-443B-494A-9980-0F023C365F3D.html that says UefiTargetBootSourceOverride must be updated before BootSourceOverrideTarget - not sure how that works in the real world since I do not have HP servers to test on.

Or, perhaps other servers work similarly to the Dell and setting BootSourceOverrideEnabled is not necessary to get a one-time boot using UefiTarget?

mraineri commented 2 years ago

One thing I noticed is the BootSourceOverrideEnabled is not specified in the request. Removing that property from the curl request I built allows the PATCH to go through.

Yes, this works.

curl -k -u USER:PASS -H "Content-Type: application/json" -X PATCH 'https://192.0.2.1/redfish/v1/Systems/System.Embedded.1' -d '{"Boot": {"UefiTargetBootSourceOverride": "VenHw(01020304-0102-0102-0102-010203040506)", "BootSourceOverrideTarget": "UefiTarget"}}'

Unfortunately it results in a Lifecycle Controller Configuration Job (as listed in the OP), which is time-consuming as the server takes quite a while to boot even 1 time. It seems to be unavoidable in UEFI mode.

That's how Dell implemented that capability; boot overrides result in an LC job that is consumed by BIOS on the next reset. Unfortunately that's outside of Ansible's control and is part of the system design.

This is rather unfortunate since my expectation is if a user wants to configure a one-time boot, they'd set BootSourceOverrideEnabled to Once... This restriction looks to be specific for UEFI-specific boot override options (such as UefiTarget).

Even without BootSourceOverrideEnabled: Once the boot override affects only the next boot (at least on this 13th gen Dell), meaning a one-time boot is configured.

So the redfish_command module should not set the BootSourceOverrideEnabled parameter. But I have looked at both Lenovo and HP Redfish references and they do support this parameter if I am not mistaken - so how to keep this module working across different manufacturers. For example HP has a "Note" at https://support.hpe.com/hpesc/public/docDisplay?docId=a00118967en_us&docLocale=en_US&page=GUID-DE9B5289-443B-494A-9980-0F023C365F3D.html that says UefiTargetBootSourceOverride must be updated before BootSourceOverrideTarget - not sure how that works in the real world since I do not have HP servers to test on.

Or, perhaps other servers work similarly to the Dell and setting BootSourceOverrideEnabled is not necessary to get a one-time boot using UefiTarget?

For Dell specifically, setting the BootSourceOverrideEnabled property isn't required when performing a UEFI target one-time boot; it does this automatically for you. But this goes against expectations from the spec; BootSourceOverrideEnabled is its own property and can be set independently of other properties (like almost anything on a RESTful interface). In addition, it takes control away from the user over how to configure that override option; maybe they want to queue it up before switching it from "Disabled" to "Once" to double-check settings.

Just thinking about things from a pure REST perspective, if a user has a set of properties to configure on a resource, it will provide each of those desired properties; how that ultimately gets configured underneath the covers is up to the server. Forcing a user to perform things in two steps is a bad design choice.

However, looking at that HPE spec I'm wondering if they're really stating you configure one in its own request and then send another request to set the other property; I think it can also be interpreted as "If you want to use UefiTarget, you also need to ensure UefiTargetBootSourceOverride is configured as well". Sending a PATCH to set UefiTarget while keeping UefiTargetBootSourceOverride as null I can certainly see as something that can be rejected (in this case the user is performing a misconfiguration). I can certainly test this out on other systems and reach out to HPE folks about this.

mraineri commented 2 years ago

I was able to confirm that on HPE systems, providing both properties in the same request is valid. The documentation is supposed to indicate that you cannot have a target of "None" when specifying UefiTarget; as long as you're also providing the UEFI device path to configure the property away from None, it'll go through properly.
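The rule described here (UefiTarget is only valid when a UEFI device path is supplied in the same request) could be sketched as a small payload builder. This is an illustration under assumptions: `build_boot_override` is a hypothetical name, not a function from the module, and only the three Redfish property names come from this thread.

```python
def build_boot_override(target, uefi_device_path=None, enabled=None):
    """Return a Redfish Boot PATCH body as a dict (hypothetical helper).

    Raises ValueError for the misconfiguration discussed above: asking for
    UefiTarget while leaving UefiTargetBootSourceOverride unset.
    """
    boot = {"BootSourceOverrideTarget": target}
    if target == "UefiTarget":
        if not uefi_device_path:
            raise ValueError(
                "UefiTarget requires UefiTargetBootSourceOverride to be set"
            )
        # Both properties go into one request, which was confirmed to be
        # accepted on HPE systems.
        boot["UefiTargetBootSourceOverride"] = uefi_device_path
    if enabled is not None:
        boot["BootSourceOverrideEnabled"] = enabled
    return {"Boot": boot}


print(build_boot_override(
    "UefiTarget",
    uefi_device_path="VenHw(01020304-0102-0102-0102-010203040506)",
    enabled="Once",
))
```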

bluikko commented 2 years ago

Is there any hope of making the module work on such Dells? It may be impossible since the module would not know that the target is a Dell server that requires not using BootSourceOverrideEnabled?

If it cannot be fixed reasonably then I guess this should be closed.

mraineri commented 2 years ago

I'll try to think of ways to work around this. Nothing obvious comes to mind at this moment without performing additional GET operations to inspect version numbers and the vendor info. However, this behavior does need to be corrected by Dell.
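As a rough idea of what such a vendor check might look like once the extra GET has been made, here is a sketch that inspects an already-fetched service root document. It assumes the service exposes either the optional `Vendor` string or a `Dell` key under `Oem`; both are assumptions about the target system, and `looks_like_dell` is a hypothetical name rather than anything in the module.

```python
def looks_like_dell(service_root):
    """Guess whether a parsed /redfish/v1/ document comes from a Dell system.

    Hypothetical heuristic: relies on the optional "Vendor" property or on a
    "Dell" entry in "Oem", neither of which is guaranteed to be present.
    """
    vendor = (service_root.get("Vendor") or "").strip().lower()
    oem = service_root.get("Oem") or {}
    return vendor == "dell" or "Dell" in oem


# Made-up sample documents, for illustration only.
print(looks_like_dell({"Vendor": "Dell"}))
print(looks_like_dell({"Oem": {"Hpe": {}}}))
```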

mraineri commented 2 years ago

@bluikko as part of refactoring work I did, I added changes to retry the request (with properties removed) if it fails and detects the system is a Dell system. Would you be able to try out the latest changes in the main branch? Unfortunately I don't have access to a 13G system to verify the change myself, but I was able to instrument a mockup to see if my changes had the desired effect.
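The retry described above could look roughly like the sketch below. This is not the code from the main branch: `patch_with_fallback`, `send_patch`, and `is_dell` are hypothetical names, and the choice of BootSourceOverrideEnabled as the property to drop is an assumption based on the earlier curl experiments in this thread.

```python
def patch_with_fallback(send_patch, is_dell, payload,
                        drop=("BootSourceOverrideEnabled",)):
    """Send a Boot PATCH; on HTTP 400 from a Dell system, retry without `drop`.

    send_patch(payload) -> HTTP status code (hypothetical callable)
    is_dell() -> bool (hypothetical callable, only consulted after a failure)
    Returns (final_status, payload_that_was_sent_last).
    """
    status = send_patch(payload)
    if status != 400 or not is_dell():
        return status, payload
    reduced = {
        "Boot": {k: v for k, v in payload["Boot"].items() if k not in drop}
    }
    return send_patch(reduced), reduced


# Fake transport that rejects any body containing BootSourceOverrideEnabled,
# imitating the behavior reported for the 13th gen iDRAC.
def fake_send(body):
    return 400 if "BootSourceOverrideEnabled" in body["Boot"] else 200


full = {"Boot": {"BootSourceOverrideTarget": "UefiTarget",
                 "UefiTargetBootSourceOverride": "VenHw(x)",
                 "BootSourceOverrideEnabled": "Once"}}
print(patch_with_fallback(fake_send, lambda: True, full))
```

Checking the vendor only after a failed request keeps the normal path free of extra GET operations on systems that accept the full payload.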

bluikko commented 2 years ago

Would you be able to try out the latest changes in the main branch?

I can test with some different Dell systems including 13th gen. First I will need to find out how I can get the testing version of the module... Is it possible to just temporarily replace some files with the master branch versions? Would you happen to have a link to a website that discusses testing these modules?

mraineri commented 2 years ago

This should install the module from the main branch:

ansible-galaxy collection install git@github.com:ansible-collections/community.general.git,main --force

And then when done, you can revert to a particular version with this:

ansible-galaxy collection install community.general:5.8.0 --force

bluikko commented 1 year ago

I finally have a chance to test this. Are the changes in #5425? Unfortunately I would need to pick just these changes and not the full main branch.

mraineri commented 1 year ago

@bluikko yes, the changes are in that PR.

ansibullbot commented 1 year ago

cc @TSKushal

ansibullbot commented 1 year ago

cc @jyundt