DMTF / Redfish-Protocol-Validator

The Redfish Protocol Validator tests the HTTP protocol behavior of a Redfish service to validate that it conforms to the Redfish specification.
Other
14 stars 12 forks source link

Intermittent false RESP_HEADERS_LINK_REL_DESCRIBED_BY errors, maybe related to Task use in Account creation #86

Open blakehilliard opened 1 month ago

blakehilliard commented 1 month ago

I have a redfish service where creating an Account is done with a task, like this:

> POST '{"UserName":"rfpvddf7", "RoleId": "ReadOnly", "Password": "blahblahblah"}' to /redfish/v1/AccountService/Accounts
HTTP/1.1 202 Accepted
...
Link: <http://redfish.dmtf.org/schemas/Task.v1_7_2.json>; rel=describedby
...
{
    "@odata.etag": "\"c297aa488d7e87beb9a69bf9e9c4b0a3\"",
    "@odata.id": "/redfish/v1/TaskService/Tasks/7a93bf9e30e74282b650282f887a2071",
    "@odata.type": "#Task.v1_7_2.Task",
    "Id": "7a93bf9e30e74282b650282f887a2071",
    "Messages": [],
    "Name": "Task 7a93bf9e30e74282b650282f887a2071",
    "Payload": {
        "HttpOperation": "POST",
        "JsonBody": "{\"UserName\": \"rfpvddf7\", \"RoleId\": \"ReadOnly\", \"Password\": null}",
        "TargetUri": "/redfish/v1/AccountService/Accounts"
    },
    "StartTime": "2024-10-08T14:29:29Z",
    "TaskMonitor": "/redfish/v1/TaskService/Tasks/7a93bf9e30e74282b650282f887a2071/Monitor",
    "TaskState": "New",
    "TaskStatus": "OK"
}

> GET /redfish/v1/TaskService/Tasks/7a93bf9e30e74282b650282f887a2071/Monitor
HTTP/1.1 201 Created
...
Link: <http://redfish.dmtf.org/schemas/ManagerAccount.v1_11_0.json>; rel=describedby
...
{
    "@Redfish.WriteableProperties": [
        "Password"
    ],
    "@odata.etag": "\"70fdff9c63bfcf9651c642cce13f4d2e\"",
    "@odata.id": "/redfish/v1/AccountService/Accounts/rfpvddf7",
    "@odata.type": "#ManagerAccount.v1_11_0.ManagerAccount",
    "AccountTypes": [
        "Redfish"
    ],
    "Enabled": true,
    "Id": "rfpvddf7",
    "Keys": {
        "@odata.id": "/redfish/v1/AccountService/Accounts/rfpvddf7/Keys"
    },
    "Links": {
        "Role": {
            "@odata.id": "/redfish/v1/AccountService/Roles/ReadOnly"
        }
    },
    "Locked": false,
    "Name": "User Account rfpvddf7",
    "Password": null,
    "RoleId": "ReadOnly",
    "UserName": "rfpvddf7"
}

Sometimes, Redfish-Protocol-Validator fails because it claims that /redfish/v1/AccountService/Accounts fails the test case RESP_HEADERS_LINK_REL_DESCRIBED_BY Link header: Services shall return the Link header containing rel=describedby on GET and HEAD requests. However, I can see that we always provide the Link header with that collection. So the first issue is that the error message is wrong when this tool hits whatever problem is causing the failure.

I noticed that when the test passes, I can see it creates a new account, gets the task to see that it is created, then does some other operations on the account. Here is a snippet from our log of https requests showing what it is doing with an Account for a passing run:

[14:51:03] "POST /redfish/v1/SessionService/Sessions HTTP/1.1" 201 558
[14:51:04] "DELETE /redfish/v1/SessionService/Sessions/Administrator1cbc22ce9695481f9d3c888abf583863 HTTP/1.1" 204 -
[14:51:04] "GET /redfish/v1/AccountService/Accounts/rfpv82c5 HTTP/1.1" 200 552
[14:51:04] "POST /redfish/v1/AccountService/Accounts HTTP/1.1" 202 623
[14:51:09] "GET /redfish/v1/TaskService/Tasks/7fa23010f20045698034ac88c9915471/Monitor HTTP/1.1" 201 552
[14:51:09] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:09] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:09] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:09] "PATCH /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 400 613
[14:51:10] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:10] "PATCH /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 400 613
[14:51:10] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:10] "PATCH /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 400 576
[14:51:10] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:10] "PATCH /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:10] "GET /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 200 552
[14:51:10] "PATCH /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 412 554
[14:51:10] "DELETE /redfish/v1/AccountService/Accounts/rfpvbcc8 HTTP/1.1" 204 -
[14:51:15] "POST /redfish/v1/SessionService/Sessions HTTP/1.1" 201 558
[14:51:15] "DELETE /redfish/v1/SessionService/Sessions/Administratorc4888a2787bb49cda92a71dca3248781 HTTP/1.1" 401 540
[14:51:15] "DELETE /redfish/v1/SessionService/Sessions/Administratorc4888a2787bb49cda92a71dca3248781 HTTP/1.1" 204 -

But when I watch our log of https requests on a failing case, I see it never checks the task for the Account creation:

[14:41:19] "POST /redfish/v1/SessionService/Sessions HTTP/1.1" 201 558
[14:41:20] "DELETE /redfish/v1/SessionService/Sessions/Administrator59a5aa56749b417da5936c9ca2c7290c HTTP/1.1" 204 -
[14:41:20] "POST /redfish/v1/AccountService/Accounts HTTP/1.1" 202 623
[14:41:25] "POST /redfish/v1/SessionService/Sessions HTTP/1.1" 201 558
[14:41:26] "DELETE /redfish/v1/SessionService/Sessions/Administrator445c624a444b4bf4b9148b49dc538444 HTTP/1.1" 401 540
[14:41:26] "DELETE /redfish/v1/SessionService/Sessions/Administrator445c624a444b4bf4b9148b49dc538444 HTTP/1.1" 204 -

I do not know why the tool behaves differently in these cases.

I also notice that, regardless of whether it passes or fails, I see this in the output of Redfish-Protocol-Validator:

ERROR:root:Accounts collection could not be read
ERROR:root:Accounts collection could not be read
ERROR:root:Accounts collection could not be read

In a failing case, I see this additional line:

ERROR:root:GET on /redfish/v1/TaskService/Tasks/9bbb588ede8942118c99f43494f5eae3/Monitor caused ConnectionError exception; building HTTP 600 response
ERROR:root:Accounts collection could not be read
ERROR:root:Accounts collection could not be read

But my log shows no attempt to GET that task monitor.

mraineri commented 1 month ago

Can you provide the HTML report and debug output (with --log-level DEBUG)?

The test itself does not check POST responses or task monitor responses. It only checks the GET and HEAD responses for the following resources:

blakehilliard commented 1 month ago

The debug output: rf_protocol_validator.log

blakehilliard commented 1 month ago

The HTML report (renamed .log so github would let me attach it): RedfishProtocolValidationReport_10_08_2024_160057.log

mraineri commented 1 month ago

I see in the HTML report it's showing an "HTTP 600" status code, which is a dummy code we use when the lower requests layers are throwing exceptions.

It looks like there is a connection being dropped (by the service or transient network issue) according to the debug output. This happens after the POST operation to make the new account on the first GET to check the task status:

DEBUG:urllib3.connectionpool:https://REDACTED:443 "POST /redfish/v1/AccountService/Accounts HTTP/1.1" 202 None
ERROR:root:GET on /redfish/v1/TaskService/Tasks/f00ef35a960d4798b783edaef98664a9/Monitor caused ConnectionError exception; building HTTP 600 response
DEBUG:root:response status = 600, method = GET, uri = /redfish/v1/AccountService/Accounts, resource_type = None, request_type = RequestType.NORMAL

I suspect because in this context we're performing a GET, and the original operation is on /redfish/v1/AccountService/Accounts, the task monitor polling is getting lumped into the results tracking. That behavior is not really correct, but the source of the issue is the fact we're taking exceptions when accessing the service.

blakehilliard commented 1 month ago

It looks like there is a connection being dropped (by the service or transient network issue)

IMO this is unlikely. This happens, in this exact spot, around 50% of the time we run this tool. But we never have problems with dropped connections in any other context.

mraineri commented 1 month ago

The log output shows it right here:

ERROR:root:GET on /redfish/v1/TaskService/Tasks/f00ef35a960d4798b783edaef98664a9/Monitor caused ConnectionError exception; building HTTP 600 response

That funnels through requests.get, which can throw ConnectionError when it can't communicate with the target server. This spot is also unique in that I don't see anything else in the log that indicates a task monitor is produced.