ibm-openbmc / dev

Product Development Project Mgmt and Tracking

GUI : System Logs #2626

Closed derick-montague closed 3 years ago

derick-montague commented 4 years ago

SME

BMC: @santoshpuranik RAS: George Ahrens PHYP:

Key conversations

Research / Design

Summary

There are multiple log types, and it is not clear whether they can live within the same table or even on the same page. Redfish has a log service, and each log type listed is a different service call. We will need to determine whether the data in each response fits the same table structure. Another potential issue is performance: showing all of these logs in one table requires calling, consolidating, and sorting the responses from each of the log service API calls.
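The consolidation step can be made concrete with a minimal sketch, assuming each LogService returns a Redfish LogEntryCollection whose members carry an ISO 8601 `Created` timestamp. The function name `merge_log_entries`, the `Source` key, and the sample payloads are hypothetical and not taken from a real BMC response.

```python
from datetime import datetime


def merge_log_entries(collections):
    """Flatten several LogEntryCollection payloads into one list, newest first.

    `collections` maps a hypothetical log-type label to a parsed JSON response.
    """
    merged = []
    for source, payload in collections.items():
        for entry in payload.get("Members", []):
            # Tag each row with its origin so one table can show or filter by log type.
            merged.append({**entry, "Source": source})
    # Parse timestamps rather than comparing strings, since UTC offsets may differ.
    merged.sort(key=lambda e: datetime.fromisoformat(e["Created"]), reverse=True)
    return merged


# Illustrative payloads only (assumed shapes, not real responses).
sample = {
    "EventLog": {"Members": [{"Id": "1", "Created": "2021-02-09T01:38:32+00:00"}]},
    "PostCodes": {"Members": [{"Id": "B1-1", "Created": "2021-02-09T02:00:00+00:00"}]},
}
rows = merge_log_entries(sample)
print([(r["Source"], r["Id"]) for r in rows])
```

Every additional log type adds one more request and more rows to sort, which is where the performance concern comes from; keeping the `Source` tag would also allow splitting the data back into separate tables if a single table proves unworkable.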

Refer to the summary section of the System Logs Epic (#1523) for details about each log type.

User Story

As a I need to in order to

InVision Prototype

Tasks

Notes

Audit Logs

I'd recommend separating things like the machine power state changing from the user triggering it, so Support can find out when a box crashed, went down, or came up in relation to the problem. My plan for audit logs was to have them in their own LogService, so when we collect service data we don't have to scrub the logs for GDPR compliance; we just don't collect the audit log. ~ Justin Thaler
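A minimal sketch of that idea, assuming the audit log is exposed as its own LogService and that service-data collection works from a list of LogService URIs. The helper name `services_to_collect`, the service name `Audit`, and the sample URIs are hypothetical.

```python
def services_to_collect(log_service_uris, excluded=("Audit",)):
    """Return the LogService URIs to include in a service-data dump."""
    return [
        uri
        for uri in log_service_uris
        # Compare the final path segment, which names the LogService.
        if uri.rstrip("/").rsplit("/", 1)[-1] not in excluded
    ]


# Hypothetical URIs for illustration.
uris = [
    "/redfish/v1/Systems/system/LogServices/EventLog",
    "/redfish/v1/Systems/system/LogServices/Audit",
    "/redfish/v1/Systems/system/LogServices/PostCodes",
]
kept = services_to_collect(uris)
print(kept)
```

Because the audit entries never enter the dump, no per-entry scrubbing is needed afterwards.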

Maintenance Logs

This is another way to get to the following screens: Deconfiguration (Guard) Records and Hardware Serviceable Events screens ~ Nicole Conser

Previous Boot Progress Indicator & Progress Indicator History

Even though these are separate menu items in ASMI, they are covered by one Redfish LogService. ~Gunnar: 7/15 Slack convo

Event/Error logs

See the discovery box notes for a link to a recorded call on 1/12/21 with the SEET group and a link to the Slack conversation in the design channel.

References

FED

UX Flow / Interaction Requirements

cURL Commands and Redfish Response

Notes

UI Checklist

Browser Tests (Chrome, Firefox, Safari (Mac), Edge (Windows))

Accessibility Tests

Test Hooks

derick-montague commented 3 years ago

Audit logs may be pushed out.

nicoleconser commented 3 years ago

@spinler gave an update on Event Log Display in a recent SEET meeting… I listened to the first ~40 mins and took some notes — see notes + recording URL in Slack. @yoshiemuranaka @derick-montague

derick-montague commented 3 years ago

Thank you @nicoleconser. I also added a System Logs discovery doc and folder and linked them in the description. I included the recording info there as well, since only the design team has access to that Slack conversation.

derick-montague commented 3 years ago

Slack conversation about logs timeline

"IBM is trying to hire someone to work on the event log stuff, right now there is nobody to do any changes"

ParishrutB commented 3 years ago

An ongoing discussion about how progress codes implementation...

Summary:

derick-montague commented 3 years ago

I don't believe this is correct: Progress/Boot Logs: /redfish/v1/Managers/BMC/LogServices/Progress. I believe it is Progress/Boot Logs: /redfish/v1/Managers/BMC/LogServices/PostCodes/Entries (related #1303, #2843). There will be a property that lets the user download the full file.

{
  "@odata.id": "/redfish/v1/Systems/system/LogServices/PostCodes/Entries",
  "@odata.type": "#LogEntryCollection.LogEntryCollection",
  "Description": "Collection of POST Code Log Entries",
  "Members": [
    {
      "@odata.id": "/redfish/v1/Systems/system/LogServices/PostCodes/Entries/B1-1",
      "@odata.type": "#LogEntry.v1_4_0.LogEntry",
      "Created": "2021-02-09T01:38:32+00:00",
      "EntryType": "Event",
      "Id": "B1-1",
      "Message": "Boot Count: 1: TS Offset: 0.0000; POST Code: 0x4331303031463030",
      "MessageArgs": [
        "1", <--- Boot Count
        "0.0000", <--- Time Stamp Offset
        "0x4331303031463030" <--- Progress Code
      ],
      "MessageId": "OpenBMC.0.1.BIOSPOSTCode",
      "Name": "POST Code Log Entry",
      "Severity": "OK",
      "AdditionalDataURI": "/redfish/v1/Systems/system/LogServices/PostCodes/attachment/B1-1"
    }
  ]
}

Also, the PEL will be added to the AdditionalDataURI property, and we need to be able to download it from the GUI. You can follow the entire conversation in Slack.
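As a rough sketch of how the GUI might turn one of these entries into a table row, the function below reads the three annotated `MessageArgs` values and the `AdditionalDataURI` from the sample response. The function name and output keys are hypothetical, and decoding the hex digits as ASCII text is an assumption about how the progress code would be displayed.

```python
def parse_post_code_entry(entry):
    """Extract boot count, timestamp offset, and progress code from a log entry."""
    boot_count, ts_offset, code_hex = entry["MessageArgs"]
    digits = code_hex[2:] if code_hex.startswith("0x") else code_hex
    # Assumption: the hex digits are the ASCII bytes of the displayed code.
    code_text = bytes.fromhex(digits).decode("ascii")
    return {
        "id": entry["Id"],
        "boot_count": int(boot_count),
        "ts_offset": float(ts_offset),
        "progress_code": code_text,
        # May be absent; the download action would be hidden in that case.
        "download_uri": entry.get("AdditionalDataURI"),
    }


# Values copied from the sample response in this comment.
entry = {
    "Id": "B1-1",
    "MessageArgs": ["1", "0.0000", "0x4331303031463030"],
    "AdditionalDataURI": "/redfish/v1/Systems/system/LogServices/PostCodes/attachment/B1-1",
}
parsed = parse_post_code_entry(entry)
print(parsed["progress_code"])  # C1001F00
```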

ParishrutB commented 3 years ago

Waiting to get some data from backend so that we can mock up the GUI - Reference chat

derick-montague commented 3 years ago

We are blocked waiting on the backend team to determine what will make it into the Redfish spec. Pari and Priyanka are going to create low-fidelity wireframes to help the engineering team understand the user needs the design team discovered.

derick-montague commented 3 years ago

@ParishrutB and @priyanka-pillai97 raised a concern today about decisions under discussion. The BMC SME is @spinler, and I have started a conversation in Slack; I will document its outcomes here.

There is discussion that if a HW issue triggers an event, that event will include only the location code and not the Part # or Serial #. It has been suggested that the user will have to look at the VPD to determine this information. This puts the following user need at risk.

As a user that is verifying that a FRU has been replaced,
I need the serial number, part number, and location code in the event log created by the faulty FRU
So that I can compare the serial and/or part number for the FRU in the location sent in the event log

If this is an issue of scope, we understand that constraint and would like to acknowledge that it will change the user's workflow. If the user is not able to get the part number and/or serial number from the event log, they will have to:

  1. Get the location code from the event log
  2. Go to the HW Inventory and Status page and find the FRU at that location code
  3. Write down the Part and/or Serial numbers
  4. Use the documented VPD to later compare and verify the successful replacement of the FRU

The users included in research verified the importance of including the VPD info in the Event log for completing this use case efficiently and effectively.
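The comparison at the end of that manual workflow can be sketched as follows, assuming the user has written down the VPD before the repair and can read the current inventory afterwards. The function name, dictionary shapes, location code, and part and serial values are all hypothetical.

```python
def fru_was_replaced(location_code, recorded, inventory):
    """Return True when the FRU now at `location_code` differs from the recorded one.

    `recorded` holds the VPD noted before the repair; `inventory` maps
    location codes to the current VPD. Both shapes are assumptions.
    """
    current = inventory.get(location_code)
    if current is None:
        raise KeyError(f"No FRU found at {location_code}")
    # A changed serial number indicates that a different unit is installed.
    return current["serial_number"] != recorded["serial_number"]


# Made-up values for illustration.
recorded = {"part_number": "PN-0001", "serial_number": "SN-AAAA"}
inventory = {"LOC-P0-C15": {"part_number": "PN-0001", "serial_number": "SN-BBBB"}}
replaced = fru_was_replaced("LOC-P0-C15", recorded, inventory)
print(replaced)
```

If the event log carried the part and serial numbers directly, `recorded` would come from the log entry itself and the manual note-taking steps would disappear.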

ParishrutB commented 3 years ago

Event logs - as per Matt Spinler (reference) - RAS team still needs to get back regarding how Service Action Flag will be implemented - whether it will be an OEM field or will be integrated with Severity field.