Add service allowing to retrieve (active) error & alarm details

gavanderhoorn commented 1 year ago

Implement a service (perhaps motoros2_interfaces/GetActiveAlarmInfo or similarly named) which clients could invoke to retrieve more information about active alarms/errors.

We could expose other data next to alarm subcode data. If available, we could perhaps have it return string descriptions of alarms & errors and other information (possible remedies?).

We can't add this to our RobotStatus publications, as that message doesn't have the necessary fields (and they can't be added either, as it should stay generic / vendor-neutral).

Edit: note that this service is not intended to be a replacement of RobotStatus. Retrieving the information returned by the service incurs overhead, and should really only be used to get more insight into alarms/errors reported via RobotStatus publications.

gavanderhoorn commented 1 year ago

Some quick thoughts on a possible design.

GetActiveAlarmInfo.srv:

Click to expand

```yaml # empty request (service returns information about all active alarms & errors). # # We could perhaps support 'filtering' by allowing users to request details # for a subset of active alarms only. --- # Each entry in this list provides detailed information about all currently # active alarms. If this list is empty, there are no active alarms. # # Note: order of entries in this list does not encode for any specific severity # or priority of active alarms. motoros2_interfaces/msg/AlarmInfo[] alarms # Alarm Number int32 number # Sub Code int32 sub_code # Alarm Name/Message string name # Contents string contents # Meaning string description # Cause, Remedy and Notes for Cause-Remedy pairs motoros2_interfaces/msg/CauseRemedyPair[] cause_remedy # Cause string cause # Remedy string remedy # Notes string notes # Each entry in this list provides detailed information about all currently # active errors. If this list is empty, there are no active errors. # # Note: order of entries in this list does not encode for any specific severity # or priority of active errors. motoros2_interfaces/msg/AlarmInfo[] errors # identical to 'alarms' above [..] ```

This service returns two lists with instances of AlarmInfo, one instance per active alarm and/or error.

I've deliberately made errors a list, even though there can only be one active error at any specific time (AFAIK). Reason is it would otherwise be difficult to communicate "no active error" (it would require setting all fields of the singular instance of AlarmInfo to 0 or "", which is doable, but not as nice as just not having an AlarmInfo instance in the first place).

With errors being a list, it's just a simple check for len(errors) == 0 -> no active errors.

Same for alarms.

re: CauseRemedyPair.msg: I'm not sure we should include this. I'm not happy about the name, and I believe at some point we could require users to lookup information about alarms & errors in the available (off-line) documentation.

re: AlarmInfo.msg: all these string fields will make handling memory and message initialisation somewhat more complex than just a couple of primitives, but it's probably worth it.

gavanderhoorn commented 12 months ago

Skeleton: https://github.com/Yaskawa-Global/motoros2/compare/main...gavanderhoorn:motoros2:get_active_alarm_info.

gavanderhoorn commented 12 months ago

@ted-miller: would you have any recommendations as to which M+ APIs to use to retrieve the data we'd need for this service?

ted-miller commented 11 months ago

Unfortunately, there aren't any public API's available for retrieving the alarm message. We may be able to obfuscate something in the platform lib. But I'm hesitant because even I don't have any documentation on that functionality. Additionally, the approach I'm considering would not work on DX2/FS.

There is no API at all for getting the cause-remedy info.

There is a possible alternate approach. The alarm message and also the cause-remedy info are both stored in a csv file on the standard pendant. We could potentially open an FTP connection to the pendant and fetch that file. However, the Smart Pendant uses an encrypted FTP connection (I have no idea how that works).

Something else to consider is the fact that none of the MotoROS2 alarms have cause-remedy info stored on the controller.

gavanderhoorn commented 11 months ago

Skeleton: main...gavanderhoorn:motoros2:get_active_alarm_info.

it's slightly less skeleton-y now.

Needs an updated M+ libmicroros with the GetActiveAlarmInfo service definition. We have a build artefact available in the libmicroros builder in case you're interested @ted-miller.

Edit: btw, the cause/remedy parts are commented at the moment.

gavanderhoorn commented 11 months ago

Unfortunately, there aren't any public API's available for retrieving the alarm message. We may be able to obfuscate something in the platform lib. But I'm hesitant because even I don't have any documentation on that functionality. Additionally, the approach I'm considering would not work on DX2/FS.

hm.

I'll take a look a later to see whether I can find something for those latter two.

There is no API at all for getting the cause-remedy info.

Other than reading the .csvs you mention below, right?

There is a possible alternate approach. The alarm message and also the cause-remedy info are both stored in a csv file on the standard pendant. We could potentially open an FTP connection to the pendant and fetch that file. However, the Smart Pendant uses an encrypted FTP connection (I have no idea how that works).

I don't believe there's support for that (ie: SFTP) in M+ (or anything below that).

The .csvs are already on the controller's file system, are they not?

Something else to consider is the fact that none of the MotoROS2 alarms have cause-remedy info stored on the controller.

well, that's somebody-else's problem ;)

ted-miller commented 11 months ago

There is no API at all for getting the cause-remedy info.

Other than reading the .csvs you mention below, right?

Yeah. I don't consider parsing a csv to be an official "api".

I don't believe there's support for that (ie: SFTP) in M+ (or anything below that)

No. I've implemented regular FTP in M+ before. But not SFTP.

The .csvs are already on the controller's file system, are they not?

Very bad things happen when you touch that file system. I will not access that SD card.

gavanderhoorn commented 10 months ago

After some deliberation and thinking about it more, my new proposal would be as follows (I've removed some of the comments to reduce verbosity):

Click to expand

```yaml # empty request --- uint32 result_code string message # Each entry in this list provides detailed information about all currently # active alarms. If this list is empty, there are no active alarms. # # Note: order of entries in this list does not encode for any specific severity # or priority of active alarms. motoros2_interfaces/AlarmInfo[] alarms # Alarm Number uint32 number # Alarm Name/Message string name # Sub Code int32 sub_code # reserved for future use: contents (from CSV) string contents # reserved for future use: meaning (from CSV) string meaning # reserved for future use: potential causes and their potential remedies (from CSV) motoros2_interfaces/AlarmCauseRemedy[] help # Cause string cause # Remedy string remedy motoros2_interfaces/AlarmInfo[] errors # identical to 'alarms' field above ... ```

I'd like to make progress and implement just the minimal <code, name, sub_code>-returning version of the service.

That would allow users access to error/alarm information while we can take some time to figure out how to implement the additional functionality (ie: retrieve more detailed error/alarm descriptions and information about possible causes and remedies).

The contents, description and help fields would just be empty/not set in the initial implementation.

Note: I've removed the notes field from AlarmCauseRemedy as I don't believe it adds a lot of value. In the .csvs I've checked, it seems to contain an identifier for a 'category' or 'subsystem' responsible for raising the alarm. This could be valuable information, but some of the .csvs appear to have Japanese text in that column for some rows, which would complicate storing it in a ROS msg (it's possible though would need a different field type).

@ted-miller: if you agree with this proposal, I'll update Yaskawa-Global/motoros2_interfaces#9 and start on finalising the discussed functionality.

gavanderhoorn commented 10 months ago

Something else to consider: I find the name of the name field rather confusing. The .csvs call it "Alarm Name/Message" and the latter seems more logical to me.

I always have to remember that "name == message".

ted-miller commented 10 months ago

I find the root.message to be unintuitive. My first thought is that this should be the alarm message. I get why it's there... just maybe rename it to serviceresult or something similar-ish.

I always have to remember that "name == message"

I would suggest renaming the name field to message.

reserved for future use: contents (from CSV) reserved for future use: meaning (from CSV)

I've never had a clear understanding of the distinction between these fields. I'd suggest combining into one. We can just concat the contents of the csv file.

motoros2_interfaces/AlarmInfo[] errors identical to 'alarms' field above

There is no csv file for error information. I think these should just be a number and message. (I'm not sure whether errors have subcodes.)

gavanderhoorn commented 10 months ago

I find the root.message to be unintuitive. My first thought is that this should be the alarm message. I get why it's there... just maybe rename it to serviceresult or something similar-ish.

yeah, so the thing is we have this pair of fields (result_code, message) in some of our other service definitions (all our IO services fi). I'd like to see whether we can keep things consistent across different services.

I'd be OK with result_message (as the complement of result_code).

But then I'd also like to rename the field in other definitions.

I always have to remember that "name == message"

I would suggest renaming the name field to message.

:+1:

reserved for future use: contents (from CSV) reserved for future use: meaning (from CSV)

I've never had a clear understanding of the distinction between these fields. I'd suggest combining into one. We can just concat the contents of the csv file.

hm. Ok. I'll take a look whether that works in general. There's quite some variation in those .csvs.

motoros2_interfaces/AlarmInfo[] errors identical to 'alarms' field above

There is no csv file for error information. I think these should just be a number and message.

I liked being able to reuse the same AlarmInfo struct for both.

I'll prototype an ErrorInfo type.

I'm not sure whether errors have subcodes.

the internal API definitely suggests they do.

gavanderhoorn commented 10 months ago

@ted-miller: I've updated the message definitions in https://github.com/Yaskawa-Global/motoros2_interfaces/pull/9.

Only thing I haven't done I realise now is:

reserved for future use: contents (from CSV) reserved for future use: meaning (from CSV)

I've never had a clear understanding of the distinction between these fields. I'd suggest combining into one. We can just concat the contents of the csv file.

We could still decide to do that.

I'll take a look at the .csvs to see whether it would indeed make sense / work.

ted-miller commented 10 months ago

I've never had a clear understanding of the distinction between these fields. I'd suggest combining into one. We can just concat the contents of the csv file.

Ignore that comment. I was thinking of something else.

They should be separate.

Yaskawa-Global / motoros2

Add service allowing to retrieve (active) error & alarm details #14