decentralized-identity / didcomm-messaging

https://identity.foundation/didcomm-messaging/spec/
Apache License 2.0
165 stars 57 forks source link

Can we define a standard error format? #83

Closed kdenhartog closed 2 years ago

kdenhartog commented 4 years ago

In DIDCommV1 we defined a semi-standard format for error reporting in RFC 0035. I think having a standard format would be useful in the cases where we can send errors back. This would help in creating a deterministic path for fail cases by adding explicit responses when the recipient wishes to.

What I'm thinking is that we do something like this:

{
    "id":"urn:uuid:ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
    "type":"https://didcomm.org/generic_error",
    "from":"did:example:recipient1",
    "created_time":1516269022,
    "body":{
         "code": 403,
         "details": "Unable to validate authenticate origin of message",
         "messageId": <original-message-that-caused-problem>
     }"
}
OR13 commented 4 years ago

please god.... but before doing so, I think it's wise to add more reasons to the list:

Basically, distributed systems, logging and debugging is hard (i think we all know this).

There are a bunch of commercial products that handle this today, pretty much every cloud vendor has their own version, and then there are specific kinds for Kafka, K8s, etc....

Also... error logging is a huge security issue... especially depending on the content of the error message... so while it feels nice to have a standard format.... it feels bad (man) when its used to publish sensitive failures to centralized infrastructure...

I'm not sure what the best approach for didcomm would be... but it feels like the kind of thing, we might never get right... so at the very least, it should be versioned....

OR13 commented 4 years ago

one thing is certain... expect logging to happen, and expect people to dump those logs into existing systems... without doing much in the way of sanitization...

TelegramSam commented 4 years ago

Errors are on deck for Monday meeting discussion.

dhh1128 commented 3 years ago

I have created PR #232 , which partially addresses this. Maybe.

I kept the name "problem report" instead of the name "error" (there's a note about why; I could be talked into changing this to "error" if need be). I made the structure much simpler that the old "problem report" thing from DIDComm v1, per Kyle's start above. For the concepts I kept, I also tried to describe them in way less words than we did before.

TelegramSam commented 2 years ago

Closed after discussion in WG 20211101

tmarkovski commented 2 years ago

I just found out about this issue on the DIDComm call. A standard error format can be adopted based on RFC 7807 Problem Details which may bring the DIDComm spec to a bit more interop with other ecosystems. Problem Details is widely used.

kdenhartog commented 2 years ago

I like the idea of going with RFC7807 here, but wasn't aware of it before now. Thanks for raising it @tmarkovski

@dhh1128 what's your thoughts on reusing this instead of the structure we were considering?

dhh1128 commented 2 years ago

The fact that none of us have heard of RFC 7807 until now begs the question:

How widely supported is it?

Just because an RFC exists doesn't mean we should try to use it.

If it is widely supported, then supporting it might be worth exploring. If it is not, then I think we should keep what we have; I think doing otherwise would just be needless tinkering.

I did a quick web search. I find that ASP.NET 2.1 core supports RFC 7807, and so does Zend in PHP. There is ticket to add it in Spring that's been open for more than a year, that has some good commentary. Maybe others can help quantify this better.

Assuming we conclude that it's interesting due to interop potential, then we need to decide how to make it work. The fundamental issue is that an HTTP problem details structure is designed to be characterized by an HTTP status code, and has an internal structure that conflicts with DIDComm messages. DIDComm messages don't always flow over HTTP, so we'd be applying an HTTP standard in contexts that don't fit that transport very well; a bit of a square peg in round hole. Not a deal breaker, but worth pondering... The type field is an interesting overlap and resonance (easy to use and get that working because it means the same thing as DIDComm expects) -- but the big issue is DIDComm v2 expects a body with structure that varies by type -- whereas RFC 7807 wants structure in fields that are siblings to type. That is: what DIDComm v2 would call "headers."

I don't think these are unresolveable problems. We could work on this if there's a will to do so. But we shouldn't lose key elements of our design just to make this somewhat awkward adaptation work.

dhh1128 commented 2 years ago

I can't find any evidence that Splunk understands RFC 7807. I can't find any evidence that big CDNs like CloudFlare or Akamai support it, either. This 3-year-old blog post touts the RFC, but points out that Google, Facebook, and Spotify all fail to support it in their REST APIs: https://blog.restcase.com/rest-api-error-handling-problem-details-response/

Maybe there's been progress on this lately?

dhh1128 commented 2 years ago

This comment from a developer at Reddit, explaining why they haven't felt motivated to support the RFC, is another data point: https://www.reddit.com/r/api/comments/hze1i2/do_you_folks_use_rfc_7807_problem_details_in_http/

kdenhartog commented 2 years ago

Yeah seems like it's not as adopted as originally intended. I'm good with not going forward with straight adoptions of this RFC then, and if I get some time to read through it properly and see if there's some improvements we could garner from their designs than I'll open a separate issue or PR to address those additions. Otherwise, you've convinced me that the change likely isn't worth the effort. Thanks for doing the due diligence on this to investigate if it's worth using.