esi / esi-issues

Issue tracking and feature requests for ESI
https://esi.evetech.net/

ESI Response Provenance #1289

Open cjslep opened 2 years ago

cjslep commented 2 years ago

Feature Request

Provide a way for ESI responses to be verifiable as originating from CCP's ESI Servers. This solves the provenance problem.

Why solve the provenance problem? This requires lengthy background. I am working on https://github.com/cjslep/dharma, which is federated corporation management software. Federation lowers the barrier for different power structure dynamics to come into existence. Non-federated software is well understood in the context of existing alliances and their software tools. Federation simply allows independent, unaffiliated corporations to run the same kinds of tools, with the option of interoperating with each other's federated software to obtain ad-hoc coordination capabilities that scale to the size of existing alliance power structures, but with fundamentally different dynamics.

Federation via ActivityPub occurs today without problem in other domains because servers that serve data are their own authority: if a server obtains federated data, that server typically can make an equivalent HTTP request to the originating federated server to verify that the payload is authentic, because the mechanisms of ActivityPub allow dereferencing of ActivityStreams data. The following is illustrative and not an exact depiction of how that protocol works:

My User | My Server | Peer Server A | Peer Server B | Peer User (Server B)
------- + --------- + ------------- + ------------- + -------
   |          |              |              |           |
   |          |              |              |           |
   |          |              |              |       +--------------+
   |          |              |              |<------| Creates Post |
   |          |              |              |       +--------------+
   |          |              |              |           |
   |          |              |     +----------------+   |
   |          |              |<----| Federates Post |   |
   |          |              |     +----------------+   |
   |          |              |              |           |
   |          |     +----------------+      |           |
   |          |<----| Federates Post |      |           |
   |          |     +----------------+      |           |
   |          |              |              |           |
   | +-----------------+     |              |           |
   | | "Is This Real?" |------------------->|           |
   | +-----------------+     |              |           |
   |          |              |              |           |
   |          |              |           +-----+        |
   |          |<-------------------------| Yes |        |
   |          |              |           +-----+        |
   |   +-------------+       |              |           |
   |<--| Notify User |       |              |           |
   |   +-------------+       |              |           |

With the introduction of ESI data in this Federated context, there is now a wealth of Eve Online related data that is not represented by ActivityStreams payloads and is not transported over ActivityPub. That is OK, except now there is a provenance problem where there is no way for peer servers to verify the integrity of data originating from the ESI API. This opens all software that is Federating-in-principle to MITM attacks.

My User | My Server | Peer Server A | Peer User (Server A) | Eve Online ESI
------- + --------- + ------------- + -------------------- + --------------
   |          |              |                  |                  |
   |          |              |                  |                  |
   |          |              |              +-------+              |
   |          |              |              | Authn |------------->|
   |          |              |              | Authz |              |
   |          |              |              +-------+              |
   |          |              |                  |                  |
   |          |              |                  |              +--------+
   |          |              |<--------------------------------| Tokens |
   |          |              |                  |              +--------+
   |          |              |                  |                  |
+---------+   |              |                  |                  |
| Plan To |   |              |                  |                  |
| Build A |   |              |                  |                  |
|  Ship   |   |              |                  |                  |
+---------+   |              |                  |                  |
   |          |              |                  |                  |
+-------+     |              |                  |                  |
| Build |---->|              |                  |                  |
|  Ship |     |              |                  |                  |
+-------+     |              |                  |                  |
   |          |              |                  |                  |
   |      +--------+         |                  |                  |
   |      |  Get   |-------->|                  |                  |
   |      | Assets |         |                  |                  |
   |      +--------+         |                  |                  |
   |          |              |                  |                  |
   |          |          +--------+             |                  |
   |          |          |  Get   |--------(As This User)--------->|
   |          |          | Assets |             |                  |
   |          |          +--------+             |                  |
   |          |              |                  |                  |
   |          |              |                  |              +--------+
   |          |              |<=Data Is Unverifiable To Peers==| Assets |
   |          |              |                  |              +--------+
   |          |              |                  |                  |
   |          |        +-----------+            |                  |
   |          |        | Federates |            |                  |
   |          |<-------|   Assets  |            |                  |
   |          |        +-----------+            |                  |
   |          |              |                  |                  |
   |     +---------+         |                  |                  |
   |<----| Missing |         |                  |                  |
   |     |  Items  |         |                  |                  |
   |     +---------+         |                  |                  |
   |          |              |                  |                  |

My server cannot verify that the peer server is behaving non-maliciously about the ESI data retrieved on behalf of its users.

I Am Not Asking For ESI To Be ActivityPub-Compatible. There are plenty of protocols that federate that are not ActivityPub. I am narrowly interested in solutions that tackle integrity & authenticity (ex: HMACs) that can be included in ESI responses. Anyone can then verify the integrity and authenticity of an ESI payload locally (without an API call to ESI) and assure themselves that peer software is not maliciously (or otherwise) serving untruthful data.

There are many Do's and Don'ts in this space. Not all solutions are perfect and even HTTP Signatures is not yet standardized.

Use case

Any third-party software that is federated as a principle.

Authentication

No, the addition of digests or similar solutions does not impact authentication. It does not impact scope. It doesn't even impact caching: a cached response is the same bytes every time it is served, so the digest passed into an HMAC is the same, and the HMAC deterministically produces the same output bytes.

There will be a very slight increase in:
1) Processing power to compute digests, but only when a cache miss is encountered.
2) Memory required for caches, as the HMAC result is also stored.
3) Processing power for serving responses, as a few more bytes must be written out per response.
4) Wire size, as a few more bytes are sent with every response.
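The cache-miss point can be sketched as follows. This is a minimal illustration only, not how ESI's caching works: the cache layout, the `fetch_from_backend` stand-in, and the example path are all assumptions made up for the sketch. It shows the digest being computed once, on the miss, and stored next to the body.

```python
import hashlib

# Hypothetical in-memory cache: path -> (body bytes, hex digest).
_cache: dict[str, tuple[bytes, str]] = {}
digest_computations = 0  # counts how often a digest is actually computed


def fetch_from_backend(path: str) -> bytes:
    # Stand-in for whatever produces the real response body (an assumption).
    return b'{"path": "' + path.encode() + b'"}'


def get_response(path: str) -> tuple[bytes, str]:
    """Return (body, digest), hashing the body only on a cache miss."""
    global digest_computations
    if path not in _cache:
        body = fetch_from_backend(path)
        digest_computations += 1
        _cache[path] = (body, hashlib.sha256(body).hexdigest())
    return _cache[path]


first = get_response("/characters/1/assets/")   # miss: digest computed
second = get_response("/characters/1/assets/")  # hit: digest reused
```

The extra memory cost is the stored digest string per cache entry, which matches item 2 in the list above.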

Most of these can be mitigated if response-signing is made optional, for example if the API only signs a response when a special query parameter is set.

Example return

Depends on the specific HMAC implementation and design choices for the API.

Checklist

Check all boxes that apply to this issue:

I did not mark "Use case exists" because my software is not yet ready for use in the wild, and it is honestly a catch-22: if this is never tackled, only very few people (if any) will ever consider using software that is Federated in principle due to the MITM attack. If this is tackled, then it will unblock the entire field as viable.

jowrjowr commented 2 years ago

All of this is already done, is it not?

The JWTs returned by the SSO API are signed by CCP and the public component of the signing key is public. It is documented here.

cjslep commented 2 years ago

That JWT indeed proves provenance for the SSO API. It proves that the authorization & refresh tokens received:

1) Have integrity preserved (no one manipulated the token data)
2) Have originated from CCP's authority

What is missing is this exact same kind of proof for all other ESI APIs' JSON response payloads (characters, assets, skills, etc). What you point out is solely for the SSO API.

EDIT: To illustrate the difference, the second diagram in my OP has 2 boxes along the "Eve Online ESI" track: Tokens and Assets. Note that Tokens is just a simple line (<--) because, you are correct, the issue of provenance is already solved there. One caveat: the JWT for the SSO API addresses a different threat model than Federation does, because servers should never share SSO tokens with each other, and that includes sharing via Federation. Assets, on the other hand, has no provenance, so its line is instead marked "unverifiable" in the Federation analogy.

kennethjor commented 2 years ago

I don't have huge insight into this, but I just want to say that this sounds like a really cool and interesting idea.

The article you linked specifically says don't use async crypto unless you have to, but I feel like this is just a better solution? The ESI has a lot of data on it. For instance, I scrape the static data on the ESI daily. Calling the ESI itself to verify the entire dataset would result in literally millions of requests and take hours to complete. If it was async, it could all be done locally.

If this was implemented as part of an HTTP header, it would require no changes to existing applications which don't care. For my scrapes, incorporating those signatures would require only small changes as well. I'd have to change my formats to make sure I'm storing the JSON verbatim to facilitate signature verification, but that'd also be relatively easy to do.
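A small sketch of why the JSON would need to be stored verbatim. The payload below is made up for illustration. Parsing and re-serializing keeps the data identical but changes the bytes, and any digest or signature computed over the original bytes would no longer match.

```python
import hashlib
import json

# Bytes exactly as received (a made-up payload, not a real ESI response).
raw = b'{"type_id": 587, "quantity": 1}'

# Round-tripping through a parser with different whitespace settings
# yields the same data but a different byte sequence.
reserialized = json.dumps(json.loads(raw), separators=(",", ":")).encode()

raw_digest = hashlib.sha256(raw).hexdigest()
reserialized_digest = hashlib.sha256(reserialized).hexdigest()

same_data = json.loads(raw) == json.loads(reserialized)   # True
same_bytes = raw == reserialized                           # False
digests_match = raw_digest == reserialized_digest          # False
```

Keeping the original bytes alongside any parsed form avoids having to agree on a canonical JSON serialization.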

cjslep commented 2 years ago

Yes, and I should frame that specific article better. I included it to illustrate the challenges that face solutions in this general problem space of integrity+authority. I don't mean for the specifics of that article to dictate the solution space for this particular issue, especially since that article is focused on signing API Requests, not API responses. I have been interpreting you to mean "asymmetric crypto" and not "async crypto", let me know if I'm making a bad assumption. Symmetric crypto is easier/simpler but requires the problem to have certain characteristics (ex: an AWS API where a paying developer can privately share the symmetric key w/ the company for verifying integrity and authority), and I don't think those characteristics apply here. You are right to point this out, and thanks for doing so!
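To make the symmetric limitation concrete, here is a minimal sketch using Python's standard `hmac` module. The key and payloads are hypothetical. Verifying an HMAC requires the same secret that created it, so any peer able to verify could also produce a valid tag for altered data.

```python
import hashlib
import hmac

# Hypothetical secret. With a symmetric scheme every verifier must hold it.
shared_key = b"hypothetical-shared-secret"


def tag(payload: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, claimed_tag: str, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag(payload, key), claimed_tag)


original = b'{"quantity": 10}'
original_tag = tag(original, shared_key)

# Tampering without recomputing the tag is detected.
tampered = b'{"quantity": 0}'
tamper_detected = not verify(tampered, original_tag, shared_key)

# But a peer that holds the key can forge a tag for the tampered data,
# and other peers cannot tell it apart from a genuine one.
forged_tag = tag(tampered, shared_key)
forgery_accepted = verify(tampered, forged_tag, shared_key)
```

With an asymmetric scheme only the holder of the private key can sign, while anyone with the public key can verify, which is the property a federated setting needs.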

To build off of your suggestion, HTTP Signatures is a scheme very similar to what you propose. However, it solves a more fine-grained problem: "on a server representing the identity of N entities, how can an arbitrary Response from it be verified against one of those entities as authoritative (and not the other N-1 entities)?".

The gist of HTTP Signatures is:

0) Have an asymmetric public key available and shared (how to do this is not in scope).
1) The server hosting the authoritative user creates a deterministic, canonicalized "signature string". This is based solely on HTTP request/response headers, and the software at the other end of the connection is able to reproduce this upon receiving the HTTP request/response.
2) The "signature string" is then signed with the asymmetric private key, and put into the Authorization or Signature HTTP header.
3) The software that received the request/response can then re-create the "signature string" and verify its authority using the asymmetric public key. No extra requests/responses needed.

This is just the authority mechanism. The integrity guarantee comes from including the Digest header in the "signature string", so that the hash of the body of an HTTP request/response is in the HTTP header and is therefore also a part of the signing process.
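The signature-string and Digest steps can be sketched roughly as below. This is illustrative only: the "name: value" line layout loosely follows the HTTP Signatures drafts rather than any finalized specification, the helper names, header values, and body are made up, and the signing step itself is left out because it needs an asymmetric-crypto library.

```python
import base64
import hashlib


def digest_header(body: bytes) -> str:
    """Digest header value: base64 of the SHA-256 hash of the body."""
    hashed = hashlib.sha256(body).digest()
    return "SHA-256=" + base64.b64encode(hashed).decode("ascii")


def signature_string(headers: dict[str, str], signed: list[str]) -> str:
    """Join the chosen headers as lowercase 'name: value' lines, in order."""
    lower = {name.lower(): value.strip() for name, value in headers.items()}
    return "\n".join(f"{name.lower()}: {lower[name.lower()]}" for name in signed)


# Made-up response body and headers for illustration.
body = b'[{"item_id": 1, "type_id": 587}]'
headers = {
    "Date": "Tue, 07 Jun 2022 20:51:35 GMT",
    "Content-Type": "application/json",
    "Digest": digest_header(body),
}

# Both sides can rebuild this string from the headers alone. The sender
# would sign it with the private key, the receiver verify with the public key.
to_sign = signature_string(headers, ["date", "digest"])
```

Because the Digest value is one of the signed lines, changing a single byte of the body changes the string to be signed, which is where the integrity guarantee comes from.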

I know I spent a lot of time talking about HTTP Signatures but I would not recommend it as a solution to this particular problem for the following reasons:

0) HTTP Signatures is solving a different problem ("how to maintain provenance of N request/responses to/from 1 server w/ M identities"). CCP Games is the sole authority of the ESI API, so there is only a 1-to-1 mapping.
1) HTTP Signatures does not address the federated problem space. It is meant to solve point-to-point request/response between two servers only.

It is not hard to imagine solutions that could work and that are inspired by HTTP Signatures. However, great thought and care is needed to understand the ramifications and the exact guarantees being made.

An alternative solution space to this problem is treating data as content-addressable, as the content-addressable nature of a payload means that manipulating the content (integrity violation) necessarily results in a totally different content-address, which is readily apparent in a federated context. And content-addresses may also be signed to lend authority, and therefore address the provenance problem. However, those solutions tend to be more exotic in nature, so I only mention them briefly here.
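A minimal sketch of the content-addressing idea, assuming SHA-256 as the address function and a plain dictionary as the store (both are choices made for illustration, not a proposal for ESI). The address is derived from the bytes, so altered content no longer matches the address it is filed under.

```python
import hashlib

# Hypothetical store keyed by content-address.
store: dict[str, bytes] = {}


def content_address(payload: bytes) -> str:
    """Derive the address from the payload bytes themselves."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def put(payload: bytes) -> str:
    """Store the payload under its content-address and return the address."""
    address = content_address(payload)
    store[address] = payload
    return address


def get_verified(address: str) -> bytes:
    """Return the payload only if it still hashes to its address."""
    payload = store[address]
    if content_address(payload) != address:
        raise ValueError("payload does not match its content-address")
    return payload


address = put(b'{"quantity": 10}')
untampered = get_verified(address)

# Simulate a peer altering the content while keeping the old address.
store[address] = b'{"quantity": 0}'
```

After the simulated tampering, `get_verified(address)` raises `ValueError`. Signing the address, rather than the full payload, would then add the authority part.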

Blacksmoke16 commented 2 years ago

Not to be the bearer of bad news, but while this may be a cool feature to have, it's unlikely to be implemented within any reasonable amount of time, if ever. I.e. there are actual bugs and more impactful feature requests that have been hanging out for years.

Happy to keep this open, but I wouldn't get your hopes up and would probably go to plan B if you had plans that required it.

For reference: https://github.com/esi/esi-issues/issues/1225#issuecomment-670652097

cjslep commented 2 years ago

From a timing perspective, I'm in no rush. And from my planning perspective, this was never Plan A, more like Plan C.

A) When displaying data, provide ample warning in the UI about the unverifiableness of data coming from peers (and derived calculations).
B) For those that need stronger guarantees, an ActivityPub gateway/proxying server that provides a limited set of guarantees.
C) ESI support.

Happy to have at least started the conversation and put this on the radar.