A dummy transaction token - useful for debugging

ashayraut commented 3 months ago

There is no concept of a placeholder or temporary token here. Sometimes the call chain breaks, i.e. some service in the chain doesn't propagate the token in the header, either due to a bug or an unexpected code change. In such a case, the service that didn't receive a token from its caller can generate an unsigned dummy token containing information about that caller (the one that failed to send a token) and pass it to the next service. So if A -> B -> C -> D and B fails to forward the token header, then C generates a dummy token with info about B and passes it to D. This increases visibility for debugging. The service at the end will find that it's not a real token, but it can audit which service broke the call chain and can involve B directly to debug. An unsigned dummy token cannot be trusted, hence it MUST NOT be used for making authorization decisions and MUST be used only for auditing/logging.
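A minimal sketch of the A -> B -> C -> D case described above, from C's point of view. The `Txn-Token` header name, the `alg: none` header, the `typ` value, and the claim names (`dummy`, `missing_from`, `generated_by`) are assumptions made for illustration; nothing in this thread or the spec defines a dummy token format.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT-style tokens do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_dummy_token(missing_caller: str, generated_by: str) -> str:
    """Build an unsigned, audit-only token (hypothetical format)."""
    header = {"alg": "none", "typ": "dummy-txn-token"}  # assumed values
    claims = {
        "dummy": True,                   # marks the token as untrusted
        "missing_from": missing_caller,  # the caller that sent no token
        "generated_by": generated_by,    # the service filling the gap
        "iat": int(time.time()),
    }
    # The empty third segment means there is no signature.
    return ".".join(
        [b64url(json.dumps(header).encode()), b64url(json.dumps(claims).encode()), ""]
    )


def outgoing_headers(incoming: dict, caller: str, me: str) -> dict:
    """Forward the received token, or a dummy one if the caller sent none."""
    token = incoming.get("Txn-Token")
    if token is None:
        token = make_dummy_token(caller, me)
    return {"Txn-Token": token}
```

Here C would call `outgoing_headers(request_headers, "service-b", "service-c")` before calling D; D can then log `missing_from` to see where propagation broke, without treating any claim as authoritative.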

arndt-s commented 3 months ago

This is awesome feedback!

Some questions that come into my mind:

I suppose C will only generate the dummy token when it doesn't need to validate it or require specific caller information from A or B that it cannot securely extract from the request. Otherwise the call from B -> C would result in an authn/z error and fail - no need for a replacement token, is there?

If B can generate dummy tokens, how can it be prevented from generating a dummy token that looks like a legit token and impersonating users this way? E.g. B is compromised and generates dummy tokens to request information from other workloads that in fact accept dummy tokens?

Maybe this question is leading to the next one:

How are dummy tokens distinguished from real ones? Is there a risk that developers mistakenly treat dummy tokens as trusted?

ashayraut commented 3 months ago

Otherwise the call from B -> C would result in an authn/z error and fail - no need for a replacement token, is there?

Yes. If C is validating the token, then it can deny the request. If C is more of an intermediate passthrough-type service, then C will rely on D to deny/allow, in which case B has to send a token.

How are dummy tokens distinguished from real ones? Is there a risk that developers mistakenly treat dummy tokens as trusted?

Yes, that is the most important aspect. In general, as Hyrum's law suggests, once a token is out there, we won't be able to control who uses it for what purpose. Our recommendation can be that a token validator (e.g. Service D) MUST use the claims in a token ONLY IF a valid signature exists. That way, even if Service B signs the token itself, D must deny the request on the grounds that Service B isn't a valid Tx Token service. Anyway, even if we don't specify the idea of a dummy token here, someone (Service B or C) can potentially overwrite the header value with its own Tx token string.
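The validator rule above can be sketched as two separate code paths: one that returns claims for authorization only when the signature verifies against the trusted issuer's key, and one that decodes claims for logging only. This is an illustration under stated assumptions: HMAC-SHA256 with a shared key stands in for whatever signature scheme the real Tx Token service uses, and `sign_token`, `claims_for_audit`, and `claims_for_authorization` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that base64url tokens omit.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Stand-in for the Tx Token service issuing a signed token."""
    header = _b64url(json.dumps({"alg": "HS256"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def claims_for_audit(token: str) -> Optional[dict]:
    """Decode claims for logging only; never use them to authorize."""
    try:
        _header, payload, _sig = token.split(".")
        return json.loads(_b64url_decode(payload))
    except ValueError:  # malformed structure, base64, or JSON
        return None


def claims_for_authorization(token: str, trusted_key: bytes) -> Optional[dict]:
    """Return claims only if the signature matches the trusted issuer key."""
    try:
        header, payload, sig = token.split(".")
        presented = _b64url_decode(sig)
    except ValueError:
        return None
    expected = hmac.new(
        trusted_key, f"{header}.{payload}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, presented):
        return None  # unsigned, dummy, or signed by someone else
    return claims_for_audit(token)
```

With this split, a token that B signs with its own key, or an unsigned dummy token, yields `None` from `claims_for_authorization` while still being readable through `claims_for_audit`.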

obfuscoder commented 3 months ago

We are actually facing this issue in our current deployments. Although we have some workloads that already receive, validate and forward TraTs, they are called by other workloads that do not present TraTs yet, and on the other end they call workloads that just ignore the TraTs and don't pass them along yet. So during this rollout phase (which can take months or even years), TraTs frequently get lost and no workload can actually enforce TraT validation.

However, we do not think that any workload should be able to create its own dummy TraT just to fill the gap. We think that only the TraT Service should have the authority to issue TraTs, even for those cases where gaps need to be filled. This scenario is similar to presenting self-signed tokens to the TraT Service, which then issues TraTs to the requestors. The TraT Service decides which clients are allowed to present self-signed tokens. As deployment progresses, the number of allowed clients will be reduced.
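The allow-list idea above could be modeled roughly as follows. This is only a sketch: `GapFillPolicy` and its method names are hypothetical, and a real TraT Service would presumably authenticate the client before consulting any such list.

```python
class GapFillPolicy:
    """Tracks which clients may still present self-signed tokens (hypothetical)."""

    def __init__(self, allowed_clients):
        self._allowed = set(allowed_clients)

    def may_present_self_signed(self, client_id: str) -> bool:
        # Only listed clients get a TraT issued in exchange for a self-signed token.
        return client_id in self._allowed

    def revoke(self, client_id: str) -> None:
        # Called once a client's callers propagate TraTs properly.
        self._allowed.discard(client_id)
```

As rollout progresses, operators would call `revoke` for each client whose upstream callers now forward TraTs, shrinking the list toward empty.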

ashayraut commented 3 months ago

Requesting a new TraTs token from the issuer always makes sense. The idea of a dummy token has some value:

For example, not every service in a call graph, especially one where thousands of services are involved, needs to connect with TraTs. That keeps things simple. The latency profile of a request also stays predictable; otherwise, the latency profile could change suddenly because propagation broke and resulted in remote calls.

I ack that we can keep our RFC simple: suggest that if an intermediate service doesn't receive a token from its caller, it can request a new one. However, such a token may not have exactly the claims the caller might have sent, and hence the response from the validating party can differ.

On a separate note, dummy tokens helped us because when you want to do something like TraTs and not everyone is propagating tokens, you need a way to identify who is not doing it and ask them to propagate. The solution to this identification problem has to be cheap; otherwise thousands of services will request tokens from the Tx token service and brown it out.

tulshi commented 3 months ago

At IETF 120, @gffletch, @PieterKas and I discussed this issue. Our collective opinion is that since this is related to debugging, we should not address it in the TraTs spec.

We talked about having a separate document that describes best practices for rolling out TraTs gradually.

ashayraut commented 3 months ago

A separate document should work fine too. That will give us the opportunity to elaborate more.

oauth-wg / oauth-transaction-tokens

A dummy transaction token - useful for debugging #108