Questions regarding UCAN findings

Hi @fabricedesre!

I wanted to ask about some of your UCAN findings and start a conversation. I hope this is a good place to do so:

Each one is associated with a human readable label to allow easy selection in UI dialogs. The system also creates a default "superuser" DID that can be used by core apps to avoid prompts in these apps: by using the OS you implicitly trust its core apps.

I’m a bit confused here: is each DID a rough equivalent of a user in a Unix system?

To me it seems that core apps having access to a “superuser” DID is probably not ideal; maybe instead they ought to just be allowed to skip user prompts? That would potentially allow the user to revoke core apps' access, e.g. by going to “system settings”. I also imagine not all core apps need access to all the resources, so having separate DIDs / UCANs there might also be a good way to lock that down.

However nothing prevents a token created when browsing site_a.com to be transmitted to another origin and presented by site_b.com. This is quite a departure from the usual web model, even if somewhat expected for a distributed system with a different trust model. Informing users of edge cases looks like something useful: we could bind tokens to the origin they've been granted from, and when reused from a different origin prompt the user: "You granted access to /pictures/ to site_a.com, but site_b.com is requesting to use that permission. [ok] [hell no]".
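A minimal TypeScript sketch of the origin-binding idea described above, assuming the system keeps a record of the origin each token was granted to, keyed on the token signature. The names `Grant`, `checkOrigin`, and `promptText` are hypothetical and are not part of Capyloon or the UCAN spec.

```typescript
// Hypothetical record stored by the system when the user grants a token.
interface Grant {
  tokenSignature: string; // identifies the token
  grantedOrigin: string;  // origin the user was looking at when granting
  resource: string;       // e.g. "/pictures/"
}

type OriginDecision = "allow" | "prompt" | "deny";

// Decide what to do when `presentingOrigin` presents a token.
function checkOrigin(
  grants: Map<string, Grant>,
  tokenSignature: string,
  presentingOrigin: string
): OriginDecision {
  const grant = grants.get(tokenSignature);
  if (!grant) return "deny"; // unknown token: never granted on this device
  return grant.grantedOrigin === presentingOrigin ? "allow" : "prompt";
}

// Text for the dialog shown in the "prompt" case.
function promptText(grant: Grant, presentingOrigin: string): string {
  return (
    `You granted access to ${grant.resource} to ${grant.grantedOrigin}, ` +
    `but ${presentingOrigin} is requesting to use that permission.`
  );
}
```

The sketch only covers the decision; how the presenting origin is determined reliably is left to the platform.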

I would argue this may not be that different from what you can do on the web platform, specifically from site_b.com embedding site_a.com in a hidden iframe and using message passing to invoke capabilities.

I think what might help empower the user is capturing all the UCANs in rotation in some “system access” interface so the user is always able to revoke access. This way user participation can be more indirect: instead of prompting every time cross-origin delegation occurs, the user can be shown a more passive status indicator that they can act upon (Apple does that sort of thing when geolocation, microphone, webcam etc. is used).

In such a “system access” interface it might be useful to allow user to e.g. forbid certain origins from gaining access to certain capabilities, even if they were delegated.
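A rough sketch of such a per-origin rule, assuming capabilities are shaped like the `with` / `can` pairs used by UCAN. The `DenyRule` type and both functions are hypothetical illustrations, not an existing API.

```typescript
// UCAN-style capability: a resource (`with`) and an ability (`can`).
interface Capability {
  with: string;
  can: string;
}

// Hypothetical user-defined rule: this origin may never use this ability.
interface DenyRule {
  origin: string;
  can: string;
}

function isForbidden(rules: DenyRule[], origin: string, cap: Capability): boolean {
  return rules.some((r) => r.origin === origin && r.can === cap.can);
}

// Drop the capabilities the user has forbidden for `origin`,
// even if a valid delegation for them exists.
function allowedCapabilities(
  rules: DenyRule[],
  origin: string,
  caps: Capability[]
): Capability[] {
  return caps.filter((c) => !isForbidden(rules, origin, c));
}
```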

The underlying issue is that once created, a leaked or stolen token can be reused. Again, this is part of the UCAN design, and it's useful in some use cases like ticketing, but seems problematic when granting permissions.

I think this might be misleading. Leaked and stolen UCANs cannot be exercised without also stealing the private keys associated with them. For that reason most web uses of UCANs go with non-extractable private keys, so they are tied to the origin and cannot be leaked.

If you mean something like smuggling of UCANs, where one site might intentionally smuggle a UCAN along with the keypair, that is certainly possible, but in that case the sites are conspiring together and both are malicious.

A possible mitigation is to limit the lifetime of the UCAN, by constraining the expiration time in the permission request dialog. This is a double-edged sword though, since a short lifetime will lead to repeated prompts (which in turn lead to users paying less attention to what is being presented). Also, the use case of a user changing their mind or making a mistake needs to be accounted for, as well as the "grant for this session only" case.

I think it is worth considering short-lived UCANs that could be renewed without user prompts within a certain timeframe, something along the lines of what browsers are already doing with storage for sites you visit only indirectly. E.g. if the user is not actively using a site, it’s probably better to prompt the user on UCAN renewal; if the user is actively using a site to which they’ve granted capabilities, it’s probably better to renew capabilities without interfering with the user's flow, especially if the user can always revoke the site's capabilities.
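One way to picture that renewal policy is the sketch below. The activity window and the function names are assumptions for illustration; the only UCAN detail relied on is that expiration (`exp`) is a Unix timestamp in seconds.

```typescript
type RenewalDecision = "renew-silently" | "prompt-user";

// Renew without a prompt only if the site was used recently enough.
// `activeWindowSec` is a hypothetical policy knob, not a UCAN field.
function decideRenewal(
  lastUsedSec: number,
  nowSec: number,
  activeWindowSec: number
): RenewalDecision {
  return nowSec - lastUsedSec <= activeWindowSec ? "renew-silently" : "prompt-user";
}

// Expiration for the renewed, short-lived token.
function nextExpiry(nowSec: number, lifetimeSec: number): number {
  return nowSec + lifetimeSec;
}
```

A revocation check would still run before any silent renewal, so a user who revoked the site's capabilities is never bypassed.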

The solution being implemented in Capyloon is a device-local blocking mechanism, keyed on the token signature. That requires DIDs and UCANs to be synchronized among devices in multi-device scenarios.
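A blocklist keyed on the token signature could look roughly like the sketch below. It assumes JWT-encoded UCANs, where the signature is the third dot-separated segment; `TokenBlocklist` and its methods are hypothetical and do not describe the actual Capyloon code. The sync transport between devices is out of scope.

```typescript
// Extract the signature segment of a JWT-encoded token.
function signatureOf(jwt: string): string {
  const parts = jwt.split(".");
  if (parts.length !== 3) throw new Error("not a JWT-encoded token");
  return parts[2];
}

class TokenBlocklist {
  private blocked = new Set<string>();

  block(signature: string): void {
    this.blocked.add(signature);
  }

  isBlocked(signature: string): boolean {
    return this.blocked.has(signature);
  }

  // Entries to send to the user's other devices.
  snapshot(): string[] {
    return Array.from(this.blocked);
  }

  // Merge entries received from another device.
  merge(signatures: string[]): void {
    for (const s of signatures) this.blocked.add(s);
  }
}
```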

I think DID private keys should never leave the device, as that creates an opportunity for keys to leak; furthermore it would make a stolen / lost device a real hazard. Each device having its own keys seems better, although it creates a UX challenge when designing an interface for revoking app / site capabilities across all devices.

Hi @Gozala ! Thanks for the comments.

I’m a bit confused here: is each DID a rough equivalent of a user in a Unix system?

Somewhat... The idea being you could create for instance different DIDs to use in different contexts on the same device - maybe personal vs. pro or even create a "guest" DID for temporary usage.

About core apps having always-on full-access, I tend to agree with you that something more granular would be better. I'll see if I can implement that by declaring requested capabilities in the app manifest. There are situations where you really don't want to prompt the user, like accessing the wallpaper, etc. so that needs to work at first run with implicit trust by default.

I would argue this may not be that different from what you can do on the web platform, specifically from site_b.com embedding site_a.com in a hidden iframe and using message passing to invoke capabilities.

Yes, it's not much worse than the current web model you describe, but we should aim at being better :)

I think what might help empower the user is capturing all the UCANs in rotation in some “system access” interface so the user is always able to revoke access. This way user participation can be more indirect: instead of prompting every time cross-origin delegation occurs, the user can be shown a more passive status indicator that they can act upon (Apple does that sort of thing when geolocation, microphone, webcam etc. is used).

In such a “system access” interface it might be useful to allow user to e.g. forbid certain origins from gaining access to certain capabilities, even if they were delegated.

Yes, I will build a management UI - I'm just a bit worried about information overload for users.

If you mean something like smuggling of UCANs, where one site might intentionally smuggle a UCAN along with the keypair, that is certainly possible, but in that case the sites are conspiring together and both are malicious.

It can also come from a breach, with no malicious intent from the site that was trusted in the first place. I don't think the attacker needs to steal keys to use the UCAN afterward, right? Of course they still need to trick the user to open the attacker site.

I think it is worth considering short-lived UCANs that could be renewed without user prompts within a certain timeframe, something along the lines of what browsers are already doing with storage for sites you visit only indirectly. E.g. if the user is not actively using a site, it’s probably better to prompt the user on UCAN renewal; if the user is actively using a site to which they’ve granted capabilities, it’s probably better to renew capabilities without interfering with the user's flow, especially if the user can always revoke the site's capabilities.

Excellent idea, automatic renewal makes a lot of sense when there is repeated usage! The fact that UCANs have baked in support for validity timeframes makes that very clean also.

I think DID private keys should never leave the device, as that creates an opportunity for keys to leak; furthermore it would make a stolen / lost device a real hazard. Each device having its own keys seems better, although it creates a UX challenge when designing an interface for revoking app / site capabilities across all devices.

I also would rather keep all private keys in non-extractable secure storage. As you note there are tradeoffs between threat model and UX here. I'm also looking at using hardware keystores such as Ledger devices, and for multi-device use to keep keys on a single device and establish a secure channel with the secondary ones. No implementation yet :)

Yes, it's not much worse than the current web model you describe, but we should aim at being better :)

Fair enough. I think it might be worth defining the better version. I assume it's better user agency which in turn may imply more transparency about things happening.

Apologies for taking this off topic: for what it's worth, I came to the conclusion that it may be better to cut apps' ability to network directly and instead allow them to create and edit networked data, where the user can decide who it is shared with, where it is stored, etc. Delta Chat seems to be experimenting with something along these lines https://delta.chat/en/2022-06-14-webxdcintro where a web app is loaded into a message thread context and is able to read / write messages in that specific thread but cannot smuggle those messages to its own private server.

It can also come from a breach, with no malicious intent from the site that was trusted in the first place. I don't think the attacker needs to steal keys to use the UCAN afterward, right? Of course they still need to trick the user to open the attacker site.

I suppose it depends on the breach. If the UCAN token of site A was stolen by site B, site B would not be able to exercise it, as the invocation UCAN would need to be signed by A's private key, otherwise it will not be valid. If somehow site B tricked site A into delegating capabilities to a DID for which B owns the private key, then yeah, B will be able to invoke the delegated capabilities until A (or whoever delegated capabilities to A) revokes them.

It is also worth noting that transferring UCANs to a server (which could be breached) is (arguably) not the intended use of UCANs. UCANs and DID private keys are meant to stay local on the user device. A new / another user device can be delegated capabilities from the device that has them. Usually this means the server acts as a dumb pipe that routes the request from the new device, with its new DID, to the device with the capabilities, and then routes the UCAN delegating capabilities back to the new device. No keys leave the devices, so the server itself gains no capabilities.

I've also prototyped QR code capability sharing across devices so exchange could occur in networkless conditions. Happy to share more details if that seems relevant / interesting.

I'm also looking at using hardware keystores such as Ledger devices, and for multi-device use to keep keys on a single device and establish a secure channel with the secondary ones. No implementation yet :)

I still feel like even in the presence of a hardware keystore it's better to let each app on each device have its own keys. The hardware key could be used to delegate capabilities and / or revoke them. The UX can still be the same, yet such a design would leave a convenient trace of what operations occurred on which device / app.

If the UCAN token of site A was stolen by site B, site B would not be able to exercise it, as the invocation UCAN would need to be signed by A's private key, otherwise it will not be valid.

That's the part I don't understand. It's likely that the flow implemented in Capyloon is not correct. I blame the spec! 🤣

What we have is:

one or several local DIDs exist to represent the device's user.
some on-device API (here the VFS one) relies on UCANs as a permission/access control mechanism.
a 3rd party web page that needs to be able to use this API: siteA.com

Current flow:

1. siteA.com has its own DID and passes it to the requestCapabilities api, which returns an encoded UCAN. The site DID is used as the audience, and one of the device-local user DIDs as the issuer.
2. siteA.com tells the VFS api "here's my token". The API validates the token (signature and time range check) and ensures that the issuer is a locally known one. We also check that this token was not marked as blocked based on its signature.
3. The VFS api allows/disallows features based on the capabilities that were granted.

Obviously something is not matching your description, because in this implementation nothing prevents siteB.com from using the token created for siteA.com in step 2.
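To make the gap concrete, here is a hedged TypeScript model of the checks listed in the current flow. It is not the Capyloon code: the `Ucan` shape is simplified, and signature verification is passed in as a hypothetical `verifySignature` callback rather than implemented.

```typescript
// Simplified token shape; real UCANs carry more fields.
interface Ucan {
  iss: string;       // issuer DID (device-local user)
  aud: string;       // audience DID (the site)
  nbf?: number;      // optional "not before", Unix seconds
  exp: number;       // expiration, Unix seconds
  signature: string;
}

function validateCurrentFlow(
  token: Ucan,
  localDids: Set<string>,
  blocked: Set<string>,
  nowSec: number,
  verifySignature: (t: Ucan) => boolean // hypothetical: checks signature against token.iss
): boolean {
  const inTimeRange =
    (token.nbf === undefined || token.nbf <= nowSec) && nowSec < token.exp;
  return (
    verifySignature(token) &&
    inTimeRange &&
    localDids.has(token.iss) &&
    !blocked.has(token.signature)
  );
  // Note: nothing here ties the caller to token.aud, so any
  // origin holding a copy of the token passes the same checks.
}
```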

I blame the spec! 🤣

Yeah, the spec has been fairly confusing, but it's getting better as we iterate! I'd invite you to participate; your input would be greatly appreciated. Most discussions occur either in the repo itself https://github.com/ucan-wg/spec/ or, for higher-bandwidth ones, on discord https://discord.com/channels/478735028319158273/891351883388715079

one or several local DIDs exist to represent the device's user.

I would assume that local user DID is logical "resource owner" and can execute the capabilities it delegates. In other words those are "provider" DIDs.

some on-device API (here the VFS one) relies on UCANs as a permission/access control mechanism.

You can think of device APIs as "provider" implementation detail. On server / client architecture this and the one above are typically the same. Meaning service provider has a DID and it is used to delegate capabilities and then execute them once they are invoked.

siteA.com has its own DID and passes it to the requestCapabilities api, which returns an encoded UCAN. The site DID is used as the audience, and one of the device-local user DIDs as the issuer.

👍

siteA.com tells the VFS api "here's my token". The API validates the token (signature and time range check) and ensures that the issuer is a locally known one. We also check that this token was not marked as blocked based on its signature.

I think this is where the misunderstanding is happening. For simplicity let's say the requestCapabilities API returned a UCAN where the issuer is did:key:zAlice and the audience is did:key:zAliceSiteA.

siteA.com should not just pass that UCAN into the VFS API. Instead it should create a new delegated UCAN (the spec refers to this as an "invocation") from the UCAN that was given to it by requestCapabilities. In this delegated UCAN the issuer will be did:key:zAliceSiteA and the audience will be did:key:zAlice, effectively creating a loop.

This way the service provider can verify that the invocation actually comes from did:key:zAliceSiteA (or whoever else Alice delegated capabilities to), because it will be signed with that DID's private key. It is also a good idea for that invocation UCAN to only delegate the capability / capabilities required to execute it and nothing more (e.g. only file/read and with a specific file URI). This allows zAlice to delegate a specific read to a third party who would only be able to read that specific file and nothing else.
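The "nothing more" part can be sketched as an attenuation check: every capability claimed by the invocation must be covered by one in the proof. The matching rule below (same `can`, resource equal to or nested under the granted `with`) is a simplifying assumption, since real capability semantics are defined per capability, and the `vfs:` URIs in the usage note are made up.

```typescript
interface Capability {
  with: string; // resource URI
  can: string;  // ability, e.g. "file/read"
}

// Assumed rule: same ability, and the claimed resource is the
// granted one or a path nested under it.
function isCoveredBy(claimed: Capability, granted: Capability): boolean {
  if (claimed.can !== granted.can) return false;
  const base = granted.with.endsWith("/") ? granted.with : granted.with + "/";
  return claimed.with === granted.with || claimed.with.startsWith(base);
}

// True when the invocation claims no more than the proof grants.
function isAttenuation(claimed: Capability[], granted: Capability[]): boolean {
  return claimed.every((c) => granted.some((g) => isCoveredBy(c, g)));
}
```

Under that rule, a proof for `file/read` on `vfs:///pictures/` covers an invocation for `file/read` on `vfs:///pictures/cat.png`, but not `file/write` on the same file.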

The VFS api allows/disallows features based on the capabilities that were granted.

So the only missing piece here is that the VFS should check that the audience is one of the active device users. If it is anything else it should deny service. This way UCANs no longer need to be secret (well, you may still not want to publicize what capabilities you have, but you get the idea), as invoking them requires the private key of the DID. And if private keys are non-exportable, an attacker would need to get hold of the actual user device.
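Put together, the check described here might look like the sketch below. It is a simplified model rather than the spec's algorithm or any library's API: proofs are inlined as objects, only one level of delegation is handled, and `verify` is a hypothetical callback that returns true when a token's signature matches its `iss`.

```typescript
interface Ucan {
  iss: string;     // issuer DID; the token is signed with its key
  aud: string;     // audience DID
  exp: number;     // expiration, Unix seconds
  signer: string;  // test stand-in for "whose key produced the signature"
  proofs: Ucan[];  // delegations this token relies on
}

function validateInvocation(
  invocation: Ucan,
  localDids: Set<string>,
  nowSec: number,
  verify: (u: Ucan) => boolean
): boolean {
  // The invocation must be addressed to an active device user.
  if (!localDids.has(invocation.aud)) return false;
  // It must be signed by its issuer and still be valid.
  if (!verify(invocation) || invocation.exp <= nowSec) return false;
  // It must carry a delegation from a local user DID to that issuer.
  return invocation.proofs.some(
    (p) =>
      localDids.has(p.iss) &&
      p.aud === invocation.iss &&
      verify(p) &&
      p.exp > nowSec
  );
}
```

With this shape, siteB.com holding a copy of the delegation cannot use it: signing with its own key either fails verification (if it claims to be did:key:zAliceSiteA) or breaks the chain (if it uses its own DID as issuer).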

Ok, I see... So most of the changes need to actually happen on the demo web site. Do you know of a JS lib to do the DID / UCAN invocation? So far I was just using a hardcoded DID :)

Ok, I see... So most of the changes need to actually happen on the demo web site. Do you know of a JS lib to do the DID / UCAN invocation? So far I was just using a hardcoded DID :)

We wrote this one https://github.com/web3-storage/ucanto/ and I'm a bit biased toward it. It takes care of encoding UCAN chains into an encoding-agnostic format (JWT or IPLD) and has an opinionated interface for defining services in the form of capabilities (or capability parsers, to be more accurate) and corresponding handlers.

There is also one from the Fission team. This one is JWT only and has no specific support for invocations, but given that an invocation is just the outermost delegation in a UCAN chain, it is usually fine. https://github.com/ucan-wg/ts-ucan

Oh, it might be worth pointing out there is a pretty significant change coming in the 0.9 spec, which replaces inline proofs with CID addressing to enable an encoding-agnostic representation of UCANs and allow omitting proofs without affecting signatures. The ucanto library mentioned above is built around that and is mostly an RPC layer on top of https://github.com/ipld/js-dag-ucan.
