filecoin-project / boost

Boost is a tool for Filecoin storage providers to manage data storage and retrievals on Filecoin.

Configurable feature to restrict retrievals of certain CIDs to certain wallets #1626

Open andrewferrone opened 1 year ago

andrewferrone commented 1 year ago

What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.

As an enterprise client I want to restrict retrievals from the SPs I work with such that only I can retrieve my CIDs from an SP, even if the content of those CIDs is encrypted. This provides an additional layer of security where even if modern encryption algorithms are broken by advancements in computing or some other security breach, the client's data would not be accessible.

High-level client journeys include:

High-level SP journeys include:

Describe the solution you'd like

I'm open to input here, but one idea is to ...

Require the client to send a list of "deal-making wallets" and "retrieval wallets" to the SPs they want to make deals with.

Provide each SP a way in Boost to create a "wallet-restricted retrievals" rule including the client-provided list of "deal-making wallets" and "retrieval wallets."

Once configured, retrievals of any CID stored through a deal made by one of the "deal-making wallets" in a "wallet-restricted retrievals" rule would be restricted, across all protocols, to only the wallets in that rule's list of "retrieval wallets."
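As a rough illustration of the rule described above, here is a minimal Go sketch. The `Rule` type, the `Allowed` function, and the wallet strings are all hypothetical; none of them exist in Boost, and the sketch assumes the SP already knows which wallet made the deal for the requested CID.

```go
package main

import "fmt"

// Rule is a hypothetical "wallet-restricted retrievals" rule.
type Rule struct {
	DealWallets      map[string]bool // wallets the client makes deals with
	RetrievalWallets map[string]bool // wallets allowed to retrieve that data
}

// NewRule builds a rule from the two client-provided wallet lists.
func NewRule(dealWallets, retrievalWallets []string) Rule {
	r := Rule{DealWallets: map[string]bool{}, RetrievalWallets: map[string]bool{}}
	for _, w := range dealWallets {
		r.DealWallets[w] = true
	}
	for _, w := range retrievalWallets {
		r.RetrievalWallets[w] = true
	}
	return r
}

// Allowed reports whether requester may retrieve data stored by dealWallet.
// The first rule that lists dealWallet decides; data covered by no rule
// stays unrestricted.
func Allowed(rules []Rule, dealWallet, requester string) bool {
	for _, r := range rules {
		if r.DealWallets[dealWallet] {
			return r.RetrievalWallets[requester]
		}
	}
	return true
}

func main() {
	// Placeholder strings, not valid Filecoin addresses.
	rules := []Rule{NewRule([]string{"f1client-deals"}, []string{"f1client-retrieve"})}
	fmt.Println(Allowed(rules, "f1client-deals", "f1client-retrieve")) // true
	fmt.Println(Allowed(rules, "f1client-deals", "f1stranger"))        // false
	fmt.Println(Allowed(rules, "f1public-data", "f1stranger"))         // true
}
```

The same check would have to run in front of every retrieval protocol for the restriction to hold "across all protocols."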

Describe alternatives you've considered

We've considered CID Gravity and JWT implementations. In the future we expect clients to require proper access controls with roles, users, and entitlements mapped to CIDs or containers/buckets of CIDs.

Additional context

This feature is required by several enterprise clients.

RobQuistNL commented 1 year ago

I think terminology is a bit off here - "retrieving wallets" won't work when we're looking at HTTP for example.

Being able to identify a requester and restrict access based on that identity, however, would be a great addition.

brendalee commented 1 year ago

One question - you mention wallets as a means of access control; do you have payments in mind as well, or is this specific case just about leveraging wallets for identity?

One way to do easy access control right now, in front of booster-http, is to set up nginx with user/password-type access restrictions (some basic details can be found in this blog post: https://filecoin.io/blog/posts/protecting-booster-http-with-nginx/).

RobQuistNL commented 1 year ago

@brendalee that only works for generic access, which is not the issue. What we want is specific ACLs for specific CIDs, including the CIDs that are contained within a CID.

When user X stores carfile Y with an SP, only user X should be able to retrieve the data from carfile Y (be it the entire file, or a nested CID buried deep within the graph).

Then when user Z stores carfile A, they should be able to retrieve everything from carfile A, but nothing from carfile Y.
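The X/Y and Z/A scenario above could be sketched roughly as follows in Go. This assumes the SP can map any block CID (root or nested) to the carfile(s) that contain it; the `Index` type, its methods, and the CID strings are hypothetical and purely illustrative.

```go
package main

import "fmt"

// Index is a hypothetical in-memory view of what the SP knows.
type Index struct {
	owner    map[string]string   // carfile ID -> user who stored it
	carfiles map[string][]string // CID -> carfiles that contain it
}

func NewIndex() *Index {
	return &Index{owner: map[string]string{}, carfiles: map[string][]string{}}
}

// AddCarfile records that user stored carfile, which contains the given
// CIDs (the root as well as every nested CID).
func (ix *Index) AddCarfile(carfile, user string, cids ...string) {
	ix.owner[carfile] = user
	for _, c := range cids {
		ix.carfiles[c] = append(ix.carfiles[c], carfile)
	}
}

// CanRetrieve reports whether user stored at least one carfile containing cid.
// Unknown CIDs are denied; public data is not modelled in this sketch.
func (ix *Index) CanRetrieve(user, cid string) bool {
	for _, carfile := range ix.carfiles[cid] {
		if ix.owner[carfile] == user {
			return true
		}
	}
	return false
}

func main() {
	ix := NewIndex()
	ix.AddCarfile("carfile-Y", "user-X", "bafy-root-y", "bafy-nested-y")
	ix.AddCarfile("carfile-A", "user-Z", "bafy-root-a")

	fmt.Println(ix.CanRetrieve("user-X", "bafy-nested-y")) // true
	fmt.Println(ix.CanRetrieve("user-Z", "bafy-nested-y")) // false
	fmt.Println(ix.CanRetrieve("user-Z", "bafy-root-a"))   // true
}
```

Because the same block can appear in more than one carfile, the sketch grants access if the user owns any carfile that contains the requested CID.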

andrewferrone commented 1 year ago

@brendalee the intent is simply to use wallets as identity, but that is not a hard requirement - open to using some other form of identity.

RobQuistNL commented 1 year ago

If the "retrievalFilter" would

1) always be called on a retrieval (be it Graphsync, HTTP, or Bitswap),
2) be able to send along some form of identification (either checked or not), and
3) send along the dealID / PieceCID of the requested CID,

we could solve this ourselves easily.
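To make the three requirements concrete, here is a hedged Go sketch of what such a filter input and a simple SP-side decision could look like. The `RetrievalRequest` fields, the JSON key names, and the `PieceACL` type are assumptions for illustration only; this is not the payload Boost passes to its retrieval filter today.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RetrievalRequest is a hypothetical filter input covering the three points:
// the protocol used, some caller identity, and the deal/piece being read.
type RetrievalRequest struct {
	Protocol   string `json:"protocol"` // "graphsync", "http" or "bitswap"
	Identity   string `json:"identity"` // caller-supplied, possibly unverified
	PayloadCID string `json:"payloadCid"`
	PieceCID   string `json:"pieceCid"`
	DealID     uint64 `json:"dealId"`
}

// PieceACL maps a PieceCID to the identities allowed to retrieve from it.
// Pieces absent from the map are treated as public.
type PieceACL map[string][]string

// Accept decides a single retrieval request against the ACL.
func (acl PieceACL) Accept(req RetrievalRequest) bool {
	allowed, restricted := acl[req.PieceCID]
	if !restricted {
		return true
	}
	for _, id := range allowed {
		if id == req.Identity {
			return true
		}
	}
	return false
}

func main() {
	acl := PieceACL{"baga-piece-y": {"user-x"}}
	req := RetrievalRequest{
		Protocol:   "http",
		Identity:   "user-z",
		PayloadCID: "bafy-nested-y",
		PieceCID:   "baga-piece-y",
		DealID:     42,
	}

	// What an external filter hook could receive as its input.
	payload, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
	fmt.Println("accept:", acl.Accept(req))
}
```

With an input like this, each SP could keep its own ACL logic entirely outside Boost.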

alvin-reyes commented 1 year ago

We can create a wallet manager with ACLs to impose permissions/restrictions on certain functions - in this case, "retrieval" and "deal-making".

The ACL (middleware) would then be called to check wallet permissions before any action is taken.

LexLuthr commented 1 year ago

Tying it directly to the wallets is simply not a good idea. This puts too many restrictions on the use cases. We also need to take into account that public datasets should have no check.

In my opinion, we should solve this problem on L2. If we want to apply this solution to all 3 protocols, then we will probably need to make changes to the protocols themselves to allow sending additional bytes for the ACL check. Headers can be used for HTTP, but Bitswap and Graphsync will probably need changes.

In addition to the above, we need to allow ACLs to be modified by the client without any interaction with Boost. Boost should not hold any data related to ACLs.
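For the HTTP case, the header-based idea combined with an ACL store outside Boost could look roughly like the Go sketch below. Everything here is an assumption for illustration: the `Checker` interface, the `X-Retrieval-Identity` header name, and the `/ipfs/<cid>` path shape are hypothetical, and `staticChecker` merely stands in for an external ACL service that the client manages.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Checker is a hypothetical client for an ACL service that lives outside Boost.
type Checker interface {
	Allowed(identity, cid string) bool
}

// staticChecker stands in for that external service: CID -> sole allowed identity.
// CIDs absent from the map are public and pass without a check.
type staticChecker map[string]string

func (s staticChecker) Allowed(identity, cid string) bool {
	owner, restricted := s[cid]
	return !restricted || owner == identity
}

// WithACL wraps next and rejects requests the Checker does not allow.
// The header value is NOT verified here; a real setup would need a
// signature or token check before trusting it.
func WithACL(c Checker, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		cid := strings.TrimPrefix(r.URL.Path, "/ipfs/")
		if !c.Allowed(r.Header.Get("X-Retrieval-Identity"), cid) {
			http.Error(w, "retrieval not permitted", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler wires a dummy backend behind the ACL middleware.
func demoHandler() http.Handler {
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "block data")
	})
	return WithACL(staticChecker{"bafy-private": "user-x"}, backend)
}

// status issues one in-memory request and returns the HTTP status code.
func status(h http.Handler, cid, identity string) int {
	req := httptest.NewRequest(http.MethodGet, "/ipfs/"+cid, nil)
	req.Header.Set("X-Retrieval-Identity", identity)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println(status(h, "bafy-private", "user-x")) // 200
	fmt.Println(status(h, "bafy-private", "user-z")) // 403
	fmt.Println(status(h, "bafy-public", "user-z"))  // 200
}
```

Bitswap and Graphsync have no equivalent of HTTP headers in this sketch, which is why they would likely need protocol changes to carry the identity.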

alvin-reyes commented 1 year ago

I believe most SPs use different wallets for different datasets, i.e. SPs pushing large datasets to the network run multiple "deal-making" engines with different signing wallets for categorization. I have not seen a use case or SP that wants the same separation for retrievals, but I do see why this is a good feature to consider. Creating an ACL will enable an audit trail, which is essential for enterprise businesses.

You are right that this should be solved on L2, though. We shouldn't change the protocol for this; let L2 manage it.