fission-codes / fission

Fission CLI & server
https://runfission.com/docs
119 stars · 14 forks

RESTful append protocol and server #577

Closed: jeffgca closed this issue 2 years ago

jeffgca commented 2 years ago

RESTful (?) APPEND service spec and implementation.

tl;dr: we need a generic REST (or whatever) protocol and server implementation that can be run to relay client data from mobile browsers (especially) and other edge / lightweight clients.

1/ Who will benefit from this idea?

App developers and users using webnative-enabled apps.

2/ What is the problem or opportunity?

Running js-ipfs in the browser is a resource-hungry path to getting content into IPFS, limiting the utility of apps on mobile, where we anticipate a high cost in battery life when running the app on a mobile device (or a laptop on battery).

3/ What is the most important benefit to people?

The ability to still use WNFS and webnative, but not require a local ipfs node to be running in order to add data to IPFS.

The ability to validate uploads against the permissions and capabilities of the client and user, e.g. checking data upload quotas, ensuring an upload is not too large to be accepted all at once, and perhaps even splitting / recombining / validating data in the case of a large media file being uploaded.
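As an illustration of the kind of server-side validation described above, here is a minimal sketch. Everything in it (the `Quota` shape, the size limit, the function name) is a hypothetical stand-in, not the actual Fission implementation:

```python
# Hypothetical pre-upload validation: per-request size check plus quota check.
# MAX_UPLOAD_BYTES and the Quota shape are illustrative assumptions only.
from dataclasses import dataclass

MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # illustrative single-upload cap


@dataclass
class Quota:
    used_bytes: int
    limit_bytes: int


def validate_upload(size_bytes: int, quota: Quota) -> tuple[bool, str]:
    """Return (ok, reason) for a proposed upload."""
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "upload too large for a single request"
    if quota.used_bytes + size_bytes > quota.limit_bytes:
        return False, "user quota exceeded"
    return True, "ok"
```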

4/ How do you know what people want?

Battery life is table stakes, S-tier, for any new mobile technology. An append endpoint will enable a bridge from the mobile web to IPFS that developers require, without forcing them to deal with the downsides of running the full IPFS stack. This is very much an echo of blockchains' use of "lite" clients, which enable lightweight participation without needing local blockchain data.

https://www.statista.com/chart/563/improvements-wanted-by-mobile-device-users/

This chart shows which improvements are the most wanted by mobile device users across five countries.

5/ What does the experience look like?

When an app is published by a developer, that app will have a top-level UCAN from which all app accounts derive their own UCANs. This top-level UCAN may grant any derived user UCANs the ability to store data in IPFS infrastructure, whether hosted by Fission, PL, or some other entity.

When a new user creates a new account on an app, their account and related data / metadata need to be saved locally using webnative and then the resulting encrypted bits need to be transferred to IPFS infra and “pinned”.

Today, the webnative client starts up a local js-ipfs node to accomplish this. With this new endpoint, the local webnative data can instead be packaged as a CAR file ( or perhaps some more appropriate format ) on the client and then transferred to IPFS infra.

Mobile apps and browsers can be severely impacted by long-lived and radio-heavy connection protocols such as IPFS and libp2p. While we should advocate for performance and reliability improvements in libp2p generally, a simpler and more immediate solution is to implement a lightweight service that can relay data from mobile devices.

From @boris:

[12:37 PM] @JeffG the specifics of IPFS as I listed in the Talk page is that POTENTIALLY, a “post” of a file to the endpoint, just has a basic rule that “appends” that file into WNFS
[12:37 PM] As opposed to our current “publish” endpoint, which blows away whatever was there before and overwrites it
[12:38 PM] The mechanics of IPFS is such that one could ipfs add → get a hash → IPFS dag-put “into” a WNFS file path
[12:38 PM] e.g. you could even have “static site deployed here” / “assets”
[12:38 PM] And the app is configured so that POST’d files get appended into the assets sub folder
[12:39 PM] There is some major handwaving here, and you still update the root hash / dnslink, but you don’t need to blow away the entire root just to add an uploaded file

Related talk post: https://talk.fission.codes/t/webnative-append-endpoint/2306

matheus23 commented 2 years ago

What would the endpoint append to? I see two options:

  1. Fission apps, so if you've published something at https://your.fission.app
  2. Fission filesystems. So you're directly attaching something to your WNFS using the endpoint. It'll show up in your drive.

Depending on which version we're talking about, it would be accomplished fastest either by sharing code with webnative, or sharing code with the fission monorepo (here).

Webnative is the only thing that understands WNFS right now, which would be needed for (2). In that case it's easier to get webnative running in Node.js (similar to what I've done with wnfs-migration) than to rewrite WNFS in Haskell. If we're going for (1), then it's fairly easy to do with ipfs out of the box, so we can totally do this in haskell-land, where we can call go-ipfs.

jeffgca commented 2 years ago

All good questions. Again, this issue was created from a draft card in the project but we have some info on talk about this idea that I'll post into the description.

jeffgca commented 2 years ago

I've added the 5Q from the talk post, and have also made the talk post public.

jeffgca commented 2 years ago

What would the endpoint append to? I see two options:

1. Fission apps, so if you've published something at `https://your.fission.app`

For some value of 'fission app', yes. The specific use case is a mobile web or native mobile app that needs a simple way to add content to ipfs. This content can be raw ipfs data eg public wnfs or web3.storage, or it could be encrypted wnfs data. While I think it's okay to ship with the former initially if that's easier, we must design so the latter is not just possible, but next.

2. Fission filesystems. So you're directly attaching something to your WNFS using the endpoint. It'll show up in your drive.

This is a non-goal, unless the user is using drive or related 'fission-classic' apps published this way. The goal is to support AOL-style apps, but I assume drive can be supported as a side-effect as long as it works in the same way as an AOL app.

Depending on which version we're talking about, it would be accomplished fastest either by sharing code with webnative, or sharing code with the fission monorepo (here).

@expede 's first impulse was the former, I'll defer to her.

matheus23 commented 2 years ago

This content can be raw ipfs data eg public wnfs or web3.storage, or it could be encrypted wnfs data. While I think it's okay to ship with the former initially if that's easier, we must design so the latter is not just possible, but next.

The goal is to support AOL-style apps

Yeah, to me this is saying the append endpoint must understand WNFS.

expede commented 2 years ago

cc @bgins pointers to some of the existing functionality:

Here's the endpoint for upload (e.g. Heroku) today:

At the point where it has this...

runDBNow $ LoosePin.createMany userId [newCID]

...the file is on go-ipfs, so you'd use the IPFS HTTP API to build the DAG with /dag/put. When we did this kind of DAG surgery last time (manual .ipfsignore) we used the object interface, which is deprecated, but that (old removed code) is here. Most of that is about helping us walk the tree to remove stuff and add things back in, so in theory this append endpoint should actually be simpler than this since we're not trying to drop files based on a regex.

expede commented 2 years ago

Oh, the other thing to note in the above code is that it uses the in-process IPFS rather than a remote node. The available API is the same, but calls would now go over HTTP instead of being local.

bgins commented 2 years ago

The design for this feature is in this post: https://talk.fission.codes/t/webnative-append-endpoint/2306/4. Summarizing it here:


The append endpoint will provide a dedicated upload folder for each app in WNFS at the path public/apps/<creator>/<app-name>/uploads/.

The initial version will only support public files. We can look at adding private files later.

The append endpoint will look something like:

POST runfission.com/v3/web2compat/:username/:creator/:app-name/:filename

The path segments are:

The server endpoints are currently at v2, but we will have upgraded them to v3 by the time we complete this feature.

If a user is new and does not have a WNFS, the server will create an empty filesystem when they first upload a file.

This API does not guarantee dedicated storage for apps. It's basically an honor system: developers should only upload files with themselves as the creator, but we won't enforce this behavior.

Developers can GET files with existing endpoints already supported by the HTTP gateway and the server.
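As a sketch, a client could build the proposed endpoint URL from its path segments like this. The `v3/web2compat` prefix follows the shape proposed in this comment (the prefix is revisited later in the thread), and the helper itself is hypothetical, not a published client API:

```python
# Hypothetical client-side helper for building the proposed append URL.
# The BASE prefix mirrors the design above and is not a confirmed API.
from urllib.parse import quote

BASE = "https://runfission.com/v3/web2compat"


def append_url(username: str, creator: str, app_name: str, filename: str) -> str:
    """Percent-encode each path segment and join them onto the base URL."""
    parts = [quote(p, safe="") for p in (username, creator, app_name, filename)]
    return f"{BASE}/{'/'.join(parts)}"


url = append_url("alice", "bob", "photo-app", "pic one.png")
# -> https://runfission.com/v3/web2compat/alice/bob/photo-app/pic%20one.png
```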

expede commented 2 years ago

@bgins might I suggest not bumping the API version? There is no breaking change here, so we can add it to 2.x

@therealjeffg will have more opinions on the developer-facing product portion of this. We discussed web2compat in the thread, but I wonder if making that less noisy (calling people out less for being in web 2.0 mode) could be friendlier? runfission.com/v2/append/:username/:creator/:app-name/:filename

expede commented 2 years ago

Depending on how deep we want to go, we could also give them something like:

runfission.com/v2/append?path=arbitrary_path_here

bgins commented 2 years ago

@bgins might I suggest not bumping the API version? There is no breaking change here, so we can add it to 2.x

Yeah, for sure! Let's keep this in v2.

@therealjeffg will have more opinions on the developer-facing product portion of this. We discussed web2compat in the thread, but I wonder if making that less noisy (calling people out less for being in web 2.0 mode) could be friendlier? runfission.com/v2/append/:username/:creator/:app-name/:filename

`append` instead of `web2compat` makes sense to me. It aligns with the name of the feature, which feels more intuitive 👍

Depending on how deep we want to go, we could also give them something like:

runfission.com/v2/append?path=arbitrary_path_here

I'm in favor of the explicit path segments because they make it clear that each one is required. Less chance for mistakes or misunderstandings about which paths the API provides.

bgins commented 2 years ago

@icidasset and I were discussing this feature this morning. My impression is that the scope of this feature has diverged from the initial write-up.

The ability to validate uploads against the permissions and capabilities of the client and user, e.g. checking data upload quotas, ensuring an upload is not too large to be accepted all at once, and perhaps even splitting / recombining / validating data in the case of a large media file being uploaded.

Are we concerned with checking quotas in this version of the append endpoint? Quotas would require:

When an app is published by a developer, that app will have a top-level UCAN from which all app accounts derive their own UCANs. This top-level UCAN may grant any derived user UCANs the ability to store data in IPFS infrastructure, whether hosted by Fission, PL, or some other entity.

The idea is that apps have some quota and they divide it up amongst their users. The scope for an authorization flow like this goes beyond the simplest version of an append endpoint. In the simplest version, we use existing authorization flows:

With this new endpoint, the local webnative data can instead be packaged as a CAR file ( or perhaps some more appropriate format ) on the client and then transferred to IPFS infra.

Are we doing CAR files in this version? I think we discussed a simpler version where you just upload a file to a directory. The file gets added there or replaces an existing file with the same name.

bmann commented 2 years ago

No to quotas -- we don't have quotas implemented at all at this point.

And also, no per user quota stuff.

Some of these things MIGHT be useful if we can somehow "forward" these things to NFT Storage as well, but that's a future feature.

Yes to uploading files. No to using CAR files. This is a RESTful endpoint, so web2 apps don't need to know anything about IPFS on the upload side.

Let's use the Discourse server example as one of the known use cases.

The mode I had thought of is that the Discourse server is an "app". We can, for now, use the same pattern as Github Actions Publish. The developer can use the CLI to create a new local app, and then use their (master) UCAN key to store in an ENV var / secret within Discourse settings.

I do think "create a delegated UCAN key for this app" is a backlog item for both Github Actions and for this append endpoint.

bgins commented 2 years ago

No to quotas -- we don't have quotas implemented at all at this point.

And also, no per user quota stuff.

Yes to uploading files. No to using CAR files. This is a RESTful endpoint, so web2 apps don't need to know anything about IPFS on the upload side.

Alright, thanks. That clarifies the scope for this initial version. 🙏

The mode I had thought of is that the Discourse server is an "app". We can, for now, use the same pattern as Github Actions Publish. The developer can use the CLI to create a new local app, and then use their (master) UCAN key to store in an ENV var / secret within Discourse settings.

Yep, this minimal version should support that use case. The Discourse server would POST to the append endpoint to store data. It would authorize using the root key pair created by the CLI. Makes sense to me. 👍

icidasset commented 2 years ago

When we did this kind of DAG surgery last time (manual .ipfsignore) we used the object interface, which is deprecated, but that (old removed code) is here.

@expede Since we haven't got the code that creates a PublicFile skeleton, metadata object, previous link, and userland structure, plus the pretty tree variant: do you think it's a good idea to start using the Wasm code from rs-wnfs here, so we don't need to reimplement a large part of that in Haskell?

expede commented 2 years ago

@matheus23 / @appcypher Is the rs-wnfs work at the point where it could do what @icidasset is suggesting without pulling in a bunch of historical data? That would be ideal, even if the effort is similar.

expede commented 2 years ago

@bgins

Are we doing CAR files in this version?

Just echoing Boris here, but also adding some context. The use case is a random web 2.0 app wandering by and wanting to upload with something like the JS fetch API, or the Swift standard lib. We have the existing code to get the file to the server as an HTTP multipart-form, or (more modern) octet stream. It's meant to be the easiest way to get started with Fission without adjusting your app at all. CAR Pool will eventually replace this, but will require wrapping the Rust/Wasm code in helpers on a per-language basis to make it a nice experience (lots of code reuse, usable anywhere there's Wasm/Wasi, but want to hide that fact from e.g. Swift).

expede commented 2 years ago

The Discourse server would POST

Semantically, should we make it a PUT?

appcypher commented 2 years ago

@matheus23 / @appcypher Is the rs-wnfs work at the point where it could do what @icidasset is suggesting without pulling in a bunch of historical data? That would be ideal, even if the effort is similar.

PublicFile skeleton, metadata object, previous link, userland: yes, that's general public filesystem stuff. Pretty tree: not yet. @matheus23 and I discussed the pretty tree implementation a while back and realised the TS version generates it in a way we cannot cleanly map to Rust yet, so we agreed to put it on hold. One short-term issue we might have is the Wasm size, which is ~2MiB at the moment. There are several ways of fixing that; I just haven't gotten around to it yet.

bgins commented 2 years ago

Semantically, should we make it a PUT?

Yep, good point! We are targeting a single resource in a collection, so PUT does make more sense than POST. 👍
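To illustrate the semantic distinction being agreed on here: PUT targets a named resource and is idempotent, so repeating the same request leaves the collection unchanged, whereas a POST to a collection would typically create a new entry each time. A toy model (the paths and store are purely illustrative):

```python
# Toy model of PUT-to-named-resource semantics: idempotent create-or-replace.
store: dict[str, bytes] = {}


def put_file(path: str, content: bytes) -> dict[str, bytes]:
    """Create or replace the file at `path`; repeating the call changes nothing."""
    store[path] = content
    return store


put_file("apps/bob/photo-app/uploads/pic.png", b"v1")
put_file("apps/bob/photo-app/uploads/pic.png", b"v1")  # repeat is a no-op
assert len(store) == 1                                  # no duplicate entries
put_file("apps/bob/photo-app/uploads/pic.png", b"v2")  # replace in place
assert store["apps/bob/photo-app/uploads/pic.png"] == b"v2"
```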

expede commented 2 years ago

One short-term issue we might have is the Wasm size, which is ~2MiB at the moment. There are several ways of fixing that; I just haven't gotten around to it yet.

That's certainly something that we'll want to optimize, but we're more interested in functionality than size at the moment (unless it's a quick fix)

expede commented 2 years ago

PublicFile skeleton, metadata object, previous link, userland: yes, that's general public filesystem stuff. Pretty tree: not yet.

@icidasset Looks like it's not ready for production use yet. It also probably doesn't know how to attach the private tree in place, since the strategy has been to embed the Wasm into the TS codebase and let it handle the existing functionality (incremental refactor). It sounds like we should do it in Haskell for the time being.

matheus23 commented 2 years ago

@expede I would propose a hybrid Haskell/Rust variant. Specifically, I'd propose that the following happens on a PUT request:

  1. The haskell server pipes the file stream into an IPFS node, by using the IPFS node's MFS on the pretty tree of the user.
  2. We figure out the CID of the file we just added to the pretty tree.
  3. We run a write operation using rs-wnfs, connected to the IPFS node via a BlockStore and give the operation the file's CID. This will give us back a new CID for the public tree.
  4. We use the MFS API from IPFS to update the public tree to the new CID we got from rs-wnfs.

This is essentially running two operations for each upload: An MFS operation provided by IPFS for the pretty tree and a WNFS operation provided by rs-wnfs.

MFS and rs-wnfs seem pretty interchangeable API-wise (but of course MFS only writes basic IPFS UnixFS).
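The four steps above can be sketched as an orchestration where only CIDs cross the Haskell/Rust boundary. Every function below is a stub standing in for an MFS HTTP call or an rs-wnfs operation; the names and signatures are illustrative only, not real APIs:

```python
# Sketch of the proposed hybrid flow; all functions are illustrative stubs.

def mfs_write_pretty_tree(path: str, data: bytes) -> str:
    """Stand-in for `ipfs files write` on the pretty tree; returns the new file CID."""
    return f"cid-of({path})"


def wnfs_write(public_root_cid: str, path: str, file_cid: str) -> str:
    """Stand-in for an rs-wnfs write that links an existing CID into the
    public tree; returns the new public-tree root CID."""
    return f"root-after({public_root_cid},{file_cid})"


def mfs_update_public_root(new_root_cid: str) -> None:
    """Stand-in for the MFS surgery that points the pretty tree at the new root."""


def handle_put(public_root_cid: str, path: str, data: bytes) -> str:
    file_cid = mfs_write_pretty_tree(path, data)             # steps 1 + 2
    new_root = wnfs_write(public_root_cid, path, file_cid)   # step 3
    mfs_update_public_root(new_root)                         # step 4
    return new_root                                          # only CIDs cross the boundary
```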


I think the biggest effort for the two alternative approaches comes down to "do we want to reimplement the DAG surgery for public WNFS in haskell?" vs. "do we want to implement yet another webservice, this time in rust, on top of rs-wnfs?".

expede commented 2 years ago

@matheus23 A couple quick questions:

Pipes

Can we add it by CID to rs-wnfs? Ideally we don't keep any files in the server's go-ipfs node, and only use the remote IPFS "mega node".

Blockstore

Does rs-ipfs have an IPFS/IPLD blockstore today, or are you essentially suggesting that we do all of the actual IPFS operations in Haskell?

"do we want to implement yet another webservice, this time in rust, on top of rs-wnfs?"

I'm not sold on this strategy, but in theory a thin web interface on rs-wnfs should be pretty lightweight, and would run locally to the server only. The other option is FFI; I have no strong preference. If we were to use the Rust code, we need to get data across the system boundary in either case. We don't need access control or anything since it's calling from a trusted IP, and can't be dialed back into. It's just one option for such an interface.

In an ideal world, we manage to implement this feature with as few hops across this boundary as possible, ideally without passing around a bunch of serialized data (just CIDs), and let the services talk to IPFS separately.

There are existing Rust libraries for working with the IPFS HTTP API directly (but I don't know how good they are).


My underlying opinion is: if we're building trees in Haskell (e.g. pretty tree), then let's build some trees in Haskell for this feature (directly at the IPFS layer). These are at special-cased paths, so we may not need the full generality. If we can do it in Rust entirely, then that's best since we only have to update things in one codebase over time, rather than switching this out to Rust at some point down the road.

matheus23 commented 2 years ago

Can we add it by CID to rs-wnfs? Ideally we don't keep any files in the server's go-ipfs node, and only use the remote IPFS "mega node".

Yes. The interface for... essentially "setting the content of a file at a path" takes a CID as an argument in rs-wnfs today.

Does rs-ipfs have an IPFS/IPLD blockstore today, or are you essentially suggesting that we do all of the actual IPFS operations in Haskell?

I'll assume you meant to write "rs-wnfs" not "rs-ipfs". No, we don't have an IPFS blockstore implementation today, but it should be super easy to write a blockstore that just connects to another go-ipfs node over its HTTP API.

I'm not sold on this strategy, but in theory a thin web interface on rs-wnfs should be pretty lightweight, and would run locally to the server only. The other option is FFI; I have no strong preference. If we were to use the Rust code, we need to get data across the system boundary in either case. We don't need access control or anything since it's calling from a trusted IP, and can't be dialed back into. It's just one option for such an interface.

In an ideal world, we manage to implement this feature with as few hops across this boundary as possible, ideally without passing around a bunch of serialized data (just CIDs), and let the services talk to IPFS separately.

Yeah. CIDs across boundaries sounds like exactly the picture I have in mind. I was thinking "internal webservice" in the text above (i.e. an unexposed webservice without auth), but I don't think it actually matters. Maybe we need to experiment there to figure out exactly what would be better. An internal webservice model and FFI would both work.

My underlying opinion is: if we're building trees in Haskell (e.g. pretty tree), then let's build some trees in Haskell for this feature (directly at the IPFS layer). These are at special-cased paths, so we may not need the full generality. If we can do it in Rust entirely, then that's best since we only have to update things in one codebase over time, rather than switching this out to Rust at some point down the road.

I think we don't actually need to keep any trees in haskell at all. We should be able to just do most of the pretty tree surgery using the MFS APIs from IPFS. Maybe my concept of what the MFS stuff can do for us is not quite correct; we'd need to look at it more closely.

matheus23 commented 2 years ago

I think we don't actually need to keep any trees in haskell at all. We should be able to just do most of the pretty tree surgery using the MFS APIs from IPFS. Maybe my concept of what the MFS stuff can do for us is not quite correct; we'd need to look at it more closely.

So, I just had a closer look at the MFS API again & talked to @Jorropo from PL on the ipfs discord.

This is how you could upload a file to the pretty tree using the CLI for MFS:

$ ipfs files cp /ipfs/<user-data-root>/ /some-user-data-root
$ ipfs files write --parents --truncate --create --raw-leaves --cid-version=1 /some-user-data-root/<path> <data>
$ ipfs files stat /some-user-data-root/
<new-user-data-root>
[more info]

This is all based on CIDs and path names. It works completely lazily and doesn't load any data it doesn't have to load.

The exact same API is also available over HTTP from our ipfs node within our server. Jorropo also hinted at modifications to different subdirectories in MFS being parallelized, which would be something we need.

We could use the same mechanism to do the UnixFS DAG surgery necessary to add back the new CID that rs-wnfs generated for an updated public WNFS tree.
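The same MFS commands are exposed under the go-ipfs (kubo) `/api/v0/files/*` RPC endpoints. A sketch of how the server might build those requests; the endpoint and parameter names follow the go-ipfs RPC API as I understand it, so verify against your node's version before relying on this (the requests are only constructed here, not sent):

```python
# Building the MFS HTTP RPC URLs corresponding to the CLI commands above.
# Parameter names mirror the go-ipfs /api/v0/files/* RPC API; the file
# bytes for `files/write` would go in a multipart request body.
from urllib.parse import urlencode

API = "http://127.0.0.1:5001/api/v0"  # assumed local node address


def files_cp(src: str, dst: str) -> str:
    return f"{API}/files/cp?" + urlencode([("arg", src), ("arg", dst)])


def files_write(path: str) -> str:
    params = [("arg", path), ("parents", "true"), ("truncate", "true"),
              ("create", "true"), ("raw-leaves", "true"), ("cid-version", "1")]
    return f"{API}/files/write?" + urlencode(params)


def files_stat(path: str) -> str:
    return f"{API}/files/stat?" + urlencode([("arg", path)])
```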

expede commented 2 years ago

I think we don't actually need to keep any trees in haskell at all.

Sorry, let me rephrase: I meant building trees via Haskell calling the go-ipfs HTTP APIs, rather than having Rust interact with IPFS.

expede commented 2 years ago

This is how you could upload a file to the pretty tree using the CLI for MFS:

Awesome; that sounds like we could skip the Rust stuff completely then, yes? Since this is essentially a hack while we wait for more robust systems, would the simplest path be to use MFS to add the file to the public tree (ironically immutably, including intermediate nodes), and the pretty tree?

Yes, we absolutely want a blockstore for rs-wnfs at some stage, but a robust blockstore for a remote node doesn't sound like it's on the critical path.

expede commented 2 years ago

Had a quick call with @matheus23 to talk through the above. He's going to write up what we talked about in more fidelity tomorrow CEST, but the TL;DR is:

matheus23 commented 2 years ago

Brooke and I talked about this a little bit on a call. I'm actually convinced we want to do this in haskell now. The biggest reason that convinces me now is that rs-wnfs just doesn't support the old WNFS v1, and since public WNFS v1 is still built on/encoded as UnixFS, it makes a lot of sense to modify it using MFS.


He's going to write up what we talked about in more fidelity tomorrow CEST

Actually I'm doing this right now! :stuck_out_tongue:


I have a pretty clear picture of how we could build this now, I think.

The ingredients we'll need are:

  1. The ability to call MFS commands on the IPFS node connected to our web-api server.
  2. A metadata parser for our public WNFS metadata CBOR (these files: https://matheus23.files.fission.name/public/metadata)

The biggest things we need to do are:

  1. updating all previous links along the path that we're adding by essentially running ipfs files cp /some/path /some/path/previous (or if that doesn't work a combination of ipfs files stat /some/path and ipfs files cp /ipfs/<cid> /some/path/previous)
  2. updating all metadata objects along the path.
  3. uploading the file to the path
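A toy model of the two path-walking updates listed above: each directory along the written path gets a snapshot of itself under a `previous` link, and its metadata is bumped, before the file itself lands at the leaf. The node shape here is illustrative, not the real WNFS encoding:

```python
# Toy model of previous-link and metadata updates along a written path.
import copy


def write_with_history(node: dict, path: list[str], name: str, content: str) -> dict:
    snapshot = copy.deepcopy(node)   # like `ipfs files cp <self> <self>/previous`
    new = dict(node)
    new["previous"] = snapshot
    new["metadata"] = {"version": node.get("metadata", {}).get("version", 0) + 1}
    if path:
        head, rest = path[0], path[1:]
        new[head] = write_with_history(node.get(head, {}), rest, name, content)
    else:
        new[name] = content          # the actual file upload at the leaf
    return new


root = {"docs": {}}
updated = write_with_history(root, ["docs"], "note.txt", "hello")
assert updated["docs"]["note.txt"] == "hello"
assert updated["previous"] == {"docs": {}}   # old root preserved as history
```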

expede commented 2 years ago

I don't think that the haskell-ipfs library is up to date with the MFS files API, so we'll want to add the couple of commands we need via Servant in that library (e.g. chcid).

bgins commented 2 years ago

We've updated the plan for this feature. Files will be appended to storage on a per-app basis instead of storing them in a user's WNFS. We can simplify the API endpoint to POST runfission.com/v2/api/append/:app-name/:filename.

Along with the append endpoint, we will implement a couple of new CLI commands that:

icidasset commented 2 years ago

Full DAG surgery flow for apps (tested with CLI):

  1. ipfs files cp --parents /ipfs/bafybeiebdfqyaqwxfcmmsuq6sc3yjg4ejyzsubk2xvob34ylqvpxjcphma /diffuse-nightly/ Note: keep the slash at the end of the destination path; otherwise it'll copy the files into /diffuse-nightly itself. We want the CID in the MFS path to avoid conflicts.
  2. echo "Hello" | ipfs files write --parents --truncate --create --raw-leaves --cid-version=1 /diffuse-nightly/bafybeiebdfqyaqwxfcmmsuq6sc3yjg4ejyzsubk2xvob34ylqvpxjcphma/test.txt
  3. Take CID from ipfs files stat /diffuse-nightly/bafybeiebdfqyaqwxfcmmsuq6sc3yjg4ejyzsubk2xvob34ylqvpxjcphma/
  4. ipfs pin add <cid> (the CID from step 3)
  5. Clear out MFS dir for next usage: ipfs files rm -r /diffuse-nightly/bafybeiebdfqyaqwxfcmmsuq6sc3yjg4ejyzsubk2xvob34ylqvpxjcphma/
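The five steps above could be wrapped as a hypothetical script; this sketch only builds the argv lists that would be handed to `subprocess.run`, it does not execute anything against a node. The pin step uses a placeholder for the CID that step 3 reports:

```python
# The tested CLI flow above as command lists; nothing is executed here.
def dag_surgery_commands(app_cid: str, mfs_dir: str, filename: str) -> list[list[str]]:
    workdir = f"{mfs_dir}/{app_cid}"
    return [
        ["ipfs", "files", "cp", "--parents", f"/ipfs/{app_cid}", f"{mfs_dir}/"],
        ["ipfs", "files", "write", "--parents", "--truncate", "--create",
         "--raw-leaves", "--cid-version=1", f"{workdir}/{filename}"],
        ["ipfs", "files", "stat", f"{workdir}/"],   # read the new root CID from output
        ["ipfs", "pin", "add", "<cid-from-stat>"],  # placeholder: CID from the stat step
        ["ipfs", "files", "rm", "-r", f"{workdir}/"],
    ]
```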