What would the Shared Data Layer look like without WebDB?

webdesserts commented 6 years ago

Why I'm Uncomfortable with WebDB

In the WebDB proposal, you are laying out some recommended "social primitives" built on top of the shared data layer proposal. As mentioned on twitter, I'm not completely comfortable with WebDB yet. Based on our conversation, I think you eventually want to provide developers with a way to share their own WebDB-like user data-types. If WebDB is just a stopgap for the more generic solution and its API will eventually be "explained", I can kind of understand moving forward with some version of WebDB, however I think it produces some knee-jerk reactions for some developers like me. Some of the reasons I think I'm having these knee-jerk reactions are:

- I'm not used to a browser having strong opinions about the contents of my app
  - "likes" vs. "reactions"
- Even with enforced data-types, I feel like there will still be compatibility issues
  - I want bold and italic in posts, did we all agree on saving our posts as html, markdown, rotondeml, or something else?
  - or... I don't want bold and italic in my app, am I going to have to filter out everybody's html now?
- I don't want new Beaker developers to give up on Beaker as a whole because they felt pressured to build their app a specific way or else suffer a loss to power/usability.

I think the last point is the main reason I'm knee-jerking about this. If WebDB is just a way for you to "buy time" until you can put together a better solution, I think I'd argue that even without these proposed APIs Beaker is already super cool and super powerful. Developers are already building interesting things. There's still plenty to explore in the current landscape.

The Shared Data Layer without WebDB

But I'm also having a hard time understanding what the Shared Data Layer would offer without WebDB. In the proposal, most of the examples also mention or include WebDB. So I had a few questions about some things that I felt the proposal didn't completely clarify:

In the following example, what is the service?
```
var session = await navigator.sessions.request(service, {permissions?:})
```
Later on you use the string 'webdb'. Is that a browser-defined token just to get a "WebDB" session (whatever that might mean)? Is it just a user-defined string? If so, how does that work? Do we have to worry about naming conflicts?
Other than pass a ServiceSession into a WebDB constructor what else can I do with it?
On twitter you mentioned:

apps can still ask for full write access to a user dat and then create their own indexes

What would that look like? Would this be done through the session api or would a developer need to use something like DatArchive.selectArchive()? Are user dats exposed to DatArchive.selectArchive() like a normal archive (that'd be a little weird but... idk)?

In general I'm just trying to figure out if there's any use for the session API without WebDB or if it relies on its existence to be useful.

pfrazee commented 6 years ago

Hey Michael, thanks again for the detailed thoughts. We've been discussing this in detail all weekend. Suffice to say, it's not an easy decision.

I think you're right that a builtin WebDB can create a perception problem. We knew that some people would dislike the lack of control in the pre-defined schemas, but we assumed the access to Ingest for custom schemas would satisfy that crowd. Unfortunately we dropped the ball on emphasizing how Ingest works, and, when we do explain it, it only leaves people half-satisfied. The ultimate question is whether people are comfortable with a core set of schemas.

If WebDB is just a stopgap for the more generic solution and its API will eventually be "explained", I can kind of understand moving forward with some version of WebDB, however I think it produces some knee-jerk reactions for some developers like me

This part is hard to explain. WebDB is just an Ingest that's run inside of Beaker with preset schemas. You can interact with it entirely by using Ingest inside of a Web app using your own schemas. The reason we proposed WebDB was because we can get better results with baked-in opinions. Again, as you know, our thinking was that we could gradually give all those benefits through fully generic userland solutions, but that would take much more time than we have now -- and so WebDB was our shortcut to a better result.

For posterity, let me step through the benefits. They can mostly be summarized as "problems that we don't have the resources yet to solve generally."

Performance

There are two performance benefits. The first is explained in 0003: Ingest requires applications to construct indexes on disk. A builtin WebDB would have centrally-managed indexes, and so apps wouldn't have to duplicate those indexes in their own storage.

The second performance benefit was not mentioned in 0003 but it's equally important: a builtin WebDB can begin crawling and syncing as soon as Beaker is started. Without that, any application using Dat will have to crawl & sync when it is opened, forcing the user to wait for loading to finish. I think users will tolerate that, but it's a bad experience.

For both of those issues, without a builtin WebDB, centralizing the indexes will be difficult. It's very hard to know when two applications are using the same index definition.

Correctness

Next, there's a benefit of correctness: right now, the DatArchive API doesn't have a way to do atomic read-update-writes, and so two Ingests running in two tabs could clobber each other's updates. A central Ingest would be on one thread, and wouldn't have that issue.
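
To make the clobbering concrete, here is a minimal sketch of the lost-update problem and of how funnelling writes through a single queue avoids it. It does not use the DatArchive API: `createStore`, `naiveIncrement`, and `createQueue` are hypothetical names, and the store is an in-memory stand-in for a file in an archive.

```javascript
// In-memory stand-in for a file in an archive (hypothetical; not DatArchive).
function createStore (initial) {
  let value = initial
  const tick = () => new Promise(resolve => setTimeout(resolve, 0))
  return {
    async read () { await tick(); return value },
    async write (next) { await tick(); value = next }
  }
}

// Unsafe on its own: another writer can slip in between the read and the write.
async function naiveIncrement (store) {
  const current = await store.read()
  await store.write(current + 1)
}

// A single queue that runs tasks one at a time, the way one central
// indexer on one thread would.
function createQueue () {
  let tail = Promise.resolve()
  return function enqueue (task) {
    const result = tail.then(task)
    tail = result.catch(() => {}) // keep the chain alive if a task fails
    return result
  }
}

async function demo () {
  // Two "tabs" updating the same value with no coordination.
  const racy = createStore(0)
  await Promise.all([naiveIncrement(racy), naiveIncrement(racy)])

  // The same two updates, serialized through one queue.
  const safe = createStore(0)
  const enqueue = createQueue()
  await Promise.all([
    enqueue(() => naiveIncrement(safe)),
    enqueue(() => naiveIncrement(safe))
  ])

  return { racy: await racy.read(), safe: await safe.read() }
}
```

In the uncoordinated case both writers read the same starting value, so one increment is lost. The queued case applies both. A queue like this only helps when every writer goes through it, which is the argument for one central process rather than one Ingest per tab.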

Permissions

Next, there's the benefit of fine-grained permissions, which is discussed in 0003. I think we'll be okay punting that question for now if we must, but it will take time to develop a generic alternative. For users to understand what's happening, fine-grained permissions need to operate at the semantic/object level, not at the files level.

Compatibility

Finally, the compatibility reason. There is no easy solution to the topic of compatibility. My worry is that we're going to have a problem of data siloing, not because applications are hiding data from each other, but because applications simply can't read each other's data.

The idea for WebDB (whether builtin or not) is to give us a solid foundation of common types that the community can agree upon. We only outlined a few types in 0003, but our plan is to create a full suite with internal extension-points. It'll be a pain, but having WebDB (builtin or not) will give us a place to discuss and define these core models.

Ok, let me address some of your points directly

I'm not used to a browser having strong opinions about the contents of my app

The question we're asking is, "Is now the time to change our expectations of what a browser does?"

Even with enforced data-types, I feel like there will still be compatibility issues

You can always go "off spec" if you're willing to risk misinterpretation. WebDB isn't a technical solution; it's a political / social one.

In the following example, what is the service?

I'll put out a new proposal soon that covers some of your spec questions. Right now, the service will be a preselected token. I haven't decided for sure, but I've been thinking 'user-dat' for files-level access to the user dat, and 'webdb' for webdb-level access (if it stays as a builtin). In the future, service will support URLs which identify the endpoint's protocols, and will be used to match against a registry in the browser of the user's saved services.

I'm open to suggestions on the navigator.sessions API, but that's what I went with.

Other than pass a ServiceSession into a WebDB constructor what else can I do with it?

Its ultimate goal is to provide whatever credentials you need to run a session. That'll probably always mean a ServiceSession is an opaque session identifier which can be passed to endpoints. You can .destroy() a ServiceSession to end it early.
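
As a rough illustration of that shape, here is a small mock of the session flow. Nothing in it is Beaker's implementation: `createSessions`, `lookup`, and the id format are assumptions made for this sketch. Only the `request(service, {permissions})` call shape, the `'user-dat'` and `'webdb'` tokens, and `.destroy()` come from this thread.

```javascript
// Mock of the session flow discussed above. Hypothetical throughout:
// a real browser would prompt the user and manage sessions internally.
function createSessions (knownServices) {
  const active = new Map()
  let counter = 0
  return {
    async request (service, { permissions = [] } = {}) {
      if (!knownServices.includes(service)) {
        throw new Error(`Unknown service token: ${service}`)
      }
      const id = `session-${++counter}`
      active.set(id, { service, permissions })
      return {
        id, // opaque identifier that an endpoint could look up
        destroy () { active.delete(id) } // end the session early
      }
    },
    // What an endpoint might call to check a session it was handed.
    lookup (session) { return active.get(session.id) || null }
  }
}

async function sessionDemo () {
  const sessions = createSessions(['user-dat', 'webdb'])
  const session = await sessions.request('user-dat', {
    permissions: ['user-dat.write']
  })
  const before = sessions.lookup(session)
  session.destroy()
  const after = sessions.lookup(session)
  return { before, after }
}
```

The point of keeping the session object opaque is that the app never holds credentials directly. It only holds a handle that the browser can revoke.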

On twitter you mentioned:

apps can still ask for full write access to a user dat and then create their own indexes

What would that look like?

I missed that. As I said above, you'll probably use the 'user-dat' token in the navigator.sessions API, and then use the returned session object to get the dat. Something like this:

```
var session = await navigator.sessions.request('user-dat', {permissions: ['user-dat.write']})
var userDat = new DatArchive(session)
```

webdesserts commented 6 years ago

@pfrazee I look forward to the spec updates. I think for now I want to step back and explore Ingest in some personal projects. I'd like to wrap my head around what all it can do a bit better.

In my mind, the reason we are in silos right now is not just technical; it's also because humans just tend to disagree on things. Even the same human might disagree on the best solution given enough time. I don't want to beat a dead horse, but I really truly believe that the only solution that will work is one where someone can "publish" a schema that people can "subscribe" to and say

This is what I believe to be the ideal structure of a human's (blog | todo list | entire medical history) at this point in time.

I understand that that's the problem of the century, but I think siloed data for a little bit won't hurt anyone considering:

- That's already a problem in the current web
- Unlike the current web, users would actually have access to all of these silos (possibly all in one location!)

I could totally see a temporary UI where we list out all the applications that you've allowed access to a section of your user dat and say "This app is using this much storage". It would be a pretty good transition from current operating systems. Then later we can drop the game-changing stuff and... BOOM! Rock everyone's world.

Quick question: With the current proposal, how would something like Rotonde query the userdats of other users you follow?

pfrazee commented 6 years ago

I'm interested to see what an explicit schema publish/subscription system can look like. I'm going to spend some time on Ingest today, and I'll see what it would take to build a declarative schema format. (Currently Ingest's schema definitions use functions.)
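
For a sense of what "declarative" could mean here, the sketch below describes a schema as plain data rather than functions, so it can be serialized, published, and compared. This is not Ingest's format. The schema layout, `canonicalize`, and `validate` are all hypothetical names invented for illustration.

```javascript
// Hypothetical declarative schema: plain data, no functions.
// The field names and layout are illustrative only.
const postSchema = {
  name: 'post',
  filePattern: '/posts/*.json',
  fields: {
    text: { type: 'string', required: true },
    createdAt: { type: 'number', required: true },
    threadRoot: { type: 'string', required: false }
  },
  index: ['createdAt']
}

// Serialize with sorted object keys so equal definitions yield equal strings.
function canonicalize (value) {
  if (Array.isArray(value)) return '[' + value.map(canonicalize).join(',') + ']'
  if (value && typeof value === 'object') {
    return '{' + Object.keys(value).sort()
      .map(key => JSON.stringify(key) + ':' + canonicalize(value[key]))
      .join(',') + '}'
  }
  return JSON.stringify(value)
}

// Check a record against the schema's field declarations.
function validate (schema, record) {
  const errors = []
  for (const [field, rule] of Object.entries(schema.fields)) {
    if (record[field] === undefined) {
      if (rule.required) errors.push(`missing required field "${field}"`)
    } else if (typeof record[field] !== rule.type) {
      errors.push(`field "${field}" should be a ${rule.type}`)
    }
  }
  return errors
}
```

Because the definition is data, two applications that subscribe to the same published schema produce the same canonical string. That gives a shared indexer a way to tell that they want the same index, which is hard to do when definitions are functions.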

We basically have two objectives that we have to balance against a realistic timeline. 1) open and generic, 2) approachable and useful in the near term. If we had infinite time and money, we could completely ignore 2, but we're actually pretty resource constrained. If we can't get 10x our current growth by spring, the project will be in trouble. I'm concerned a userland WebDB won't give us the growth we need.

With that in mind, I'm going to explore an iteration of the WebDB spec today which is builtin, but which uses schemas which can be specified by applications. That way we can retain the benefits of WebDB as is, but remove the baked-in opinions. If we can come up with something that's realistic to deliver in a month, then we should go for it.

Quick question: With the current proposal, how would something like Rotonde query the userdats of other users you follow?

Depends on what you're trying to query. Here are some examples from 0003:

```
await webdb.profiles.get(bob)
await webdb.profiles.listFollowers(bob)
await webdb.timeline.listPosts({author: bob})
await webdb.votes.listBy(bob)
```

pfrazee commented 6 years ago

For reference, the WebDB spec is based on our current usage of Ingest internally in 0.8. Here's what its definition looks like: https://github.com/beakerbrowser/beaker-profiles-api/blob/2f08e5519803dd00d6d22b91d04828502fc0cd40/index.js.

pfrazee commented 6 years ago

With that in mind, I'm going to explore an iteration of the WebDB spec today which is builtin, but which uses schemas which can be specified by applications.

This is one of two options, by the way. As I mentioned in my wishlist, we want Beaker to eventually support long-running service processes. So, option 2 is, rather than trying to solve schema definitions, we do a little work on the services layer and make it clear that WebDB is a userland service.

pfrazee commented 6 years ago

Okay, I think we're just going to stick with a userland webdb module for now.

Here's my first pass at its interface: https://gist.github.com/pfrazee/8d99837aa3867bc2f4fe00a280f7a5c7

Next I need to think through how user-dat sessions are gotten and how they interact with this approach to webdb.

pfrazee commented 6 years ago

The webdb userland module v1 is ready to go: https://github.com/beakerbrowser/webdb. Going to test it out a bit more, then publish.

pfrazee commented 5 years ago

Closing, we moved away from builtin webdb

beakerbrowser / specs

What would the Shared Data Layer look like without WebDB? #4
