linkeddata / rdflib.js

Linked Data API for JavaScript
http://linkeddata.github.io/rdflib.js/doc/

Natively Support In-Browser RDF #538

Open jeff-zucker opened 2 years ago

jeff-zucker commented 2 years ago

rdflib cannot currently be used directly on RDF data stored in a browser's local storage, for three reasons: 1) the default fetches all barf on any URL that isn't http(s) or file; 2) rdflib needs to see a content-type header to know what to do with the documents it fetches; 3) rdflib needs a way to know whether and how a document is editable before it can write to it.

Simple usage can get around some of these problems by using a 'MachineEditableDocument' triple, or by using getItem + parse instead of load, and serialize + setItem instead of putBack. @NoeldeMartin's apps use this approach (get + parse / serialize + put), though not with rdflib.

But even with these work-arounds, update cannot work: it calls load and putBack with no possibility of replacing them. Similarly, the forms system cannot work: it calls update internally with no possibility of replacing it.

I have created a simple fetch that provides what rdflib needs to see in order to work directly on local storage (it sends a full Response object with wac-allow and content-type headers). This means that rdflib methods including load, putBack, update, and webOperation, as well as the forms system, can be used against documents in local storage.
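To make the idea concrete, here is a minimal sketch of a fetch of that kind. It is not the implementation described above: the names makeStorageFetch, guessContentType and memoryStorage are hypothetical, the status codes and the wac-allow value are assumptions, and only GET and PUT are handled. It relies on the standard Response class (available in browsers and in Node 18+).

```javascript
// Hypothetical sketch, not rdflib API: a fetch-like function over a
// Storage-like object (anything with getItem/setItem).
function guessContentType(key) {
  if (key.endsWith('.ttl')) return 'text/turtle';
  if (key.endsWith('.jsonld')) return 'application/ld+json';
  return 'text/plain';
}

function makeStorageFetch(storage) {
  return async function storageFetch(uri, options = {}) {
    const method = (options.method || 'GET').toUpperCase();
    const key = uri.replace(/^ls:\/\//, ''); // "ls://foo.ttl" -> "foo.ttl"
    // A content type so the caller can pick a parser, and wac-allow so the
    // document can be recognised as editable (assumed header value).
    const headers = {
      'content-type': guessContentType(key),
      'wac-allow': 'user="read write append control"',
    };
    if (method === 'GET') {
      const body = storage.getItem(key);
      if (body === null) {
        return new Response('Not found', { status: 404, headers: { 'content-type': 'text/plain' } });
      }
      return new Response(body, { status: 200, headers });
    }
    if (method === 'PUT') {
      const existed = storage.getItem(key) !== null;
      storage.setItem(key, String(options.body ?? ''));
      // 201 for a new document, 204 for an overwrite; both have no body.
      return new Response(null, { status: existed ? 204 : 201, headers });
    }
    return new Response('Method not allowed', { status: 405, headers: { 'content-type': 'text/plain' } });
  };
}

// In a browser, pass window.localStorage. This Map-backed stand-in has the
// same getItem/setItem shape so the sketch can also run outside a browser.
function memoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}
```

Because the function returns an ordinary Response, a caller can read the status, the headers and the body text exactly as it would for an http fetch.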

A user or app would need to allow this storage by setting inBrowserStorage to true when creating a fetcher. Once they have done that, their http and file fetches will work as before, but fetches of the form "ls://path" will read or write path in the browser's local storage.

const fetcher = $rdf.fetcher(kb, {inBrowserStorage: true});
await fetcher.load('https://example.com/foo.ttl'); // loads remote file
await fetcher.load('ls://foo.ttl');                // loads in-browser file

josephguillaume commented 2 years ago

I like it! It's an elegant solution for rdflib to work with documents in local storage without having to use solid-rest and browserfs.

In terms of use cases, it would be nice if this could play well with a local-first strategy. With the proposed solution, an app would need to maintain two different sets of URIs. Presumably it could use a convention to replace ls by http/https to remember where it wanted to save the triples.

An alternative would be to maintain two separate stores, with one exclusively using this localStorage fetch. The advantage would be that the original why URIs could be kept.

Rather than committing to the use of ls:// now, perhaps we can first work out what the next step would look like.

Using keep/revert conflict resolution (I'm not an expert), updater.updateLocalFirst (or some other name) would need to know about the ls and http storages. At initial http load, the ls storage would need to be updated, i.e. making the data available offline. When an update is made, ls is updated immediately. If http is offline, then the two are now out of sync. When we try to sync, if the remote etag is out of date then the app needs to decide whether to do an explicit revert of the remote version, do some custom merge, or discard the local changes.

https://github.com/remotestorage/remotestorage.js/blob/0e6ef757e6fd2d5c067207cb07b7d62e820a58ec/doc/contributing/internals/cache-data-format.rst#keeprevert-conflict-resolution
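The decision described above can be sketched as a small pure function. This is only an illustration under assumptions: the names syncAction and resolveConflict, the bookkeeping fields, and the strategy labels are all hypothetical and come neither from rdflib nor from remotestorage.js.

```javascript
// Hypothetical sync bookkeeping for one document:
//   baseEtag   : remote etag seen at the last successful sync
//   remoteEtag : etag the remote reports now
//   localDirty : true if the local copy changed since the last sync
function syncAction({ baseEtag, remoteEtag, localDirty }) {
  const remoteChanged = remoteEtag !== baseEtag;
  if (!localDirty && !remoteChanged) return 'in-sync';
  if (localDirty && !remoteChanged) return 'push-local';
  if (!localDirty && remoteChanged) return 'pull-remote';
  return 'conflict'; // both sides changed: the app has to decide
}

// The three options an app has once a conflict is detected.
function resolveConflict(strategy, { local, remote }, merge) {
  switch (strategy) {
    case 'overwrite-remote': return local;      // explicitly replace the remote version
    case 'discard-local': return remote;        // throw away the local changes
    case 'merge': return merge(local, remote);  // app-supplied custom merge
    default: throw new Error(`unknown strategy: ${strategy}`);
  }
}
```

Keeping the decision separate from the fetches means the same logic would apply whether the local copy lives in local storage or in indexeddb.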

With the single store solution, we'd need to ensure that every http triple (meeting some criteria) has an ls equivalent. With a two store solution we'd be checking that the contents of the two stores are identical. Not sure which will be easier, but it seems like it might inform the API choice here on how ls should be accessed?

We might also run into limitations of local storage in this use case, in which case we probably want to make sure the API choice has a clean equivalent with indexeddb too.

jeff-zucker commented 2 years ago

This is rather elaborate, but I believe handles the local-first use case as well as supporting any kind of alternate fetch now or in the future using any URL scheme.

By default, rdflib will handle http(s): fetches with cross-fetch and [Edit: if we want this to be on by default] will handle browser: fetches with its new built-in rdflib.fetcher.inBrowserFetch. Users/apps can override those fetches and support other fetches by passing a flag on fetcher creation that would look like this:

  const fetcher = $rdf.fetcher(kb, {schemeHandlers: {
    http    : solidClientAuthn.fetch,  // includes both http & https
    file    : solidRestFile.fetch,
    browser : solidRestBrowser.fetch,
    all     : yourLocalFirstOrOtherFetchHere
  }});

If the scheme "all" is not defined, the scheme of the URL determines which fetch is used. If "all" is defined, all schemes are handled by the named all-fetch. That method can make use of the other kinds of fetches by calling fetcher.schemeFetch(scheme,uri,options). For example, this would look for the named URL in the browser local storage even though it is in the http: scheme.

  fetcher.schemeFetch('browser://localStorage','http://example.com/foo.ttl');
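The selection rule described above (use "all" if present, otherwise pick by URL scheme, with "http" covering https) could be sketched roughly as follows. pickHandler is a hypothetical helper, not part of rdflib or of the proposal, and it covers only the selection step; it does not model schemeFetch, where the caller names the fetch explicitly.

```javascript
// Hypothetical helper: choose a fetch function from a schemeHandlers map.
function pickHandler(schemeHandlers, uri) {
  // An "all" handler takes over every scheme.
  if (Object.hasOwn(schemeHandlers, 'all')) return schemeHandlers.all;
  // "https:" -> "https"; new URL() throws on strings that are not absolute URLs.
  let scheme = new URL(uri).protocol.replace(/:$/, '');
  if (scheme === 'https') scheme = 'http'; // one entry covers http and https
  if (!Object.hasOwn(schemeHandlers, scheme)) {
    throw new Error(`No fetch registered for scheme "${scheme}"`);
  }
  return schemeHandlers[scheme];
}
```

For example, with handlers registered for http and browser only, an https URL resolves to the http handler, and a file URL throws rather than silently falling back to some default.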

So a local-first app could address the local with schemeFetch('browser',...) and the remote with schemeFetch('http',...) but use the same http: URL for both. Something like this:

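  // schemeFetch is the proposed fetcher method described above
  const allFetch = async (uri, options) => {
     const localContent = await fetcher.schemeFetch('browser://indexedDb/', uri, options);
     const remoteContent = await fetcher.schemeFetch('http:', uri, options);
     // do sync stuff
  };

jeff-zucker commented 2 years ago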

I implemented something like the above in Solid Rest Browser.

NoelDeMartin commented 2 years ago

> I implemented something like the above in Solid Rest Browser.

Looking at that example, I have some doubts. It handles the basic use-case of storing data locally and remotely, but:

I understand all of these have to be handled by the application, right? In that case, I'm not sure if it's correct to say that the library is "local-first", rather that it allows you to write data locally. But in order to have a true local first app, the app developer still needs to do some work.

josephguillaume commented 2 years ago

I would say it's a work in progress. This is the foundation of what we'll need for functionality that is at least as user friendly as remotestorage.js. The intention is not for the application to have to do all of this itself.

jeff-zucker commented 2 years ago

The purpose of SolidRestBrowser is to provide a Solid interface to in-browser storage. It can be used to create a local-first system, but it is not in itself a local-first system. To write local-first you need a fetch that can write both locally and remotely, hopefully using the same syntax, and on top of those fetches you need syncing logic: changes, conflicts, offline states, etc. SolidRestBrowser supplies the fetches and it is up to an app or other library to add syncing logic on top of that. I don't believe there is anywhere where I claim that SolidRestBrowser is a local-first system on its own.

I have added container support (returns a simple Turtle representation of folders) as well as intermediate folder creation (put(/foo/bar/baz.txt) creates /foo/ and /foo/bar/ if they don't exist). I am about to add full N3 PATCH support. There are etags. The major things missing are POST and some way to handle web-sockets. Those types of things belong in SolidRestBrowser; specific local-first logic does not.
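As an illustration of the intermediate folder creation mentioned above, the containers that a PUT would need to ensure can be derived from the path alone. This is a sketch, not SolidRestBrowser's code: parentContainers is a hypothetical name, and the root container "/" is assumed to exist already.

```javascript
// Hypothetical helper: list the containers that must exist before writing
// `path`, outermost first. The root "/" is assumed to exist already.
function parentContainers(path) {
  const segments = path.split('/').filter(Boolean);
  segments.pop(); // drop the last segment (the resource being written)
  const containers = [];
  let current = '/';
  for (const segment of segments) {
    current += segment + '/';
    containers.push(current);
  }
  return containers;
}

// parentContainers('/foo/bar/baz.txt') → ['/foo/', '/foo/bar/']
```

A PUT handler could walk this list in order and create each container that is missing before storing the document itself.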

jeff-zucker commented 2 years ago

I could imagine a library that uses SolidRestBrowser as part of a local-first system. I'm glad to help with that effort and to make any needed changes to SolidRestBrowser but I am not going to write that library. I think keeping the fetches/storage itself separate from the syncing logic is good practice so having two libraries makes sense to me.

jeff-zucker commented 2 years ago

However, if either of you thinks the local-first logic bits belong in SolidRestBrowser, I will gladly accept PRs. :-)

NoelDeMartin commented 2 years ago

> SolidRestBrowser supplies the fetches and it is up to an app or other library to add syncing logic on top of that. I don't believe there is anywhere where I claim that SolidRestBrowser is a local-first system on its own.

Ok, that's fine then. I think something simple can work, and apps can have their own custom logic on top. I was confused because the title of the example in the README is "A Local-First Example"; maybe it should be called something else to avoid confusing it with a full-fledged local-first solution.

jeff-zucker commented 2 years ago

I just changed the title of that example to "Dual Fetches - a basis for Local-First" and added some explanation about the relationship between the example and a full local-first system.