earthstar-project / earthstar

Storage for private, distributed, offline-first applications.
https://earthstar-project.org
GNU Lesser General Public License v3.0

Improvements to querying documents for sync filters #46

Open sgwilym opened 4 years ago

sgwilym commented 4 years ago

What's the problem you want solved?

earthstar-graphql has rough support for sync filters, and this feature has been a little hairy to implement.

As of writing, earthstar-graphql's sync filters are shaped like this:

{
  pathPrefixes: string[],
  versionsByAuthors: []
}

The idea is that the peer/pub will return documents that match ANY of these rules.

However, querying documents works like this:

workspace.documents({
  pathPrefix: "/something",
  versionsByAuthor: "@test.1234"
})

And the documents returned must match ALL of the properties in the query.

What this means is that implementing sync filters is a little bit hairy. Here's earthstar-graphql's implementation: https://github.com/earthstar-project/earthstar-graphql/blob/master/src/util.ts#L253

It calls the documents method once for each member of each property in the sync filters, and then merges all the resulting lists. This approach would probably get more unwieldy as more properties are supported.
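For readers who don't want to open the linked file, here is a rough sketch of the fan-out approach described above. It is not the code from earthstar-graphql: the `Doc`, `Query` and `SyncFilters` shapes are simplified assumptions, `versionsByAuthors` is assumed to be a list of author address strings, author matching is reduced to a plain equality check, and `documents` is a stand-in for the real workspace method.

```typescript
// Hypothetical, simplified shapes for illustration; not Earthstar's actual types.
interface Doc { path: string; author: string }
interface Query { pathPrefix?: string; versionsByAuthor?: string }
interface SyncFilters { pathPrefixes?: string[]; versionsByAuthors?: string[] }

// Stand-in for workspace.documents(): a document must match ALL properties of the query.
function documents(docs: Doc[], q: Query = {}): Doc[] {
  return docs.filter(
    (d) =>
      (q.pathPrefix === undefined || d.path.startsWith(q.pathPrefix)) &&
      (q.versionsByAuthor === undefined || d.author === q.versionsByAuthor)
  );
}

// One documents() call per member of each filter property, merged and de-duplicated,
// so the result contains documents matching ANY of the rules.
function documentsForSyncFilters(docs: Doc[], filters: SyncFilters): Doc[] {
  const merged = new Map<string, Doc>();
  const add = (found: Doc[]): void => {
    for (const d of found) merged.set(`${d.path}\n${d.author}`, d);
  };
  for (const pathPrefix of filters.pathPrefixes ?? []) {
    add(documents(docs, { pathPrefix }));
  }
  for (const versionsByAuthor of filters.versionsByAuthors ?? []) {
    add(documents(docs, { versionsByAuthor }));
  }
  return Array.from(merged.values());
}
```

Every new filter property needs another loop plus the same de-duplication step, which is where the unwieldiness comes from.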

Is there a solution you'd like to recommend?

Could there be ways to query a workspace's documents a bit more like how sync filters operate, i.e. using OR logic, and supporting lists for each property? A new method on IStorage, or a (breaking) change to documents?

cinnamon-bun commented 4 years ago

Yeah, this is awkward. I think the IStorage query functions should accept an array of queries which would work in the same way as sync queries:

workspace.documents([
    {pathPrefix: "/about/"},
    {pathPrefix: "/wiki/"},
])

It would also still accept a single query object, or no query at all:

documents(q?: Query | Query[]) => Document[]

When matching against one query object, a document has to match EVERY property in the query. In other words, every property in a query object narrows down the search.

Overall, it would return documents that match ANY of the queries in the array. In other words, each additional query in the array broadens the search.

Would you be open to changing the GraphQL query to work this way instead of having arrays within one query object? I think this combination of narrowing and broadening will allow us to express complex queries.
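To make the narrowing and broadening concrete, here's a minimal sketch of how that matching could work. It's only an illustration under assumptions, not the IStorage implementation: `Doc` and `Query` are simplified hypothetical shapes, the document list is passed in as an argument, and treating an empty array as "match nothing" is a choice made for this sketch.

```typescript
// Hypothetical, simplified shapes for illustration; not Earthstar's actual types.
interface Doc { path: string; author: string }
interface Query { pathPrefix?: string; versionsByAuthor?: string }

// AND: the document must satisfy every property present in a single query object.
function matchesQuery(doc: Doc, q: Query): boolean {
  if (q.pathPrefix !== undefined && !doc.path.startsWith(q.pathPrefix)) return false;
  if (q.versionsByAuthor !== undefined && doc.author !== q.versionsByAuthor) return false;
  return true;
}

// OR: the document is returned if it matches ANY query in the array.
// A single query object, or no query at all, is still accepted.
function documents(docs: Doc[], q?: Query | Query[]): Doc[] {
  const queries: Query[] = q === undefined ? [{}] : Array.isArray(q) ? q : [q];
  return docs.filter((doc) => queries.some((query) => matchesQuery(doc, query)));
}
```

Filtering the document list once, rather than running each query separately and merging, keeps the original order and means a document matching several queries is only returned once.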

cinnamon-bun commented 4 years ago

See also #47 (More comprehensive query options) for ideas about revamping query objects

sgwilym commented 4 years ago

@cinnamon-bun I like this, and I'm definitely open to changing how filters work in earthstar-graphql. Sure, it's a little more complex, but I'm not too worried, as GraphQL will be able to help people understand how to make valid filters.

cinnamon-bun commented 4 years ago

I've gotten stuck: what to do with the limit field? It makes sense to limit each query inside its own object, but what if you want an overall limit?

Maybe you just can't do that.
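For the sake of discussion, here's one way the two kinds of limit could coexist. Nothing here is settled: the `overallLimit` parameter is purely hypothetical, the `Doc` and `Query` shapes are simplified, and the document list is passed in as an argument. Each query's own `limit` caps what that query contributes, and the optional overall limit caps the combined result.

```typescript
// Hypothetical, simplified shapes for illustration; not Earthstar's actual types.
interface Doc { path: string; author: string }
interface Query { pathPrefix?: string; limit?: number }

// overallLimit is an assumed extra parameter, not part of any existing API.
function documents(docs: Doc[], queries: Query[], overallLimit?: number): Doc[] {
  const picked = new Set<Doc>();
  for (const q of queries) {
    const matches = docs.filter(
      (d) => q.pathPrefix === undefined || d.path.startsWith(q.pathPrefix)
    );
    // Per-query limit: slice(0, undefined) keeps every match.
    for (const d of matches.slice(0, q.limit)) picked.add(d);
  }
  // Rebuild the result in the original document order, then apply the overall cap.
  return docs.filter((d) => picked.has(d)).slice(0, overallLimit);
}
```

The catch is that the overall limit only makes sense if the combined result has a well-defined order, which this sketch sidesteps by reusing the order of the input list.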