lilactown / pyramid

A library for storing and querying graph data in Clojure
Eclipse Public License 2.0

Plans for supporting wild card for pulls? #6

Open phronmophobic opened 3 years ago

phronmophobic commented 3 years ago

This library looks neat! I was wondering if there are future plans for supporting wild cards for joins similar to https://docs.datomic.com/on-prem/query/pull.html#wildcard-specifications.

I was hoping to be able to do something like the following and grab the denormalized view.

(a/pull my-db '[{[:component/id 0] [*]}])

lilactown commented 3 years ago

I want to understand how pathom would treat such a query; it looks like it supports wildcards, even though it's not documented. My goal for the pull API is to allow one to take the result of a pathom query, put it in an autonormal db, and get the same result for the same query.

My initial guess is that this shouldn't recursively resolve the entire graph, which means that it wouldn't be any different than:

(a/pull my-db [[:component/id 0]])

I've in the past contemplated (and half implemented) a lazy entity function which would recursively resolve attributes as they are accessed: https://github.com/lilactown/autonormal/commit/d89686fba88b90bfe7c88bf5683b2fa48f448763

phronmophobic commented 3 years ago

Makes sense. I guess I should include my use case which may or may not be a good fit for autonormal.

I'm working on a desktop app and I was trying autonormal as the in memory data store for the UI. I was going to put all the data in autonormal and wanted to query a root Entity that would pull the denormalized view of all the UI data. I guess that might not even make sense if the data has cycles. Maybe a lazy entity object is what I'm looking for.

lilactown commented 3 years ago

I think that should be an excellent use case for autonormal, just not the only one when considering changes.

Yes, the biggest risk is hitting a cycle and freezing the UI. My view is that cycles exist and should be allowed to be stored in the db, and the way we deal with this is by using EQL queries, which represent a tree view on the graph of data.

The lazy entity is an OK-ish solution but when I started playing with it, there were a lot of edge cases to cover when implementing it for ClojureScript and I didn't really have a use for it at the time. The DataScript implementation datascript.impl.entity is a pretty good example of the protocols and other things required AFAICT.

lilactown commented 3 years ago

Outside of supporting your specific use case, it looks like pathom does use wildcards in a specific way which autonormal should also support. From @wilkerlucio in slack:

wilkerlucio: @lilactown in Pathom, wildcard means "give me everything you loaded so far" its not gonna trigger any extra resolvers, when a resolver is called, the full resolver response is always merged, but later it gets filtered out to include only keys that the user asks, the wildcard removes that filter on that level

lilactown: hmm ok. so a client passing a query like: [{[:component/id 0] [*]}] isn't going to be any different than: [[:component/id 0]] ?

wilkerlucio: correct what may be, is something like: [{[:component/id 0] [:component/some-data *]}]

So from autonormal's POV, the * in a subquery basically means, "parse the rest of the query, and merge it in with all the other shallow kvs at this depth."
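A minimal sketch of that reading of `*`, written in plain Clojure rather than taken from autonormal's implementation. The `db` layout and the names `lookup-ref?` and `pull-entity` are hypothetical and only serve the illustration:

```clojure
;; Hypothetical normalized db: entities keyed by ident, refs stored as idents.
(def db
  {:component/id {0 {:component/id 0
                     :component/title "Root"
                     :component/child [:component/id 1]}
                  1 {:component/id 1
                     :component/title "Child"}}})

(defn lookup-ref?
  "True for a two-element vector such as [:component/id 1]."
  [x]
  (and (vector? x) (= 2 (count x)) (keyword? (first x))))

(defn pull-entity
  "Resolves `query` against `entity`. A `*` merges in every shallow
  key/value of the entity without following refs; explicit joins in
  the same query still win over the shallow ref value."
  [db entity query]
  (reduce
   (fn [result part]
     (cond
       (= '* part)     (merge entity result)
       (keyword? part) (if (contains? entity part)
                         (assoc result part (get entity part))
                         result)
       (map? part)     (let [[k subquery] (first part)
                             v (get entity k)]
                         (if (lookup-ref? v)
                           (assoc result k (pull-entity db (get-in db v) subquery))
                           result))
       :else result))
   {}
   query))

(def root (get-in db [:component/id 0]))

(pull-entity db root '[*])
;; => the shallow entity; :component/child stays [:component/id 1]

(pull-entity db root '[{:component/child [:component/title]} *])
;; => {:component/id 0, :component/title "Root",
;;     :component/child {:component/title "Child"}}
```

In this sketch `[*]` alone returns the same shallow map as looking the entity up directly, which matches the point made in the Slack exchange above.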

wilkerlucio commented 3 years ago

Just made new docs available about wildcard in Pathom, you can find it at https://pathom3.wsscode.com/docs/eql/#wildcard

phronmophobic commented 3 years ago

> My view is that cycles exist and should be allowed to be stored in the db, and the way we deal with this is by using EQL queries which represent a tree view on the graph of data.

I'm still new to using libs like autonormal/datascript/pathom/etc., so I'm trying to figure out what the idiomatic solution to the following problem is:

Let's say you have a tabbed pane. The required UI data will depend on which tab is selected. You can construct an EQL query to find out which tab is selected, but you can't create the query for the selected tab until you know which tab is selected. It would probably be a terrible UI, but a tab could itself also have a tabbed pane and so forth.

Essentially, the full query would depend on data that the query would return. I don't see any way around having some way to interleave querying and constructing more queries. A lazy entity would address the incremental querying, but it's not without its own issues. I'm sure you could do something special for tabbed panes, but I'm interested in finding a general solution.

lilactown commented 3 years ago

> You can construct an EQL query to find out which tab is selected, but you can't create the query for the selected tab until you know which tab is selected.

There are a few different ways to address this. One is to restrict your queries to something that can be represented via an EQL union, for instance:

{:chat/entries
 {:message/id
  [:message/id :message/text :chat.entry/timestamp]

  :audio/id
  [:audio/id :audio/url :audio/duration :chat.entry/timestamp]

  :photo/id
  [:photo/id :photo/url :photo/width :photo/height :chat.entry/timestamp]

  :asdf/jkl [:asdf/jkl]}}

See https://github.com/lilactown/autonormal/blob/main/test/autonormal/core_test.cljc#L202

The idea here is you declare up front a taxonomy of things (in my example, chat entries; in your example, tabs) and then specific queries based on that taxonomy.
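A rough sketch of how a union like this could pick a branch, assuming (as in the example) that each entity carries exactly one of the union's id keys. `chat-entry-union`, `union-branch`, and `pull-entry` are hypothetical names for illustration, not part of autonormal:

```clojure
;; The taxonomy declared up front: one subquery per kind of chat entry.
(def chat-entry-union
  {:message/id [:message/id :message/text :chat.entry/timestamp]
   :audio/id   [:audio/id :audio/url :audio/duration :chat.entry/timestamp]
   :photo/id   [:photo/id :photo/url :photo/width :photo/height :chat.entry/timestamp]})

(defn union-branch
  "Returns the subquery whose key is present on `entity`, or nil."
  [union entity]
  (some (fn [[k subquery]]
          (when (contains? entity k) subquery))
        union))

(defn pull-entry
  "Selects only the attributes that the matching branch asks for."
  [entity]
  (select-keys entity (union-branch chat-entry-union entity)))

(pull-entry {:audio/id 7
             :audio/url "a.mp3"
             :audio/duration 12
             :chat.entry/timestamp 1000
             :internal/flag true})
;; => {:audio/id 7, :audio/url "a.mp3", :audio/duration 12,
;;     :chat.entry/timestamp 1000}
```

An entity that matches none of the declared keys yields an empty map here, which is one way to see why the taxonomy has to be complete up front.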

This can work for relatively static UIs, but becomes very tedious and brittle because you must write your queries to accommodate any possible UI graph. What your tab example illuminates is the relationship between the component graph and the data graph. If each component can dynamically return different children (even nested children or graphs of children) across time, then you need a way to also build up that query based on what components could show up in the graph. Attempting to write a single query at the top of your app leads to defensively querying everything, which isn't what you want.

An easy way of handling this is to pass the db value to every component, and allow each component to query for what it cares about.

The downside to this is that you lose some amount of reusability; e.g. it would be great to be able to use a user-profile component in the context of the currently logged-in user, as well as in the list of that user's friends. This breaks if the component uses a query like:

[{:current-user [:user/id :user/name :user/email]}]

This is because you must fill in the entire query from the top. To make this reusable, we would want to declare just the parts that the component cares about - [:user/id :user/name :user/email] - and let the parent component fill in the surrounding query which those attributes fit in based on the context.

There are, ostensibly, two ways to do this:

  1. Pass in the full query to the user-profile component for it to manipulate - e.g. add the subquery it needs - and then query the DB for the full thing
  2. Have the parent component of user-profile ask its children what subqueries they need and have the parent ask for it on behalf of them

In practice, I've only seen (2) implemented. My hunch is that this is more amenable to the sort of top-down hierarchical thinking that component frameworks (a la React) have adopted. Two examples that immediately come to my mind are Fulcro and Relay.

Both of these frameworks have a concept of a "subquery" or "fragment" that a component declares a dependency on, and then these parts are composed in a parent which actually queries the db. The child components only get a view of whatever parts of the query they declared. This prevents a component from implicitly depending on things outside of its local context, so components can be freely reused.
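As a minimal sketch of option (2) in plain Clojure data, with no framework involved: the child declares a fragment, and each parent decides where that fragment sits in the full query. All names here (`user-profile-fragment`, `current-user-query`, `friends-list-query`, `user-profile-props`) are hypothetical, and this is not how Fulcro or Relay actually spell it:

```clojure
;; The child component declares only the attributes it needs.
(def user-profile-fragment
  [:user/id :user/name :user/email])

;; One parent places the fragment under the logged-in user...
(def current-user-query
  [{:current-user user-profile-fragment}])

;; ...another places the very same fragment under that user's friends.
(def friends-list-query
  [{:current-user [:user/id {:user/friends user-profile-fragment}]}])

(defn user-profile-props
  "Narrows a user map to what the fragment declared, so the child
  cannot come to depend on anything outside its local context."
  [user]
  (select-keys user user-profile-fragment))

(user-profile-props {:user/id 1
                     :user/name "Ada"
                     :user/email "ada@example.com"
                     :user/password-hash "not for the UI"})
;; => {:user/id 1, :user/name "Ada", :user/email "ada@example.com"}
```

Because the fragment never mentions `:current-user`, the same component can be dropped into either parent without change.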

That was a lot of words. I hope it provides some insight into this problem. This is sort of the cutting edge of front-end frameworks right now, so I can't really say with any assuredness how it all works out in practice. autonormal isn't really in the business of solving these problems for you, but is meant to be used as a building block towards a solution like Fulcro/Relay.

wilkerlucio commented 3 years ago

hello @phronmophobic, one thing to consider here, is your tab content homogenous or heterogeneous? The simpler case is when it's homogenous (tab changes in content, but the "format" of the content is always the same). For this case you could have something like this:

;; considering the tabs always show some user data
(def tab-content-query
  [:user/id :user/profile-pic :user/name])

(def tabs-query
  [;; for filling the tabs titles
   {:app.tabs.list/all-users [:user/id :user/name]}
   ;; for the active tab, pull the whole data
   {:app.tabs/current-tab tab-content-query}])

Then you have to decide when to load the data: you could load everything up front, or load the tab content on focus. You also have to deal with caching (to avoid making a request when you already have the data, while still allowing a full reload when wanted).

For the heterogeneous case it's similar, but the query itself for current-tab would have to change depending on the component you are rendering.
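One way to picture the heterogeneous case is as two steps: first ask which kind of tab is active, then build the real query from a registry keyed by that kind. The sketch below is only an assumption about how that could look; `:tab/type`, the `:tab.type/...` keys, the `:settings/...` attributes, `tab-content-queries`, and `tabs-query-for` are all hypothetical names:

```clojure
;; Hypothetical registry: each kind of tab content declares its own query.
(def tab-content-queries
  {:tab.type/profile  [:user/id :user/profile-pic :user/name]
   :tab.type/settings [:settings/theme :settings/language]})

;; Step 1: a small query that only asks which kind of tab is active.
(def current-tab-type-query
  [{:app.tabs/current-tab [:tab/type]}])

;; Step 2: once the type is known, build the full query for that tab.
(defn tabs-query-for
  "Builds the tabs query for `tab-type`, falling back to asking only
  for the type when no content query is registered."
  [tab-type]
  [{:app.tabs.list/all-users [:user/id :user/name]}
   {:app.tabs/current-tab (get tab-content-queries tab-type [:tab/type])}])

(tabs-query-for :tab.type/settings)
;; => [{:app.tabs.list/all-users [:user/id :user/name]}
;;     {:app.tabs/current-tab [:settings/theme :settings/language]}]
```

Nested tabbed panes would need this step repeated once per level, which is where the approach starts to get tedious.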

Learning Fulcro is a great way to get more into these patterns. Since Fulcro requires them, the community there knows how to deal with things like this :).

phronmophobic commented 3 years ago

Wow, thanks for the great responses. It might take me some time to process everything.

> hello @phronmophobic, one thing to consider here, is your tab content homogenous or heterogeneous? The simpler case is when it's homogenous (tab changes in content, but the "format" of the content is always the same). For this case you could have something like this:

There have been several instances where I've needed something similar. The tab example is meant as a concrete instance of the heterogeneous case, where the full query can't be determined statically. Generally, I'm interested in figuring out a general strategy for queries that can't be determined statically.

> Learning Fulcro is a great way to get more into these patterns, since Fulcro requires that, the community there knows how to deal with things like this :).

I spent a little time learning Fulcro. I see https://book.fulcrologic.com/#DynamicQueries, which seems like the recommended way to do it. I feel like this is feasible for some specific use cases (e.g. one level of tabbed panes), but would be complicated for an example where you have two or more levels of queries that depend on data.