Closed fkleedorfer closed 6 years ago
I think we should really break this down into separate tasks and merge work on them into a feat_load_selectively branch.
A breakdown of steps might be as follows:
These interactions should be introduced as new action-creators that load the data and then dispatch actions of those new action-types. Some of these actions are triggered by clicks on the "more"-links, some by scrolling (e.g. in the feed).
A hook in the routing-change/stateGo action-creator (or, if that is not possible, a wrapper for it) would probably be the best place to trigger the loading for the first items visible in a given view. A third option would be an agent that looks at the route, checks whether these first items are present and, if they aren't, dispatches loading via the respective action-creator.
The spinning wheels can be created by tagging the ownNeeds/post/connection-objects in the state with an `isFetchingMore` boolean, which would also be a better way to handle the "Pending…" in create-post.
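As a rough illustration of the flag described above, here is a minimal reducer fragment that toggles `isFetchingMore` on a connection. The action types and state shape are assumptions for the sketch, not the app's actual ones:

```javascript
// Hypothetical sketch: toggle an `isFetchingMore` flag on the connection
// that is currently loading additional events. Action types are made up.
function connectionReducer(state = {}, action) {
  switch (action.type) {
    case 'connections.fetchMoreStarted':
      return {
        ...state,
        [action.connectionUri]: {
          ...state[action.connectionUri], // keep any existing fields
          isFetchingMore: true,
        },
      };
    case 'connections.fetchMoreFinished':
      return {
        ...state,
        [action.connectionUri]: {
          ...state[action.connectionUri],
          isFetchingMore: false,
        },
      };
    default:
      return state;
  }
}
```

A component can then render a spinner for exactly those objects whose flag is set, instead of a single global loading state.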
The root-level items in the list above should be separate pull requests.
The bottleneck atm: the crawling on the server:

- counterparts and events that aren't included in the deep request

Two different goals:

- a `timeof` parameter with the original page-load time, to make sure the ranking and thus the slicing is constant
- [in progress] search for solutions: `won.js` and `linkeddata-service-won.js` that utilize a client-side rdfstore in the background

Other thoughts:

- `fetching`-flags should be true in the initial state
- it probably would make sense to start with the feed, as pain-points will most likely show up there first anyway
Documentation for our linkeddata-paging API: paging is now exposed through `won.fetch`. However, several parameters don't work yet (e.g. `deep` for event-containers, `type`, `timeof`), but at least they can be more easily tested now. Example usage that fetches page 2, with pages of seven events each:
```javascript
won.fetch(eventContainerUri, {
    requesterWebId: reqWebId,
    pagingSize: 7,
    queryParams: { p: 2, deep: true },
})
.then(args => {
    const uris = args['@graph'][0]['rdfs:member'].map(e => e['@id']);
    console.log(uris);
});
```
The notes I took while discussing this issue with @fkleedorfer:
The most difficult cases: "only load 10 posts with the newest updates" and "last 10 matches over all needs".
The owner has to cache need info (e.g. what has been seen, when the last updates of which type have happened, etc.) necessary to implement "only load 10 posts with the newest updates".
Does this ultimately require the owner to cache the entire node-state for that user? How to avoid a huge startup cost, i.e. loading needs for every user? Don't load data for users that haven't logged in for a while? Only load the delta since the last poll?
Client-to-server = marking things as seen: POST a list of eventUris. Server-to-client: a bloom-filter? A list of unread uris (might be long)? A list of the latest unread plus aggregated numbers? A special message if another client-instance marks the event as seen?
When previous requests have finished and no new action has been triggered, the client could load preemptively -- e.g. load connections with new unread events (as they might grab the user's attention), or, after the list of conversations has finished loading, start fetching the first few conversations. We just need to make sure user-triggered actions always have precedence. The agent can publish multiple actions for each of these loaded data-packages.
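To make the precedence rule concrete, here is a minimal sketch of a prefetch queue that defers preemptive loads while user-triggered requests are in flight. All names are illustrative, not from the codebase:

```javascript
// Hypothetical sketch: preemptive loads only run while no user-triggered
// request is in flight, so user actions always have precedence.
class PrefetchQueue {
  constructor() {
    this.pending = [];              // queued prefetch tasks
    this.userRequestsInFlight = 0;  // user-triggered requests currently running
  }
  userRequestStarted() {
    this.userRequestsInFlight++;
  }
  userRequestFinished() {
    this.userRequestsInFlight--;
    this.drain(); // the network is idle again, resume prefetching
  }
  schedule(task) {
    this.pending.push(task);
    this.drain();
  }
  drain() {
    while (this.userRequestsInFlight === 0 && this.pending.length > 0) {
      const task = this.pending.shift();
      task(); // e.g. fetch a conversation's first few events and dispatch
    }
  }
}
```

Each task could be one of the "data-packages" mentioned above, dispatching its own actions when its data arrives.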
Most of the loading should happen through the more declarative construct queries!!
Always use `deep=true` to resolve collections.
Note that "the last 10 events" (N) isn't the same as "the last 10 events the user gets to see" (N'), the difference for example being success messages. Until the server-API reflects this issue, N = 3 * N' can be used.
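A tiny illustration of that heuristic (the factor of 3 is the one proposed above; the function name is made up):

```javascript
// Until the server can filter out events the user never gets to see
// (e.g. success messages), over-fetch by a factor of 3 so that roughly
// N' displayable events remain after filtering.
const OVERFETCH_FACTOR = 3;

function requestSizeFor(displayCount) {
  return displayCount * OVERFETCH_FACTOR; // N = 3 * N'
}
```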
`getRequiredData`)? nvm, "only load latest" is a separate concern -- the components should only deal with translating state to HTML.
`(action, state) → (cmd, state')` as opposed to `(action, state) → state'`
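To illustrate the contrast between the two signatures: a plain redux reducer returns only the new state, while an Elm-style update additionally returns a command describing a side effect (such as "fetch more events") to be executed elsewhere. This is a sketch of the idea only, not code from the repository:

```javascript
// Elm-style update: (action, state) → [state', cmd]
// The cmd is a plain description of a side effect; some runtime outside
// the update function would interpret it (e.g. trigger the actual fetch).
function elmStyleUpdate(state, action) {
  if (action.type === 'showMoreClicked') {
    return [
      { ...state, isFetchingMore: true },                // state'
      { effect: 'fetchEvents', after: state.oldestUri }, // cmd
    ];
  }
  return [state, null]; // no side effect requested
}
```

A plain reducer `(action, state) → state'` would have to smuggle the fetch into an action-creator instead, which is exactly the trade-off discussed here.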
- `construct_query` + query + paging-parameters
- `getNode` / `getNeed` + paging-parameters
- `fetch` + params
- The events received via the websocket should be pushed to the rdf-store, including `rdfs:member` entries, and they should be marked as cached.
- If dirty: only load beginning with the latest member automatically, to avoid unnecessary server-load.
Selective loading and caching:
But actually we shouldn't need this smart caching, as we get all necessary information through the web-socket. Everything is either fetched initially at page-load or, in case the user clicks "more", consists of connections/events older than the previously oldest uri, and is thus a cache-miss anyway. We could implement checks like the ones above to detect non-well-behaved code though. The only exception is connection-messages we send ourselves, which we need to invalidate and fetch after posting to get the timestamps from our owner. So the simplified variant is: "always reload if it's a mutable resource (i.e. a collection)".
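The simplified rule can be sketched in a few lines. The `isMutable` predicate is an assumption here, standing in for "is this uri a collection/container?":

```javascript
// Sketch of the simplified caching rule above: mutable resources
// (collections/containers) are always re-fetched; immutable resources
// (e.g. individual events) are fetched only on a cache miss.
function shouldFetch(uri, cache, isMutable) {
  if (isMutable(uri)) {
    return true; // "always reload if it's a mutable resource"
  }
  return !cache.has(uri); // immutable: only fetch if not cached yet
}
```

The checks mentioned above for detecting non-well-behaved code could be layered on top of this, e.g. by logging whenever a cached immutable uri is fetched again anyway.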
Note: the `won.deleteNode(uri)` somewhere around linkeddata-service-won.js:~790 might be problematic, as in its current form it deletes the entire container when fetching a new page.
It's ignored if the data loaded was only partial, i.e. paging was used. Thankfully the store doesn't add duplicate nodes, so we simply add the triples the usual way. The only possible tripping hazard is blank nodes: these are always given unique identifiers and thus always result in unique triples.
Here's an example redux-app that uses pagination (on github-API content): https://github.com/reactjs/redux/tree/master/examples/real-world
The following relay-example on paging might provide inspiration on how to embed crawlable queries in components: https://www.reindex.io/blog/redux-and-relay/#relay-4
For the decisions:
The Elm-Architecture would need too much refactoring right now, or would introduce another style and thus increase the complexity of the code-base. Implementing it as an agent would require specifying the data-dependencies twice, or traversing the currently visible component tree on every update. Also, an agent can only look at the state, in particular the routing parameters, not at actions (e.g. a person clicking on "Show more" or scrolling down). Thus the loading will continue to happen in asynchronous action-creators. But instead of having one automatic call as part of the page-load, the components will call some kind of requestData-action-creator every time (additional) data is required, e.g. when the component is initialized, when a critical routing parameter changes, or when a person requests more data. These action-creators should be kept as small as possible, though!
Uris will not be marked as cached if they were only fetched partially, i.e. using pagination – even if multiple partial fetches should cover the entirety of a collection.
Components will now not only know what data they need and where it is in the state (encoded in the select-statements they use) but also where the data is on the server and in the rdf-graph. This information will be provided to the action-creator mentioned above. Ideally it's encoded purely declaratively, so that both the crawlableQuery and the select-statement can be drawn from that info.
The order of operations is:

1. `@constructor`
2. `actionCreators.ensureLoaded(<dependencies>)`
3. `executeCrawlableQuery(<query>).then(data => { … dispatch(<dataRequestedAction>) … })`
4. `select`, which is drawn from the component's dependency declaration.

There's a spaghetti-code snippet in the branch that makes sure all events of the connection are loaded whenever the conversation is accessed (I'll refactor it to conform to the architecture specified above once all edge-cases have been found).
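The `ensureLoaded` step above could look roughly like the following thunk-style action-creator. Everything here is a sketch: `executeCrawlableQuery`, the `loadedUris` state field, and the action type are assumed names, not the actual implementation:

```javascript
// Hypothetical sketch of `ensureLoaded`: components declare their data
// dependencies; the action-creator checks the state and only triggers a
// crawlable query for the uris that are still missing.
function ensureLoaded(dependencies, executeCrawlableQuery) {
  return (dispatch, getState) => {
    const state = getState();
    const missing = dependencies.filter(dep => !state.loadedUris.has(dep.uri));
    if (missing.length === 0) {
      return Promise.resolve(); // everything already in the state
    }
    return executeCrawlableQuery(missing.map(dep => dep.query))
      .then(data => dispatch({ type: 'dataRequested', payload: data }));
  };
}
```

Keeping the creator this thin matches the "as small as possible" requirement: the heavy lifting stays in the query executor and the reducers.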
One of these problems: unless all queries to a connection pass along the paging-size, all event-uris get loaded into the rdfstore. Consequently, all queries that go on to load all referenced events will then load all events, even if a paging-size was specified for the query. This will be a tough one to avoid:

- Only resolve the event-container when loading `deep` and with paging; then the `rdfs:member`s in the store should map directly to also-loaded member-nodes. In any other case the container shouldn't be resolved anyway.
- As `won.getConnection` currently resolves the event-container, all uses of that function need to be changed so they'd also work with `won.getNode` directly; that, and the handling of the `hasEventContainer`-property in the connection-reducer.
- The connectionMessages received via the websocket should be added to the rdf-store as well (including an `rdfs:member` triple).
Pretty sure we can close this, since we implemented some of that with the skeleton screens and so on. I think this issue is obsolete now, or should be boiled down to something more specific (as its own issue). @fkleedorfer if you agree, please close this issue.
closing this, if we need to do more we will create separate issues for it
Instead of loading everything on startup, determine what needs to be loaded and load only that.