SolidLabResearch / Challenges

Aggregators to improve data access across many pods, a social media perspective #24

Closed pbonte closed 1 year ago

pbonte commented 2 years ago

Pitch

Applications that need to aggregate data across many pods can suffer from slow response times, caused by the latency of retrieving and processing data from a large number of pods. This is typically the case in a social media scenario, where users' timelines are curated based on the activities of their contacts. Computing these timelines at the moment users open their social media applications is typically not feasible due to latency constraints. Therefore, the timelines should be precomputed as a form of aggregation. The SolidBench.js benchmark will be used to simulate data pods with social media data.

Desired solution

A first-version proof-of-concept aggregator server that functions as an intermediate component in the Solid network, accepts queries from client-side applications, and directly exposes the results of those queries, i.e. the computed bindings. This allows client applications to retrieve the query results directly from the aggregator instead of evaluating expensive queries themselves. The aggregator server should compute the bindings and keep them up to date when the underlying resources change. In other words, changes in the resources must be reflected in the resulting bindings of each affected query. As a proof of concept, this can be done by re-evaluating the queries every time a resource changes. (Later optimisations could use incremental query execution techniques. Also, as this is a proof of concept, authentication does not need to be considered.)
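The re-evaluation strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the aggregator that was built for this challenge: the names `NaiveAggregator`, `Evaluate`, and `Binding` are hypothetical, the query engine is passed in as a plain function, and evaluation is synchronous to keep the sketch short (a real engine would be asynchronous).

```typescript
// Hypothetical types for illustration; none of these names come from the challenge.
type Binding = Record<string, string>;

// Evaluates a query and reports which resources were read while doing so.
type Evaluate = (query: string) => { bindings: Binding[]; resources: string[] };

interface Entry {
  bindings: Binding[];
  resources: Set<string>;
}

class NaiveAggregator {
  private readonly entries = new Map<string, Entry>();
  private readonly evaluate: Evaluate;

  constructor(evaluate: Evaluate) {
    this.evaluate = evaluate;
  }

  // Register a query: evaluate it once and cache the bindings.
  register(query: string): void {
    this.refresh(query);
  }

  // Serve the cached bindings without re-evaluating the query.
  results(query: string): Binding[] {
    const entry = this.entries.get(query);
    return entry ? entry.bindings : [];
  }

  // Called when a resource changed: fully re-evaluate every query that read it.
  // Returns the number of queries that were re-evaluated.
  onResourceChanged(resource: string): number {
    const affected: string[] = [];
    this.entries.forEach((entry, query) => {
      if (entry.resources.has(resource)) {
        affected.push(query);
      }
    });
    affected.forEach((query) => this.refresh(query));
    return affected.length;
  }

  private refresh(query: string): void {
    const { bindings, resources } = this.evaluate(query);
    this.entries.set(query, { bindings, resources: new Set(resources) });
  }
}

// Usage with an in-memory stand-in for two pods (made-up URLs and data).
const pods: Record<string, string[]> = {
  "https://alice.example/posts": ["hello"],
  "https://bob.example/posts": ["hi"],
};

// Stand-in evaluator: ignores the query text and returns one binding per post.
const readAllPosts: Evaluate = () => {
  const resources = Object.keys(pods);
  const bindings: Binding[] = [];
  resources.forEach((r) => pods[r].forEach((text) => bindings.push({ text })));
  return { bindings, resources };
};

const aggregator = new NaiveAggregator(readAllPosts);
aggregator.register("all-posts");

// A pod changes; the aggregator re-evaluates the affected query.
pods["https://bob.example/posts"].push("new post");
aggregator.onResourceChanged("https://bob.example/posts");
```

Tracking which resources each query read keeps unrelated changes from triggering work, but every affected query is still recomputed from scratch, which is the cost that incremental query execution would aim to avoid.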

The solution is required to:

Acceptance criteria

Show the speed increase for query evaluation between client-side query evaluation and the aggregator server, using the SolidBench benchmark (https://github.com/SolidBench/SolidBench.js).

A demo that showcases this solution would need to be able to:

Assumptions

As aggregation is still a novel research topic, a number of assumptions were made:

pheyvaer commented 2 years ago

In the Solid Calendar Store there is a store that allows pre-generating the representation of a store. This is done to reduce the response time when an agent requests a calendar, which can otherwise take up to 30 seconds if multiple calendars are combined. This might be a pointer or serve as inspiration for a demo.

pheyvaer commented 2 years ago

@pbonte @fongenae Did you have the chance to look into making the necessary changes?

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

maartyman commented 1 year ago

The proof-of-concept aggregator is done. A server has been made that accepts queries in the form of POST requests and is able to perform these queries using Comunica. The resources used by a query are then observed for changes using WebSockets (version 0.1 of the Solid WebSocket spec) or polling, and the query is re-evaluated if the resources change. The query results are then made available by the aggregator through a GET request (snapshot result) and WebSockets (bulk result + constant update stream).
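To make the difference between the snapshot result and the "bulk result + constant update stream" more concrete, here is a minimal sketch of how a re-evaluated result could be turned into updates for subscribers. It is an assumption-laden illustration, not the code from the demo repository: `ResultChannel`, `Update`, and `diffRows` are hypothetical names, the added/removed update format is invented for this example, and no actual HTTP or WebSocket transport is involved.

```typescript
// Hypothetical types for illustration; the actual message format is not given in this thread.
type Row = Record<string, string>;

interface Update {
  added: Row[];
  removed: Row[];
}

// Canonical key so that two rows with the same variables and values compare equal.
function keyOf(row: Row): string {
  return JSON.stringify(Object.keys(row).sort().map((k) => [k, row[k]]));
}

// Set-based difference between two results (duplicate rows are not counted).
function diffRows(previous: Row[], next: Row[]): Update {
  const previousKeys = new Set(previous.map(keyOf));
  const nextKeys = new Set(next.map(keyOf));
  return {
    added: next.filter((row) => !previousKeys.has(keyOf(row))),
    removed: previous.filter((row) => !nextKeys.has(keyOf(row))),
  };
}

class ResultChannel {
  private current: Row[] = [];
  private readonly listeners: Array<(update: Update) => void> = [];

  // Snapshot result, as a GET request would return it.
  snapshot(): Row[] {
    return this.current.slice();
  }

  // A new subscriber first receives the bulk result, then only later changes.
  subscribe(listener: (update: Update) => void): void {
    listener({ added: this.snapshot(), removed: [] });
    this.listeners.push(listener);
  }

  // Called after a query has been re-evaluated because a resource changed.
  publish(next: Row[]): void {
    const update = diffRows(this.current, next);
    this.current = next.slice();
    if (update.added.length === 0 && update.removed.length === 0) {
      return; // Nothing changed, so subscribers are not notified.
    }
    this.listeners.forEach((listener) => listener(update));
  }
}

// Usage: one subscriber, two re-evaluations (made-up data).
const channel = new ResultChannel();
const received: Update[] = [];

channel.publish([{ post: "hello" }]);
channel.subscribe((update) => { received.push(update); });
channel.publish([{ post: "hello" }, { post: "new post" }]);
channel.publish([{ post: "hello" }, { post: "new post" }]);
```

In this sketch the subscriber gets the existing row as its bulk result and then a single update for the newly added row; the last `publish` call changes nothing and therefore sends nothing.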

I'm now working out some bugs and making some demos using the SolidBench data and queries.

maartyman commented 1 year ago

I'm working on closing this challenge.

maartyman commented 1 year ago

Demo repository link: https://github.com/maartyman/solidBenchAggregatorDemo

pheyvaer commented 1 year ago

You can find the report for this challenge here.