SolidLabResearch / Challenges


Streaming Aggregators to realise data summarisation on data streams stored in a Solid environment #84

Open pbonte opened 1 year ago

pbonte commented 1 year ago

Pitch

Data streams are becoming omnipresent. However, storing and analysing real-time data streams in a decentralised fashion using Solid is still hard to achieve. This is mainly due to the high frequency of changes in the answers to the queries issued on these streams and the temporal validity of those answers. A first prototype of streaming aggregators is necessary to prepare the answers of a continuous query over streaming data for a client and keep the query results up to date. This eliminates the need for the client to process the whole stream, while the aggregator allows the client to retrieve the results instantaneously. In a patient monitoring system, data streams produced by personal vitality sensors and activity trackers are semantically annotated and stored in data pods. Healthcare providers are interested in summaries of the activity of a single patient or summaries across multiple patients. Streaming aggregators are required to realise improved data summarisation and instantaneous results, as the data to be analysed in a pull-based fashion is extremely large due to the continuous nature of the data streams. The DAHCC dataset will be used as the data stored for each patient in a Solid pod to realise the aggregators.

Desired Solution

A first proof-of-concept streaming aggregator that runs as a service with which a client application can interact. The client application can specify the query that needs to be continuously evaluated on one or multiple data streams stored in Solid pods. The solution is required to,

Use Case

The dataset has sensor values from multiple patients. To monitor a patient's location, we use the sensors that detect the presence of a person in the house. In the DAHCC dataset, person detection sensors are deployed in the 3 halls, the kitchen and the bedroom. We will aggregate each patient's location in a particular window, as well as the location of all the patients. This allows us to compute a summary of the activity of each patient, which is a useful insight for healthcare providers.
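The windowed aggregation described in this use case can be sketched roughly as follows. This is only an illustration under stated assumptions, not the aggregator's actual implementation: the `Presence` record, the patient identifiers, the room names and the 60-second tumbling window are all hypothetical, and the real data would be semantically annotated DAHCC observations read from Solid pods rather than in-memory tuples.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Presence(NamedTuple):
    """One detection from a person detection sensor (hypothetical shape)."""
    timestamp: int  # seconds since the start of the stream
    patient: str    # hypothetical patient identifier
    room: str       # room in which the sensor fired


def aggregate_locations(
    events: Iterable[Presence], window_size: int
) -> Tuple[Dict[Tuple[int, str], Counter], Dict[int, Counter]]:
    """Count detections per room in tumbling windows of `window_size` seconds.

    Returns (per_patient, overall): per_patient is keyed by
    (window_start, patient), overall is keyed by window_start.
    """
    per_patient = defaultdict(Counter)
    overall = defaultdict(Counter)
    for event in events:
        # Align the event to the start of the window it falls into.
        window_start = (event.timestamp // window_size) * window_size
        per_patient[(window_start, event.patient)][event.room] += 1
        overall[window_start][event.room] += 1
    return dict(per_patient), dict(overall)


# Made-up sample events, for illustration only.
events = [
    Presence(5, "patient1", "kitchen"),
    Presence(20, "patient1", "kitchen"),
    Presence(45, "patient1", "bedroom"),
    Presence(50, "patient2", "kitchen"),
    Presence(70, "patient1", "bedroom"),
]
per_patient, overall = aggregate_locations(events, window_size=60)
print(per_patient[(0, "patient1")])  # summary for one patient, first window
print(overall[0])                    # summary across all patients, first window
```

The per-patient counters correspond to the single-patient summary and the overall counters to the summary across patients. A streaming aggregator would maintain these counters incrementally as events arrive instead of recomputing them from a list.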

Acceptance Criteria

A demo resulting from the solution should be able to,

Assumptions

Compared to https://github.com/SolidLabResearch/Challenges/issues/24, we focus on the streaming and windowing aspects of data aggregation.

Scenarios

This is part of a larger scenario

s-minoo commented 1 year ago

A few papers that I came across might be relevant to sliding-window aggregation: 1) Cutty (2016) 2) Scotty (2018) 3) General stream slicing

The author, Jonas Traub, developed these aggregate stream slicing approaches in that order.
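For context, the core idea behind stream slicing can be sketched as below. This is a simplified illustration, not the implementation from those papers: it assumes in-order events, an average as the aggregate, and a window size that is a multiple of the slide. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sliced_sliding_average(
    events: Iterable[Tuple[int, float]], size: int, slide: int
) -> List[Tuple[int, float]]:
    """Average per sliding window [start, start + size), computed from slices.

    Assumes `size` is a multiple of `slide`, so every window is a union of
    whole slices of length `slide`.
    """
    if size % slide != 0:
        raise ValueError("size must be a multiple of slide")

    # Pass 1: keep one partial aggregate [sum, count] per non-overlapping slice.
    slices: Dict[int, List[float]] = defaultdict(lambda: [0.0, 0])
    for timestamp, value in events:
        partial = slices[timestamp // slide]
        partial[0] += value
        partial[1] += 1
    if not slices:
        return []

    # Pass 2: build each window from slice aggregates, not from raw events.
    slices_per_window = size // slide
    first, last = min(slices), max(slices)
    results = []
    for start in range(first - slices_per_window + 1, last + 1):
        total, count = 0.0, 0
        for index in range(start, start + slices_per_window):
            if index in slices:
                total += slices[index][0]
                count += slices[index][1]
        if count:
            # Windows are aligned to multiples of `slide`, so the earliest
            # window that covers the first event may start before time 0.
            results.append((start * slide, total / count))
    return results


# Made-up (timestamp, value) pairs, for illustration only.
readings = [(1, 10.0), (4, 20.0), (6, 30.0), (11, 40.0)]
for window_start, average in sliced_sliding_average(readings, size=10, slide=5):
    print(window_start, average)
```

Each event is folded into exactly one slice, and overlapping windows share those slice aggregates, which avoids re-aggregating the same events once per window.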

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

Aggregation of the data streams generated from a single LDES in an LDP Solid pod works. Working towards aggregation across multiple different pods.

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

Currently defining an ontology and using websockets to publish the results from the aggregation.

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

A repository for the demo of the aggregator is available at https://github.com/argahsuknesib/ssa-demo/

pheyvaer commented 1 year ago

Not all acceptance criteria are met at the moment.

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

Testing of the aggregator system has not been done yet. The next step is testing / benchmarking. The challenge will be closed with the results from those benchmarks.

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

Currently preparing a test plan / test environment for testing the aggregator.

github-actions[bot] commented 1 year ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!

argahsuknesib commented 1 year ago

Preparing the test setup and writing scripts for testing. The evaluation repository (which will be updated later with the finished results) is available here: https://github.com/argahsuknesib/solid-stream-aggregator-evaluation

github-actions[bot] commented 12 months ago

Please provide a status update about this challenge. Every ongoing challenge needs at least one status update every 2 weeks. Thanks!