stellio-hub / stellio-context-broker

Stellio is an NGSI-LD compatible context broker
https://stellio.readthedocs.io
Apache License 2.0
25 stars 11 forks source link

Request for Guidance on Efficient Data Visualization with Grafana and Superset using TimescaleDB #1097

Closed michael-gollan closed 3 months ago

michael-gollan commented 5 months ago

Hello Stellio Team,

We are currently evaluating the integrating of the Stellio Context Broker into our projects and are looking to efficiently visualize historical data using Grafana and Superset. The data is stored in TimescaleDB, and we are encountering challenges with the efficiency of the table structure for visualizing large datasets.

The current structure of tables used by Stellio seems to complicate direct use with Grafana and Superset, as these tools require optimized schemas for quick querying and visualization. We have considered creating views in our database as a workaround, but we do not see this as an ideal solution due to potential performance degradation and added complexity in management.

Request: Are there any best practices or specific configurations recommended for visualizing Stellio data efficiently with Grafana and Superset?

If views are the recommended approach, are there best practices or guidelines to minimize performance impacts? We are open to any suggestions or recommendations and would appreciate any support or guidance you can provide to better meet our data visualization needs.

Thank you for your support and the excellent work on the Stellio Context Broker.

Best regards, Michael Gollan HYPERTEGRITY AG

bobeal commented 4 months ago

Hi Michael,

Sorry for late reply.

Actually, we are using a lot Grafana and Superset in our typical deployments.

When using Grafana, we get data directly from the DB (as Grafana is a real-time oriented tool, this is by far the most efficient way to work with). And we are doing using such query:

SELECT
  time AS "time",
  measured_value AS "temperature",
  temporal_entity_attribute.entity_id as entity_id
FROM attribute_instance
LEFT JOIN temporal_entity_attribute ON temporal_entity_attribute.id = attribute_instance.temporal_entity_attribute
WHERE
  $__timeFilter(time) AND
  temporal_entity_attribute.attribute_name = 'https://vocab.egm.io/temperature' AND
  temporal_entity_attribute.entity_id in ($Room) -- where Room is a dashboard variable
ORDER BY 1

For Superset, this is getting a bit more complex since we typically want to apply a denormalization process on the data. To do this, we use Apache NiFi (https://nifi.apache.org): using the NGSI-LD Subscription API, a notification is sent to a flow in NiFi when an entity of interest is updated, then the flow retrieves all the entities that the entity is in relation with (its relationships) and export the result in a "big" table which is easier to work with a BI oriented tool like Superset.

The specific NiFi processor we are using to do this is in https://github.com/easy-global-market/nifi-ngsild-postgresql. It is unfortunately missing some documentation right now but it can be an opportunity to at least add some instructions on how to set it up and use it.

Let me know.

Benoit.