Open TechAuditBI opened 8 months ago
I agree this would be a good addition to Superset. Other products have this out of the box: see the respective docs from Tableau and Metabase detailing what reports they offer, who can see them, etc.
I would suggest that, by default, only users with the Admin role can view these reports, unless that requires RBAC to be enabled. Maybe someone has a concrete suggestion for how to handle the sensitivity of this reporting.
Migration Plan and Compatibility
I don't think there's anything to migrate here, per se, but you did mention that there'll be a new command. I assume this would be run like the load examples command?
Rejected alternatives
It seems like this is querying the metadata table directly... which might have performance implications depending on the size of those tables. Perhaps we should consider/reject other sorts of ETL/pipeline as an alternative process, as opposed to materialized views? I.e. do these materialized views have a performance cost compared to other approaches?
Yes, the new command probably should run like the load examples one.
Data storage does require further investigation, especially performance-wise. Currently, on our installations, we use a procedure-based approach: once a day, during the least active period (around 3 am), a procedure runs an update script on a set of separate tables. This avoids performance drops during peak hours, but there might be a better approach.
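To make the nightly procedure idea concrete, here is a minimal sketch of a full refresh that rebuilds a summary table from raw logs inside one transaction. It uses an in-memory SQLite database as a stand-in for the metadata db; the table names (logs, daily_activity), the column names, and the function refresh_daily_activity are all assumptions for illustration, not the procedure we actually run. Scheduling at 3 am would be handled outside this code (cron or similar).

```python
import sqlite3


def refresh_daily_activity(conn):
    """Rebuild the summary table from the raw logs in a single transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM daily_activity")
        conn.execute(
            """
            INSERT INTO daily_activity (day, action, events, unique_users)
            SELECT date(dttm), action, COUNT(*), COUNT(DISTINCT user_id)
            FROM logs
            GROUP BY date(dttm), action
            """
        )


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (action TEXT, user_id INTEGER, dttm TEXT)")
conn.execute(
    "CREATE TABLE daily_activity "
    "(day TEXT, action TEXT, events INTEGER, unique_users INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        ("dashboard", 1, "2024-01-01 09:00:00"),
        ("dashboard", 2, "2024-01-01 10:30:00"),
        ("dashboard", 1, "2024-01-01 12:00:00"),
        ("explore", 1, "2024-01-02 08:15:00"),
    ],
)
conn.commit()

refresh_daily_activity(conn)
rows = conn.execute(
    "SELECT day, action, events, unique_users FROM daily_activity "
    "ORDER BY day, action"
).fetchall()
```

Because the refresh deletes and rebuilds the whole table, running it twice gives the same result. On large log tables an incremental refresh (only the previous day) would probably be cheaper, but that is a design choice still to be discussed.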
Also, speaking of examples: maybe it would be better to split metadata and examples into separate schemas by default? It makes me sick looking at all the mess in the resulting db. Yes, I know there is a config parameter that allows using a separate db connection for examples, but it is usually ignored... So maybe we should make this a bit more structured, even by default.
Does this need to move forward as a VOTE thread? If you have time to contribute it, I'm happy to make the vote happen.
re: "about a billion variations of this dashboard" ... looks like Stephan Claus and team at HomeToGo are about to release a new version of their dashboard, V1 article here.
Is there any documentation on what is currently logged in the metadb?
And following on from "Currently solving this task would require a bunch of work on levels from transforming raw logs": for context, it would be helpful to know which log we are referring to, at least as a starting point.
We need to figure out what metadata databases we actually support officially if we're going to build a built-in dashboard dependent on it.
@TechAuditBI any interest in continuing to move forward with this SIP? Hopefully we can get it ready for a vote soon. This would be an amazing example to have built into Superset, but it seems there's still quite a bit of detail to flesh out in order to pull this off.
[SIP-122] Proposal for creating an integrated monitoring and user activity dashboard
Motivation
Most companies using Superset probably need to gather and analyse some data about the platform's health and user activity. Currently, solving this task requires a bunch of work at every level, from transforming raw logs to creating a dashboard. "Big" BI tools provide such analytics out of the box, so I think we should create something like that in Superset.
Proposed Change
The whole thing can be divided into 3 parts:
1. DB changes. Raw log data is not suitable for building a dashboard, so some sort of datamart should be created. In the case of PostgreSQL, I think it would be nice to have a materialized view with all the necessary data and a procedure to perform all the transformations needed. Such a structure would provide an easy way to create an ETL job to move the datamart elsewhere if needed, while still not being too complicated.
2. Dashboard itself. I'm quite sure that users have created about a billion variations of this dashboard, so we just need to get all the popular stuff together and leave some space for specifics.
3. Deployment. We all know and love the example dashboards, so maybe a similar mechanism can be used to deploy the analytics dashboard. It surely should be optional and also separate from the examples. Here we need to think about many little things; maybe some parameters should be exposed in the config file, and so on. To be discussed further.
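As a rough illustration of what the datamart from the DB changes part might hold, here is a small sketch that collapses raw log rows into one row per day and dashboard. The field names (action, user_id, dashboard_id, dttm), the sample rows, and the function build_datamart are assumptions made up for this example; the real log table may look different, and in PostgreSQL the same aggregation would presumably live in the materialized view.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw log rows; the field names are assumptions for illustration.
RAW_LOGS = [
    {"action": "dashboard", "user_id": 1, "dashboard_id": 10, "dttm": "2024-01-01 09:00:00"},
    {"action": "dashboard", "user_id": 2, "dashboard_id": 10, "dttm": "2024-01-01 10:30:00"},
    {"action": "explore", "user_id": 1, "dashboard_id": None, "dttm": "2024-01-01 11:00:00"},
    {"action": "dashboard", "user_id": 1, "dashboard_id": 11, "dttm": "2024-01-02 08:15:00"},
]


def build_datamart(raw_logs):
    """Collapse raw log rows into one row per (day, dashboard)."""
    views = defaultdict(int)
    users = defaultdict(set)
    for row in raw_logs:
        if row["dashboard_id"] is None:
            continue  # keep only events tied to a dashboard
        day = datetime.strptime(row["dttm"], "%Y-%m-%d %H:%M:%S").date().isoformat()
        key = (day, row["dashboard_id"])
        views[key] += 1
        users[key].add(row["user_id"])
    return [
        {
            "day": day,
            "dashboard_id": dashboard_id,
            "views": views[(day, dashboard_id)],
            "unique_users": len(users[(day, dashboard_id)]),
        }
        for (day, dashboard_id) in sorted(views)
    ]


DATAMART = build_datamart(RAW_LOGS)
```

Rows shaped like this (day, dashboard, views, unique users) are easy to chart directly, which is the main reason to pre-aggregate instead of pointing the dashboard at the raw logs.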
New or Changed Public Interfaces
Well, a new command like "superset load_monitoring" will definitely be needed. Beyond that, maybe some new API endpoints would be nice, but I'm not sure about this yet.
New dependencies
Probably not needed
Migration Plan and Compatibility
Pls help)
Rejected Alternatives
None as of now