Open ydaponte opened 2 months ago
STUDYING OUR DATA SOURCE - SYSTEM VIEWS IN DATAWAREHOUSE
What are we considering as a stale content in terms of Data warehouse query and performance? The system views will provide information about:
Query Performance Analysis
Query Optimization and Tuning
User Activity Monitoring
With the above information, customer will benefit from creating reports for analysis and statistics but not for archival.
In the Fabric Warehouse, we have the Query Insights that is automatically created. Query insights - Microsoft Fabric | Microsoft Learn
Comparing the feature Query Insights with the system views, there are some similarities (or one is complementary to the other). From what I could understand, the Query Insights is using the DMV's and returns a more complete information. The same if happening if we consider the Query Activity.
Query insights - Microsoft Fabric | Microsoft Learn Monitor connections, sessions, and requests using DMVs - Microsoft Fabric | Microsoft Learn Monitor your running and completed queries using Query activity - Microsoft Fabric | Microsoft Learn
The fact that you are able to find the same information in these different tools is highlighted in the Query Activity documentation.
As per my experience, the customer will always want to have access to this data for monitoring and tracking purposed. There is no benefit in archiving this data specially because we only save the data for the past 30 days (this is for the Query Insights, Query Activity).
For the DMV's, I am not sure about the limits but I would say it is same as the ones in Synapse. Please refer to the following documentation: System views supported in Synapse SQL - Azure Synapse Analytics | Microsoft Learn Capacity limits for dedicated SQL pool - Azure Synapse Analytics | Microsoft Learn
SUMMARY: • We need to evaluate what we would consider as a stale content for the Data Warehouse system views and query execution. • From my perspective, there is no need to identify any content here as stale or with the need of archival because this will be done automatically (the telemetry will expire after 30 days and will have a limit for 10.000 records, for example).
NEXT STEPS: Try to correlate the metadata information with the query text to verify if a table has not been used. (For example) - This will give us the information about idle resources.
As a Developer, I want to gather and utilize data from System Views in Microsoft Fabric to monitor system performance and query activity so that we can effectively score and tag stale content.
Resources
Queries:
Acceptance Criteria: