trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.53k stars 3.03k forks source link

Create a more extensive query warnings generator #1140

Open knwg145 opened 5 years ago

knwg145 commented 5 years ago

We are currently trying to create a more extensive warning generator to warn our Presto users and give them feedback.

I have separated warnings generation into 3 possible categories:



Warnings for deprecated tables, deprecated views, etc or other problems related to them can be very useful to let users know about ongoing issues in production. In order to enable such warnings in real-time, we are working on injecting and refreshing configurations for such warnings through an external database (rather than storing them in a variable or a file, which requires deployment). In this case, we have an external MySQL database that stores such information and warnings will be generated dynamically based on the values in MySQL.

For warnings related to query execution statistics, it is helpful for us to alert users when their queries’ resource usage is above a certain threshold. In this case, we are generating warnings if the memory/cpu usage of a query is above a certain percentage of the maximum amount possible. This is useful for cases like scheduled queries operating on an ever increasing dataset. These warnings will allow the user to know if there might be problems with a query in the future and to tune it accordingly. Another case where these query statistics warnings are useful is for incremental query building. Users like to add on top of existing queries and generating warnings will warn them to be careful about building more on top of the existing query. In addition, this can encourage users to try and modify their queries to be more efficient (one case that comes to mind is using too many CTEs).



For the hive connector, we have a couple of warnings that we want to generate. These are:

We want to generate these warnings in the hive connector to keep it modular and extendable.

For the connector warnings, I was thinking about having a WarningCollector for each connector within the WarningCollector. Maybe some kind of mapping of



CatalogName : ConnectorWarningCollector



This connector specific warning collector can be passed in through the function signature whenever there is an engine to connector interaction. Warnings generated in a connector can be added into this specific warning collector and the information will be available to the presto engine. This will allow the presto engine to fetch warnings already generated from any connectors at any time.



There is a similar effort going on here: #1006

oneonestar commented 5 years ago

Thanks for the thought. Since many people are interested in enhancing the warning system, we could turn this into a roadmap item. Let's try to work out what the warning system should look like:

Enhanced Warning System

Objective

Bad queries (ie. using deprecated UDFs or consuming too much resources) should generate warning(s) to notify system admins or users to take further actions. Depends on the importance of the warning, it should be classify into different categories and levels. System should react to the warnings by taking proper action, such as kill the query, notify external systems, etc.

How warnings are being generated

From Presto Engine:

From connectors:

During query optimization, some information may be added / lost due to the transformations. Runtime statistics could trigger more warnings. For example, the splits information are loaded async, the total amount of data to be scanned are unknown during the planning time. After query completed, the system will have all execution statistics which could trigger even more warnings.

Although the best way is to prevent the harmful queries being executed at all, it is not possible to detect all possible issues during planning time. Detecting issues in runtime and after completion can give us a more complete picture.

What kind of warnings can be generated

IMO, warnings generated by connectors should be classified by their nature. Connector information should be provided as additional information instead of acting as a category.

Configuration management

What to config

How to config

I'm still thinking what should be the best way to config the warning system. We need to figure out how to preserve the modularity of the connectors and provide a common configuration system to pass the config to the sub-warningCollector in each connector.

How to response to warnings

raghavsethi commented 5 years ago

I'd spent some time thinking about this when we originally designed warnings and diverged from this in a few ways: