Open knwg145 opened 5 years ago
Thanks for the thought. Since many people are interested in enhancing the warning system, we could turn this into a roadmap item. Let's try to work out what the warning system should look like:
Bad queries (ie. using deprecated UDFs or consuming too much resources) should generate warning(s) to notify system admins or users to take further actions. Depends on the importance of the warning, it should be classify into different categories and levels. System should react to the warnings by taking proper action, such as kill the query, notify external systems, etc.
From Presto Engine:
From connectors:
During query optimization, some information may be added / lost due to the transformations. Runtime statistics could trigger more warnings. For example, the splits information are loaded async, the total amount of data to be scanned are unknown during the planning time. After query completed, the system will have all execution statistics which could trigger even more warnings.
Although the best way is to prevent the harmful queries being executed at all, it is not possible to detect all possible issues during planning time. Detecting issues in runtime and after completion can give us a more complete picture.
IMO, warnings generated by connectors should be classified by their nature. Connector information should be provided as additional information instead of acting as a category.
I'm still thinking what should be the best way to config the warning system. We need to figure out how to preserve the modularity of the connectors and provide a common configuration system to pass the config to the sub-warningCollector in each connector.
I'd spent some time thinking about this when we originally designed warnings and diverged from this in a few ways:
We are currently trying to create a more extensive warning generator to warn our Presto users and give them feedback.
I have separated warnings generation into 3 possible categories:
Warnings for deprecated tables, deprecated views, etc or other problems related to them can be very useful to let users know about ongoing issues in production. In order to enable such warnings in real-time, we are working on injecting and refreshing configurations for such warnings through an external database (rather than storing them in a variable or a file, which requires deployment). In this case, we have an external MySQL database that stores such information and warnings will be generated dynamically based on the values in MySQL.
For warnings related to query execution statistics, it is helpful for us to alert users when their queries’ resource usage is above a certain threshold. In this case, we are generating warnings if the memory/cpu usage of a query is above a certain percentage of the maximum amount possible. This is useful for cases like scheduled queries operating on an ever increasing dataset. These warnings will allow the user to know if there might be problems with a query in the future and to tune it accordingly. Another case where these query statistics warnings are useful is for incremental query building. Users like to add on top of existing queries and generating warnings will warn them to be careful about building more on top of the existing query. In addition, this can encourage users to try and modify their queries to be more efficient (one case that comes to mind is using too many CTEs).
For the hive connector, we have a couple of warnings that we want to generate. These are:
We want to generate these warnings in the hive connector to keep it modular and extendable.
For the connector warnings, I was thinking about having a WarningCollector for each connector within the WarningCollector. Maybe some kind of mapping of
CatalogName : ConnectorWarningCollector
This connector specific warning collector can be passed in through the function signature whenever there is an engine to connector interaction. Warnings generated in a connector can be added into this specific warning collector and the information will be available to the presto engine. This will allow the presto engine to fetch warnings already generated from any connectors at any time.
There is a similar effort going on here: #1006