tarantool / crud

Easy access to data stored in vshard cluster

select/pairs execution plan #348

Open DifferentialOrange opened 1 year ago

DifferentialOrange commented 1 year ago

From time to time, application developers ask: "What's the cost of this request?" And the answer is not always clear even for a crud developer.

select/pairs can be used for requests more complicated than a simple primary key get, especially for pagination. Will this request trigger map reduce? Will it trigger a full scan of tuples while scrolling? Due to the implementation, the order of conditions may sometimes change the result. Based on this info, an application developer may realize that they need a different index or sharding architecture. Some of this info is already present in metrics, but it can be hard to work out what exactly is going on with a single request from the aggregated info.

Some planning info is already present on the router: see plan.lua. It's not exposed anywhere yet. It contains info about which storage or storages would be used and which index would be used for the select. We may extend it with sharding key info. Some planning info, like tuple scroll, can be extracted only from the storage. We don't have an explicit plan table there yet, but it is possible to build one in without any drastic additional cost.

The proposal is to allow a user to call crud.select/crud.pairs with {dry_run = true}/{plan = true} (or maybe with a new handle, select_plan) and get back a table with exhaustive info about what this request will do. Since this option would be used only for debugging, one way to build a plan is to actually execute the request and record the useful info on the storage side, but it seems preferable to implement an actual dry run (so that the user can check potentially dangerous requests too).
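To make the idea of a plan table more concrete, here is a rough standalone sketch of what such a table could contain. It is not crud's planner and does not use crud's API: `build_plan`, `space_info` and the fields `scan_index`, `map_reduce` and `full_scan` are all hypothetical names, and the index selection rule is a deliberately simplified assumption. It only takes conditions in the `{operator, field, value}` form and never touches any data.

```lua
-- Hypothetical sketch: a toy plan builder, NOT crud's real planner or API.
-- It inspects the conditions only and never reads any tuples.

local function build_plan(conditions, space_info)
    local plan = {
        scan_index = space_info.primary_index, -- fall back to the primary index
        map_reduce = true,  -- assume every storage is involved until proven otherwise
        full_scan = true,   -- assume a full scan until a condition names an index
    }

    for _, cond in ipairs(conditions) do
        local operator, field = cond[1], cond[2]

        -- Toy rule (an assumption): the first condition that names an index
        -- decides which index is scanned.
        if plan.full_scan and space_info.indexes[field] then
            plan.scan_index = field
            plan.full_scan = false
        end

        -- Toy rule (an assumption): an equality condition on the sharding key
        -- pins the request to a single storage, so no map reduce is needed.
        if operator == '==' and field == space_info.sharding_key then
            plan.map_reduce = false
        end
    end

    return plan
end

-- Hypothetical space description used only for this example.
local space_info = {
    primary_index = 'id',
    sharding_key = 'id',
    indexes = { id = true, age = true },
}

local plan = build_plan({ {'>=', 'age', 30}, {'==', 'name', 'Ann'} }, space_info)
print(plan.scan_index, plan.map_reduce, plan.full_scan)
```

A real plan would also need the storage-side info mentioned above (for example, how many tuples are scrolled), which this router-only sketch cannot know.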

We shouldn't forget to skip such requests in metrics monitoring info (or decide that we will track them, but the question should be raised either way).