trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

DFs received by coordinator are unacked #5429

Open sopel39 opened 3 years ago

sopel39 commented 3 years ago

Currently, the coordinator fetches the DF delta between its local version and the worker's version. The coordinator sends its local version as part of the DF request. The worker is free to remove stored DFs based on the coordinator's local version. However, the coordinator does not acknowledge received DFs until the next DF is available. Therefore, workers can keep DFs locally longer than needed.

This can be resolved by having the worker artificially bump the DF version for as long as there are still unacknowledged DFs.
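To make the problem and the proposed fix concrete, here is a minimal toy model in Python. It is only a sketch of the behavior described above, not Trino's actual implementation: `Worker`, `Coordinator`, `bump_while_unacked`, and `run` are hypothetical names, and the real protocol involves more state than a single version counter.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Toy worker that stores DFs keyed by version (hypothetical model)."""
    bump_while_unacked: bool = False
    version: int = 0
    stored: dict = field(default_factory=dict)  # version -> DF payload

    def add_df(self, payload):
        self.version += 1
        self.stored[self.version] = payload

    def reported_version(self):
        # Version the worker advertises to the coordinator.
        return self.version

    def fetch(self, coordinator_version):
        # The coordinator's version doubles as the acknowledgement:
        # every DF at or below it can be dropped.
        for v in [v for v in self.stored if v <= coordinator_version]:
            del self.stored[v]
        delta = dict(self.stored)
        response_version = self.version
        if self.bump_while_unacked and self.stored:
            # Proposed fix: advertise a newer (empty) version so the
            # coordinator sends one more request, which acks this delta.
            self.version += 1
        return response_version, delta


@dataclass
class Coordinator:
    version: int = 0
    received: dict = field(default_factory=dict)

    def poll(self, worker):
        """Fetch only when the worker advertises a newer version."""
        if worker.reported_version() <= self.version:
            return False
        self.version, delta = worker.fetch(self.version)
        self.received.update(delta)
        return True


def run(bump_while_unacked):
    worker = Worker(bump_while_unacked=bump_while_unacked)
    coordinator = Coordinator()
    worker.add_df("df-a")
    worker.add_df("df-b")
    while coordinator.poll(worker):
        pass
    # (DFs received by the coordinator, DFs still held by the worker)
    return len(coordinator.received), len(worker.stored)
```

In this model, `run(False)` leaves both DFs stored on the worker after the coordinator has already received them, because no further request carries the acknowledgement. `run(True)` triggers one extra, empty fetch that lets the worker release them.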

macohen commented 9 months ago

I'm new to Trino, but I'd like to help. Can you provide pointers on how to start looking into this? Where is the code that handles this? What is "DF"?

For me, I'd like a "good first issue" to contain points like this to make it easier to get started.

sopel39 commented 9 months ago

@macohen I would start by looking at HttpRemoteTask and DynamicFiltersFetcher. Specifically, look at how the DF transfer protocol is implemented.