trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Feature request: support placing short running tasks on spot instance nodes #15289

Open puchengy opened 1 year ago

puchengy commented 1 year ago

Spot instances are cheap but usually short-lived.

Imagine we have a Trino cluster with a mix of spot instances and regular instances. Can we place certain types of tasks (historically short-running tasks) on spot instances?

Questions:

  1. How much benefit might we get?
  2. Is such a feature already planned in the community?

findepi commented 1 year ago

cc @sopel39

sopel39 commented 1 year ago

@puchengy take a look at the fault-tolerant execution (FTE, code name Tardigrade) mode of Trino. When that mode is enabled, you can use spot instances.

cc @arhimondr @losipiuk

arhimondr commented 1 year ago

Yeah, however, the coordinator is expected to run on a stable instance.

raunaqmorarka commented 1 year ago

Even with FTE, there could be some benefit to preferring volatile nodes when scheduling shorter-running tasks, since less work is wasted when a spot node is reclaimed. I don't know if it is possible to detect a "short running" task before scheduling, though. Maybe the size of the task's input data is a good enough proxy.
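
The input-size idea can be illustrated with a rough sketch. This is not Trino's scheduler and does not use any Trino API: `Node`, `Task`, `place_tasks`, and the 256 MiB cutoff are all hypothetical names and values chosen for illustration. It assumes input size is known before scheduling and treats small-input tasks as "short running".

```python
from dataclasses import dataclass
from itertools import cycle

# Hypothetical cutoff: tasks reading at most this much are treated as short running.
SHORT_TASK_INPUT_BYTES = 256 * 1024 * 1024


@dataclass(frozen=True)
class Node:
    name: str
    is_spot: bool  # True for a volatile (spot) worker


@dataclass(frozen=True)
class Task:
    task_id: str
    input_bytes: int  # proxy for expected run time


def place_tasks(tasks, nodes, threshold=SHORT_TASK_INPUT_BYTES):
    """Map each task id to a node name, preferring spot nodes for small inputs."""
    if not nodes:
        raise ValueError("no nodes available")
    spot = [n for n in nodes if n.is_spot]
    stable = [n for n in nodes if not n.is_spot]
    # Round-robin within each pool; fall back to the other pool if one is empty.
    spot_pool = cycle(spot or stable)
    stable_pool = cycle(stable or spot)
    placement = {}
    for task in tasks:
        pool = spot_pool if task.input_bytes <= threshold else stable_pool
        placement[task.task_id] = next(pool).name
    return placement


if __name__ == "__main__":
    mib = 1024 * 1024
    nodes = [Node("stable-1", False), Node("spot-1", True), Node("spot-2", True)]
    tasks = [Task("t1", 10 * mib), Task("t2", 4096 * mib), Task("t3", 100 * mib)]
    print(place_tasks(tasks, nodes))
```

In this sketch, the two small tasks land on the spot workers and the large one on the stable worker. A real scheduler would also need to account for node load and for input-size estimates being wrong, which this ignores.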