ArroyoSystems / arroyo

Distributed stream processing engine in Rust
https://arroyo.dev
Apache License 2.0
3.81k stars 223 forks

Maintenance for long running clusters #562

Closed harshit2283 closed 8 months ago

harshit2283 commented 8 months ago

Currently Arroyo stores all checkpoint metadata in the checkpoints table. Over time the table grows to a significant size, which may impact performance.

[Image: Screenshot 2024-03-11 at 15 29 51]

An immediate workaround (from @mwylde) is to run this query manually:

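-- Keep only the most recently finished checkpoint for the job; delete the rest.
-- Replace {{ JOB_ID }} with the job's ID before running.
DELETE FROM checkpoints
WHERE checkpoints.id != (
  SELECT id FROM checkpoints
  WHERE job_id = '{{ JOB_ID }}'
  ORDER BY finish_time DESC
  LIMIT 1
) AND job_id = '{{ JOB_ID }}';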
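
For anyone who wants to check what the query does before running it against a real cluster, here is a minimal sketch that exercises it on an in-memory SQLite database. The three-column table is a hypothetical, simplified stand-in (the real checkpoints table has more columns), `prune_checkpoints` is an illustrative helper name and not part of Arroyo, and the job IDs and timestamps are made up.

```python
import sqlite3

# Same DELETE as above, with the job ID passed as a bound parameter
# instead of being templated into the SQL string.
CLEANUP_SQL = """
DELETE FROM checkpoints
WHERE checkpoints.id != (
  SELECT id FROM checkpoints
  WHERE job_id = :job_id
  ORDER BY finish_time DESC
  LIMIT 1
) AND job_id = :job_id
"""


def prune_checkpoints(conn, job_id):
    """Delete all but the most recently finished checkpoint for job_id.

    Returns the number of rows deleted. (Hypothetical helper.)
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(CLEANUP_SQL, {"job_id": job_id})
    return cur.rowcount


# Assumed, simplified schema used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, job_id TEXT, finish_time TEXT)"
)
conn.executemany(
    "INSERT INTO checkpoints (id, job_id, finish_time) VALUES (?, ?, ?)",
    [
        (1, "job_a", "2024-03-11T15:00:00"),
        (2, "job_a", "2024-03-11T15:10:00"),
        (3, "job_a", "2024-03-11T15:20:00"),
        (4, "job_b", "2024-03-11T15:05:00"),
    ],
)

deleted = prune_checkpoints(conn, "job_a")
remaining = conn.execute("SELECT id, job_id FROM checkpoints ORDER BY id").fetchall()
print(deleted, remaining)
```

In this toy setup only the newest `job_a` row survives and `job_b` is untouched. Binding the job ID as a parameter also avoids pasting it into the SQL text by hand.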

jacksonrnewhouse commented 8 months ago

Just merged the fix