weaveworks / scope

Monitoring, visualisation & management for Docker & Kubernetes
https://www.weave.works/oss/scope/
Apache License 2.0

Use S3 for historical queries instead of DynamoDB #3839

Open bboreham opened 3 years ago

bboreham commented 3 years ago

Scope has an optional multitenant mode, where reports are saved to S3 and indexed in DynamoDB.

Once #3783 is done, live rendering will not use the store, so we will have far less time pressure. I think we can drop the index and just use an S3 'list' API call to find objects.

However, we will need to change the object path name to include the time as a prefix. Current paths look like s3://bucket-name/00002140a76ed46df4956c4af4004160/1554123600273225527, where the first part is an MD5 hash of the tenant ID and hour number, and the second part is the Unix timestamp in nanoseconds.

Steps to complete:

  1. change S3 object pathname so the prefix is tenant/date/hour (or maybe finer-grained).
  2. change querier to list reports within a prefix time-bucket using S3 rather than DynamoDB.
  3. add switch-over date so querier uses DynamoDB index before that and S3 list after.
  4. stop collectors writing to DynamoDB.