slackhq / astra

Astra is a structured log search and analytics engine developed by Slack and Salesforce
https://slackhq.github.io/astra/
MIT License

Add optional S3 streaming for cache nodes #1079

Closed bryanlb closed 1 month ago

bryanlb commented 1 month ago

Summary

Adds an experimental flag for cache nodes that optionally streams data in from object storage. This removes the requirement for local disk on cache nodes, which makes it possible to switch to significantly cheaper storage types.

The current implementation still results in assignments being marked "downloaded" and "initialized," but this is transparent to Lucene: data is fetched at request time through block-level caches held in memory.
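For reviewers unfamiliar with the approach, here is a minimal sketch of what block-level caching over a remote object can look like. It is not the code in this PR: `PagedReader`, `RangeFetcher`, and the LRU eviction policy are hypothetical names and choices made for illustration, and the fetcher stands in for a ranged read against object storage.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; these are not Astra's actual classes. */
public class PagedReader {

    /** Hypothetical stand-in for a ranged read against object storage. */
    public interface RangeFetcher {
        byte[] fetch(long offset, int length);
    }

    private final RangeFetcher fetcher;
    private final long fileLength;
    private final int pageSize;
    private final Map<Long, byte[]> pages;
    private int fetchCount = 0;

    public PagedReader(RangeFetcher fetcher, long fileLength, int pageSize, int maxPages) {
        this.fetcher = fetcher;
        this.fileLength = fileLength;
        this.pageSize = pageSize;
        // An access-ordered LinkedHashMap gives simple LRU eviction (an assumed policy).
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxPages;
            }
        };
    }

    /** Reads one byte, fetching the enclosing page only on a cache miss. */
    public byte readByte(long position) {
        if (position < 0 || position >= fileLength) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        long pageIndex = position / pageSize;
        byte[] page = pages.get(pageIndex);
        if (page == null) {
            long start = pageIndex * pageSize;
            // The last page may be shorter than pageSize.
            int length = (int) Math.min(pageSize, fileLength - start);
            page = fetcher.fetch(start, length);
            fetchCount++;
            pages.put(pageIndex, page);
        }
        return page[(int) (position % pageSize)];
    }

    /** Number of remote fetches issued so far. */
    public int fetchCount() {
        return fetchCount;
    }
}
```

The sketch is single-threaded; a real implementation would need to handle concurrent readers and bound memory by bytes rather than page count.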

To use, set:

-Dastra.s3Streaming.enabled=true

Optionally:

-Dastra.s3Streaming.pageSize=2097152
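Both settings are passed as JVM system properties (`-D...`). As a rough sketch of how such properties can be read with the standard library, assuming nothing about how Astra itself parses them: the class name `S3StreamingFlags` and the fallback value are hypothetical, and the fallback simply reuses the example value above (2097152 would be 2 MiB if the value is a byte count, which this PR does not state).

```java
/** Illustrative sketch only; not Astra's actual configuration code. */
public class S3StreamingFlags {
    static final String ENABLED_PROPERTY = "astra.s3Streaming.enabled";
    static final String PAGE_SIZE_PROPERTY = "astra.s3Streaming.pageSize";

    // Assumption: fallback taken from the example value in this PR, not a confirmed default.
    static final int FALLBACK_PAGE_SIZE = 2097152;

    /** True only when the property is set to "true" (case-insensitive). */
    static boolean enabled() {
        return Boolean.getBoolean(ENABLED_PROPERTY);
    }

    /** Parses the property as an integer, or returns the fallback when it is unset or invalid. */
    static int pageSize() {
        return Integer.getInteger(PAGE_SIZE_PROPERTY, FALLBACK_PAGE_SIZE);
    }
}
```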