Run with:
`stackablectl --additional-demos-file demos/demos-v1.yaml --additional-stacks-file stacks/stacks-v1.yaml demo install nifi-kafka-druid-water-level-data`
Tested the demo with 2,500,000,000 records.
Hi all, here is a short summary of my observations from the water-level demo:
NiFi uses the content-repo PVC but keeps it at ~50% usage => should be fine indefinitely
Actions:
Increase the content-repo PVC from 5 GB to 10 GB, better safe than sorry. I was able to crash it by using large queues and stalled processors (see the sketch below).
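As a sketch, the resize could look like this in the demo's NifiCluster manifest; the storage field names (`contentRepo`, `capacity`) are assumptions based on how Stackable operators generally expose PVC sizing:

```yaml
# Hypothetical excerpt from the demo's NifiCluster resource.
nodes:
  roleGroups:
    default:
      config:
        resources:
          storage:
            contentRepo:
              capacity: 10Gi  # previously 5Gi; headroom for large queues
```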
Kafka uses a PVC (currently 15 GB) => should work fine for ~1 week
Actions:
Look into retention settings (low priority, as it should work for ~1 week) so that the demo can run indefinitely (see the sketch below)
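A minimal sketch of what that could look like via `configOverrides` on the KafkaCluster brokers; the property names are standard Kafka broker settings, but the values are assumptions that would need tuning to the demo's actual ingest rate:

```yaml
# Hypothetical excerpt from the demo's KafkaCluster resource.
brokers:
  configOverrides:
    server.properties:
      # Keep at most one week of data (assumed value).
      log.retention.hours: "168"
      # Additionally cap each partition's log size so the 15 GB PVC
      # cannot fill up even if the ingest rate grows (assumed value).
      log.retention.bytes: "1073741824"  # 1 GiB per partition
```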
Druid uses S3 for deep storage (the S3 bucket has 15 GB). But currently it also caches everything locally on the historical, because we set `druid.segmentCache.locations=[{"path"\:"/stackable/var/druid/segment-cache","maxSize"\:"300g"}]` (hardcoded in https://github.com/stackabletech/druid-operator/blob/45525033f5f3f52e0997a9b4d79ebe9090e9e0a0/deploy/config-spec/properties.yaml#L725)
This does not really affect the demo, as 100,000,000 records (call it ~1 week of data) amount to ~400 MB; the full 2,500,000,000-record test set therefore comes to roughly 25 × 400 MB ≈ 10 GB, which still fits into the 15 GB of deep storage.
I think the main problem with the demo is that queries take > 5 minutes to complete and Superset shows timeouts.
The historical pod suspiciously uses exactly one CPU core, and the queries are really slow for a "big data" system IMHO.
This could be either because Druid is only using a single core, or because we don't set any resources (yet!) and the node does not have more cores available. Going to research that.
Actions:
In the meantime, configure an override in the demo: `druid.segmentCache.locations=[{"path"\:"/stackable/var/druid/segment-cache","maxSize"\:"3g","freeSpacePercent":"5.0"}]` (see the sketch after this list)
Research slow query performance
Have a look at the queries the Superset Dashboard executes and optimize them
Maybe we should bump the druid-operator version in the demo (e.g. create a release 22.09-druid, which is basically 22.09 with a newer druid-operator version). That way we get stable resources.
Enable Druid auto-compaction to reduce the number of segments
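For the segment-cache override above, a minimal sketch of how it could be applied, assuming the DruidCluster CRD accepts `configOverrides` on the `runtime.properties` file for the historicals, as other Stackable operators do:

```yaml
# Hypothetical excerpt from the demo's DruidCluster resource.
historicals:
  configOverrides:
    runtime.properties:
      # Shrink the local segment cache from the hardcoded 300g default to 3g,
      # which is still plenty for the ~400 MB of weekly demo data.
      druid.segmentCache.locations: '[{"path":"/stackable/var/druid/segment-cache","maxSize":"3g","freeSpacePercent":"5.0"}]'
```

Note that the backslash-escaped colons from the operator's properties.yaml are plain colons here, since the value is quoted as a YAML string.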
Review Checklist
[ ] Code contains useful comments
[ ] (Integration-)Test cases added (or not applicable)
[ ] Documentation added (or not applicable)
[ ] Changelog updated (or not applicable)
[ ] Cargo.toml only contains references to git tags (not specific commits or branches)
Once the review is done, comment `bors r+` (or `bors merge`) to merge. Further information