Use case: personal cluster usage

I want to start an issue to capture my own use case: running this on my personal cluster.

I used to run Prometheus, but was getting frustrated with its longer-term maintenance. Not to mention, I was severely limited in what I was able to do in the long run. Logs were limited to what was available on disk, and don't even get me started on traces... While I generally prefer wide events for my own use cases, the technology available didn't support what I needed.

With the LGTM/P stack, I would end up running significantly more systems (effectively managing 4 databases that all handle similar data in different ways), resulting in a significantly higher cost for operators. Furthermore, tiering these systems with S3 requires 4x the effort.

The Elastic Stack was another option I could have picked but chose not to. It's been a few years since I used it, so I admit my knowledge is a bit dated. I did some research to bring it up to date as I iterated on this project, so I am familiar with some of the more recent offerings. While I'm not as thrown by open source license changes as many others are, they often make it harder to get others on board with new proposals.

While I could consider the OpenSearch family of things, I am far less familiar with the fork. It's on my list to research and get up to speed on, since it's used a bit on the AI side of things, but it's not something I can speak to today.

Here's my related deployment configuration for my personal cluster: https://github.com/mjpitz/mjpitz/commit/325095fa6332c38ed224b5687dc0bb9fc8345581

The gist:

Cloud: DigitalOcean
Deployment: S3 Tiered (using DigitalOcean Spaces)
Mechanism: Terraform driving Helm
Notes:
- Part of the workflow is really clunky with the way the ClickHouse chart exists today (mostly because it was done as a shortcut).
- It took some finagling to get things going, and I need to look at porting some of the improvements into the charts, but all around, getting things up and running really wasn't too bad.
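
For anyone who hasn't set up S3 tiering in ClickHouse before, the storage side of the deployment above looks roughly like the fragment below. It is a minimal sketch, not copied from the linked commit. The disk name `spaces`, the policy name `tiered`, the `BUCKET` and `REGION` placeholders, the `clickhouse/` key prefix, the environment variable names, and the `move_factor` value are all assumptions for illustration.

```xml
<!-- Hypothetical config.d/storage.xml; every name and value here is illustrative -->
<clickhouse>
  <storage_configuration>
    <disks>
      <!-- S3-compatible disk backed by a DigitalOcean Spaces bucket -->
      <spaces>
        <type>s3</type>
        <!-- BUCKET and REGION are placeholders; the trailing path is a key prefix -->
        <endpoint>https://BUCKET.REGION.digitaloceanspaces.com/clickhouse/</endpoint>
        <!-- Credentials read from environment variables (names are assumptions) -->
        <access_key_id from_env="SPACES_ACCESS_KEY_ID"/>
        <secret_access_key from_env="SPACES_SECRET_ACCESS_KEY"/>
      </spaces>
    </disks>
    <policies>
      <tiered>
        <volumes>
          <!-- Hot tier: the local disk ClickHouse uses by default -->
          <hot>
            <disk>default</disk>
          </hot>
          <!-- Cold tier: the Spaces-backed disk defined above -->
          <cold>
            <disk>spaces</disk>
          </cold>
        </volumes>
        <!-- Start moving parts to the next volume when free space on the
             current one falls below 20% -->
        <move_factor>0.2</move_factor>
      </tiered>
    </policies>
  </storage_configuration>
</clickhouse>
```

A table opts in with `SETTINGS storage_policy = 'tiered'`. Parts then move from the local volume to the Spaces-backed one once free local space drops below `move_factor`, or earlier if the table defines a `TTL ... TO VOLUME 'cold'` rule.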

Output from cluster...

➜  cognative git:(main) ✗ kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
cognative-clickhouse-0                     1/1     Running   0          12h
cognative-cluster-agent-55979b6cd8-pzzrk   1/1     Running   0          12h
cognative-collector-6d96f9f7c6-p9fwt       1/1     Running   0          11h
cognative-grafana-7f99dcdb4c-dl8xl         1/1     Running   0          12h
cognative-node-agent-4kvvk                 1/1     Running   0          12h
cognative-node-agent-5n6s4                 1/1     Running   0          12h
cognative-node-agent-8p2s2                 1/1     Running   0          12h
