SigNoz / logs-benchmark

Logs performance benchmark repo: Comparing Elastic, Loki and SigNoz
https://signoz.io/blog/logs-performance-benchmark/

Loki is not designed for high cardinality indexes #1

Open stevehipwell opened 1 year ago

stevehipwell commented 1 year ago

I'm not sure if there is much point adding Loki to a benchmark for high cardinality indexes as that explicitly goes against what it's designed for.

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.
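To make the quoted description concrete, here is a minimal toy sketch (not Loki's actual implementation) of a store that indexes only label sets and scans line contents at query time. The class name `LabelIndexedStore`, its methods, and the `request_id` label are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class LabelIndexedStore:
    """Toy log store: indexes label sets only, never the line contents."""

    def __init__(self):
        # One stream per unique label set; lines inside a stream are unindexed.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        # The frozen label set is the only thing that acts as an index key.
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=""):
        """Pick streams by label equality, then brute-force scan their lines."""
        wanted = set(selector.items())
        results = []
        for key, lines in self.streams.items():
            if wanted <= key:  # every selector label must be present on the stream
                results.extend(line for line in lines if contains in line)
        return results


store = LabelIndexedStore()
store.push({"app": "web", "env": "prod"}, "GET /index.html 200")
store.push({"app": "web", "env": "prod"}, "POST /login 302")
store.push({"app": "db", "env": "prod"}, "slow query: 1.2s")

matches = store.query({"app": "web"}, contains="GET")

# A high-cardinality label (hypothetical request_id) yields one stream per line,
# so the "index" grows as fast as the data itself.
high_cardinality = LabelIndexedStore()
for i in range(1000):
    high_cardinality.push({"app": "web", "request_id": str(i)}, f"GET /item/{i} 200")
stream_count = len(high_cardinality.streams)
```

In this sketch the three low-cardinality pushes land in two streams, while the 1000 pushes with a unique `request_id` each create their own stream, which is the kind of workload the comment argues such a design is not meant for.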

Also, if you want to feed Loki with a tool other than Promtail (which I would recommend), you'd be strongly advised to use Fluent Bit with its native Loki plugin and not Fluentd.

wardbekker commented 1 year ago

+1 on @stevehipwell's comment. If you are going to include Loki, please use it in the way it's intended. Happy to give tips on more representative testing.

Shameless plug: See also Effective troubleshooting with Grafana Loki - query basics https://www.youtube.com/watch?v=UiiZ463lcVA

danpoltawski commented 1 year ago

Another +1 to @stevehipwell's comment. The whole Loki architecture is built upon being fast to query and not needing to index. It is really misleading to create this 'benchmark'.

It would be more interesting to compare a real use case of querying.

ankitnayan commented 1 year ago

I failed to find any resources on capacity planning for Loki or on the performance of different types of queries. I see a closed issue at https://github.com/grafana/loki/issues/3182. It would be great if someone could publish docs on the resources needed. If someone is willing to run this on a generic dataset, we would be happy to collaborate.

stevehipwell commented 1 year ago

@ankitnayan if you don't understand how something works, then how can you include it in a benchmark, let alone make that benchmark fair?

ankitnayan commented 1 year ago

@stevehipwell I would appreciate some effort towards ingesting and filtering logs in Loki over a dataset, rather than asking to move Loki out of the comparison. I really thought about it multiple times before posting the blog. I wanted to encourage collaboration on a good dataset and bring out the best configs from all the log management tools.

We also shared that Loki did not run successfully even for low-cardinality data. Quoting the queries from the blog:

logs count with method GET per 10 min over the entire data (low cardinality timeseries)

Get first 100 logs with method GET (logs based on filters)
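For readers unfamiliar with the two query shapes quoted above, here is a small tool-agnostic sketch of what each one computes over an in-memory list of parsed log records. The record fields `ts` (epoch seconds) and `method`, and the function names, are assumptions for illustration and are not taken from the benchmark repo.

```python
from collections import Counter
from itertools import islice

BUCKET_SECONDS = 600  # 10 minutes


def count_per_bucket(logs, method):
    """Count logs with the given method per 10-minute bucket (a timeseries)."""
    counts = Counter()
    for log in logs:
        if log["method"] == method:
            # Align the timestamp down to the start of its 10-minute bucket.
            bucket_start = log["ts"] - log["ts"] % BUCKET_SECONDS
            counts[bucket_start] += 1
    return dict(sorted(counts.items()))


def first_n(logs, method, n=100):
    """Return the first n logs with the given method, stopping early once found."""
    matching = (log for log in logs if log["method"] == method)
    return list(islice(matching, n))


# Hypothetical sample records; "ts" is seconds since the epoch.
logs = [
    {"ts": 0, "method": "GET"},
    {"ts": 30, "method": "POST"},
    {"ts": 599, "method": "GET"},
    {"ts": 600, "method": "GET"},
    {"ts": 1250, "method": "GET"},
]

series = count_per_bucket(logs, "GET")
head = first_n(logs, "GET", n=2)
```

The first query has to touch every matching record in the time range, while the second can stop as soon as it has collected enough rows, so the two stress a log store quite differently.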

If you think users don't use Loki for the queries mentioned in the blog, let me know the right set of queries to benchmark it against. If you think it is supposed to perform poorly in some scenarios, share the use cases. We have already mentioned the high-cardinality use case.

Also, please do share the resource usage, perf numbers, and scale of your setup.

I am here to improve the stated results in the benchmark for the overall benefit of the users.

stevehipwell commented 1 year ago

@ankitnayan did you reach out to the Loki team at Grafana then before creating your benchmark if you didn't understand how to tune Loki correctly? Or for that matter to even understand what Loki is and what it isn't?

This is a prime example of how not to benchmark: a lack of understanding of the systems under test, comparing apples to oranges, and arbitrary tests which are unlikely to be actual real-world workflows.

ankitnayan commented 1 year ago

I am sensing rage rather than contribution. No worries, all PRs to the setup and discussions around perf and setup, rather than just text, are welcome.

stevehipwell commented 1 year ago

@ankitnayan not rage, more incredulity. Instead of holding your hands up over this and either removing Loki until you understand how it could be compared or reaching out to Grafana, you're asking me, a random person, to help you with your benchmark.

ankitnayan commented 1 year ago

Done. I will wait for a reply before reaching out through non-public channels: https://community.grafana.com/t/how-to-tune-loki-for-better-performance-in-a-benchmark/80709

ankitnayan commented 1 year ago

@stevehipwell we got a reply from Grafana on the above link.

[Image: Screenshot 2023-02-06 at 12 18 12 PM]

vaskozl commented 1 year ago

Does Loki OOM under the benchmark?