turms-im / turms

🕊️ The world's most advanced open source instant messaging engine for 100K~10M concurrent users https://turms-im.github.io/docs
Apache License 2.0

Add support for deploying optional services (e.g. MinIO) in docker-compose.standalone.yml for testing purposes only #728

Open JamesChenX opened 3 years ago

JamesChenX commented 3 years ago

The reasons for introducing Loki instead of other logging services:

  1. First of all, our users should use the logging services offered by their cloud provider (AWS CloudWatch, Aliyun SLS, etc.) in production, unless they believe they can do a better job themselves or need custom features that the cloud provider cannot offer. These services cover collecting, analyzing, querying, graphing, reporting, alerting, and more: copy a few configs, click through the setup, and a full-featured logging service is in place for the whole system. Users don't need to care about how it is implemented.

  2. Turms servers produce several kinds of logs. Some, such as system logs, are usually only used for troubleshooting or for measuring server capacity when deciding whether to scale up or down. The client API logs, however, are very valuable for user behavior analysis and are usually considered a company asset, so a wide-column OLAP database with LSM support is usually a good choice (e.g. ClickHouse; we don't use ELK or Graylog, which is built on Elasticsearch, because we don't need full-text indexes). But it is not easy for users to set up a robust, full-featured system to analyze these logs (it can be an independent project in its own right). We won't analyze logs for our users either, because (1) is our recommended solution and we don't want to force any particular stack (ClickHouse, Cassandra, EFK, Graylog, or anything else) on them just for log analysis. If they can buy a database service for log analysis, why not just buy a mature, full-featured logging service?

  3. As a result, most of our users can adopt plan (1), and users who need custom analysis logic can adopt plan (2). We don't need to help users analyze logs any further; we only need to make sure the raw log data is sufficient and useful. Currently we only provide logging service support for the testing environment. Given the above, we use Loki because it is lightweight and works well with Grafana (note that Loki is neither designed for nor suitable for log or business analysis).

PS: Users who don't need log analysis can indeed run Loki in production. The problem is that no mature company goes without log analysis, and if they can buy a Loki service, why not just buy a logging service? So a logging service is highly recommended in production.

Ref: