cncf / sandbox

Applications for Sandbox go here! ⏳📦🧪

[Sandbox] openGemini #82

Closed xiangyu5632 closed 2 months ago

xiangyu5632 commented 8 months ago

Application contact emails

xiangyu9@huawei.com xuran215@huawei.com

Project Summary

The open-source, cloud-native, distributed time series database

Project Description

What Is Time Series Data?

Time series data, or temporal data, is a sequence of data points collected at intervals over time, which lets us track how something changes. The span covered can be milliseconds, days, or even years. Examples include industrial sensor data and the logs, traces, and metrics of microservices. openGemini is an open-source time series database that focuses on storing and analyzing time series data.
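As a minimal illustration of the definition above, a time series can be modelled as timestamped values and summarized per time interval. The readings, the `mean_per_minute` helper, and the one-minute bucket size are all hypothetical and chosen only for this sketch; they do not describe openGemini's storage model or API.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sensor readings: (timestamp, temperature in degrees Celsius).
# The values are made up for illustration.
readings = [
    (datetime(2024, 1, 1, 0, 0, 5, tzinfo=timezone.utc), 21.0),
    (datetime(2024, 1, 1, 0, 0, 35, tzinfo=timezone.utc), 21.4),
    (datetime(2024, 1, 1, 0, 1, 10, tzinfo=timezone.utc), 22.0),
    (datetime(2024, 1, 1, 0, 1, 50, tzinfo=timezone.utc), 22.6),
]


def mean_per_minute(points):
    """Group points into one-minute buckets and average each bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its minute.
        buckets[ts.replace(second=0, microsecond=0)].append(value)
    return {minute: sum(vals) / len(vals) for minute, vals in sorted(buckets.items())}


averages = mean_per_minute(readings)
for minute, avg in averages.items():
    print(minute.isoformat(), round(avg, 2))
```

Grouping by time interval like this is the kind of query a time series database is typically asked to answer at much larger scale.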

What are the advantages of openGemini?

In fields such as the Internet of Things (IoT) and cloud computing, time series data is generated in very large volumes. Data can be written at gigabytes per second, which corresponds to more than 10 million metrics per second, while queries are generally expected to return within milliseconds. Most open-source time series databases cannot meet these requirements. openGemini focuses on the storage and analysis of massive time series data. Advantages:

  1. Open source
  2. Distributed architecture
  3. Compared with other open-source time series databases, openGemini provides high write and query performance, better memory resource control, and a higher data compression rate.

Org repo URL (provide if all repos under the org are in scope of the application)

https://github.com/openGemini

Project repo URL in scope of application

https://github.com/openGemini/openGemini

Additional repos in scope of the application

https://github.com/openGemini/openGemini-operator
https://github.com/openGemini/openGemini.github.io
https://github.com/openGemini/website
https://github.com/openGemini/opengemini-client-go
https://github.com/openGemini/data-migration-tools
https://github.com/openGemini/grafana-opengemini-datasource
https://github.com/openGemini/openGemini-dashboard
https://github.com/openGemini/gemix
https://github.com/openGemini/opengemini-client-java

Website URL

https://opengemini.org

Roadmap

https://github.com/openGemini/openGemini/blob/main/ROADMAP.md

Contributing Guide

https://github.com/openGemini/openGemini/blob/main/CONTRIBUTION.md

Code of Conduct (CoC)

https://github.com/openGemini/openGemini/blob/main/CODE_OF_CONDUCT.md

Adopters

https://github.com/openGemini/openGemini/blob/main/ADOPTERS.md

Contributing or Sponsoring Org

Huawei

Maintainers file

https://github.com/openGemini/openGemini/blob/main/MAINTAINERS.md

IP Policy

Trademark and accounts

Why CNCF?

The creation and philosophy of CNCF are closely linked to the open-source spirit, which is dedicated to building and promoting open source cloud native technologies and ecosystems. Before choosing CNCF, we noted the following:

  1. Prometheus is designed primarily as a monitoring system. Its built-in time series database cannot store a large amount of time series data, so a third-party time series database is generally needed for storage.
  2. KubeEdge plans to store edge device data inside the platform, so its need for a time series database is long-term.
  3. OpenTelemetry is leading the standardization of observability. However, OpenTelemetry does not have a unified storage backend. Metrics, logs, and traces are stored in different systems, so correlation analysis across these data types cannot be completed inside a single database.

These projects need time series databases, and we believe that openGemini can fill this gap in the ecosystem. As a time series database, openGemini will provide the functionality, performance, and scalability needed to better serve these projects. openGemini uses an MPP distributed architecture and delivers outstanding write and query performance in DevOps and IoT scenarios. Compared with similar open-source time series databases, such as InfluxDB, IoTDB, and OpenTSDB, openGemini has better performance and lower data storage costs. Compared with traditional relational databases, its data storage cost is only 1/20. In massive time series data scenarios, its write and query performance is more than 10 times higher. As a result, it is used by more than 20 community users in areas such as power, IoT, industrial manufacturing, observability, and O&M monitoring.

Our goal is to promote technological innovation in cloud-native open-source time series databases, reduce storage costs of massive time series data, simplify system architecture, improve storage and analysis efficiency of time series data, and strengthen integration with other cloud-native projects to make time-series data storage and analysis more convenient.

We hope to join the CNCF community and become a part of the global cloud-native developer community. Given CNCF's broad user base, using CNCF's platform will enable openGemini to benefit more and more organizations and companies.

Benefit to the Landscape

According to DB-Engines data, time series databases have become the fastest-growing database type in the past few years. Due to the rapid development of 5G, Internet of Things (IoT), and cloud native technologies, a large number of time series data storage requirements have arisen.

The benefits could be:

  1. openGemini's participation enriches the database types in the CNCF landscape and can attract a broader community of developers and users.
  2. openGemini promotes closer integration of time series databases with CNCF projects such as Kubernetes, Prometheus, KubeEdge, and OpenTelemetry, and supports further development in multiple fields.
  3. openGemini has excellent read and write performance, which can meet the requirements of many application scenarios. It also handles the high-cardinality problem of time series data well.

Cloud Native 'Fit'

Landscape: Databases
As a time series database, openGemini stores time series data. It provides cloud-native features such as high performance, high reliability, scalability, and observability, so it fits in "Databases".

TAGs: TAG Storage
openGemini's participation in TAG Storage will prompt discussion about integration with Kubernetes, KubeEdge, Prometheus, and OpenTelemetry. Based on the characteristics of time series databases, we will further share and discuss their basic properties in terms of availability, scalability, performance, durability, consistency, ease of use, cost, and operational complexity.

Cloud Native 'Integration'

N/A

Cloud Native Overlap

N/A

Similar projects

InfluxDB, Apache IoTDB, TimescaleDB, OpenTSDB

Landscape

Yes

Business Product or Service to Project separation

N/A

Project presentations

N/A

Project champions

N/A

Additional information

No response

xing-yang commented 6 months ago

Thanks @xiangyu5632 and Ran Xu for presenting openGemini at TAG Storage meeting today! Here's the meeting recording: https://www.youtube.com/watch?v=7MB170knbqs. Here are the slides: https://drive.google.com/file/d/1KqKCdrD0P9BlugUONnj9dZOiHw_AQoxr/view?usp=sharing cc @chira001 @Raffaele Spazzoli

xiangyu5632 commented 6 months ago

Yeah, I'm very glad to have this opportunity to introduce our project to everyone.

jberkus commented 4 months ago

TAG-CS review, this project has:

xiangyu5632 commented 3 months ago

Hi @jberkus, we have a governance document in openGemini's community repository and have updated it with reference to the CNCF governance-maintainer.md template. You can see it here: https://github.com/openGemini/community/blob/main/GOVERNANCE.md

jberkus commented 3 months ago

Update:

mrbobbytables commented 3 months ago

Follow-up from today's sandbox review, OpenGemini will be moved to a vote 👍 Just an FYI though - there may be a follow up regarding the project's name and trademark concerns. /vote

git-vote[bot] commented 3 months ago

Vote created

@mrbobbytables has called for a vote on [Sandbox] openGemini (#82).

The members of the following teams have binding votes:

| Team |
| ---- |
| @cncf/cncf-toc |

Non-binding votes are also appreciated as a sign of support!

How to vote

You can cast your vote by reacting to this comment. The following reactions are supported:

| In favor | Against | Abstain |
| :---: | :---: | :---: |
| 👍 | 👎 | 👀 |

Please note that voting for multiple options is not allowed and those votes won't be counted.

The vote will be open for 2months 30days 2h 52m 48s. It will pass if at least 66% of the users with binding votes vote In favor 👍. Once it's closed, results will be published here as a new comment.
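The passing rule above can be sketched as a small calculation. This is only an illustration under stated assumptions, not the bot's actual implementation: the `binding_vote_result` function is hypothetical, and it assumes the percentage is taken over all users with binding votes, including those who have not voted.

```python
def binding_vote_result(in_favor, total_binding, threshold=66.0):
    """Return (percentage in favor rounded to 2 places, whether the vote passes).

    Assumption: the denominator is every user with a binding vote,
    whether or not they have voted yet.
    """
    if total_binding <= 0:
        raise ValueError("total_binding must be positive")
    pct = 100.0 * in_favor / total_binding
    # Compare the unrounded value so rounding cannot flip the outcome.
    return round(pct, 2), pct >= threshold


# Example with 11 binding voters, first with 2 in favor and then with 9.
print(binding_vote_result(2, 11))
print(binding_vote_result(9, 11))
```

Abstentions and missing votes both count against the threshold under this reading, because only "In favor" reactions add to the numerator.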

kevin-wangzefeng commented 3 months ago

I will be abstaining due to a conflict of interest.

mrbobbytables commented 3 months ago

/check-vote

git-vote[bot] commented 3 months ago

Vote status

So far 18.18% of the users with binding vote are in favor (passing threshold: 66%).

Summary

| In favor | Against | Abstain | Not voted |
| :---: | :---: | :---: | :---: |
| 2 | 0 | 0 | 9 |

Binding votes (2)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00 |
| TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00 |
| @dims | Pending | |
| @angellk | Pending | |
| @mauilion | Pending | |
| @linsun | Pending | |
| @dzolotusky | Pending | |
| @kevin-wangzefeng | Pending | |
| @cathyhongzhang | Pending | |
| @nikhita | Pending | |
| @kgamanji | Pending | |

Non-binding votes (3)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00 |
| pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00 |
| chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00 |

mrbobbytables commented 3 months ago

/check-vote

git-vote[bot] commented 3 months ago

Vote status

So far 90.91% of the users with binding vote are in favor (passing threshold: 66%).

Summary

| In favor | Against | Abstain | Not voted |
| :---: | :---: | :---: | :---: |
| 10 | 0 | 0 | 1 |

Binding votes (10)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| kevin-wangzefeng | In favor | 2024-06-18 4:10:45.0 +00:00:00 |
| dims | In favor | 2024-06-18 14:04:42.0 +00:00:00 |
| cathyhongzhang | In favor | 2024-06-17 18:31:01.0 +00:00:00 |
| TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00 |
| nikhita | In favor | 2024-06-18 4:34:02.0 +00:00:00 |
| rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00 |
| linsun | In favor | 2024-06-18 15:21:23.0 +00:00:00 |
| kgamanji | In favor | 2024-06-18 6:39:51.0 +00:00:00 |
| dzolotusky | In favor | 2024-06-18 4:09:53.0 +00:00:00 |
| angellk | In favor | 2024-06-18 13:10:33.0 +00:00:00 |
| @mauilion | Pending | |

Non-binding votes (3)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00 |
| pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00 |
| chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00 |

kevin-wangzefeng commented 3 months ago

/check-vote

git-vote[bot] commented 3 months ago

Votes can only be checked once a day.

git-vote[bot] commented 3 months ago

Vote closed

The vote passed! 🎉

81.82% of the users with binding vote were in favor (passing threshold: 66%).

Summary

| In favor | Against | Abstain | Not voted |
| :---: | :---: | :---: | :---: |
| 9 | 0 | 1 | 1 |

Binding votes (10)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| @cathyhongzhang | In favor | 2024-06-17 18:31:01.0 +00:00:00 |
| @nikhita | In favor | 2024-06-18 4:34:02.0 +00:00:00 |
| @kgamanji | In favor | 2024-06-18 6:39:51.0 +00:00:00 |
| @TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00 |
| @angellk | In favor | 2024-06-18 13:10:33.0 +00:00:00 |
| @dzolotusky | In favor | 2024-06-18 4:09:53.0 +00:00:00 |
| @kevin-wangzefeng | Abstain | 2024-06-19 3:36:45.0 +00:00:00 |
| @linsun | In favor | 2024-06-18 15:21:23.0 +00:00:00 |
| @dims | In favor | 2024-06-18 14:04:42.0 +00:00:00 |
| @rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00 |

Non-binding votes (3)

| User | Vote | Timestamp |
| ---- | :---: | :-------: |
| @huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00 |
| @pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00 |
| @chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00 |

Cmierly commented 2 months ago

Hello and congrats on being accepted as a CNCF Sandbox project!

Here is the link to your onboarding task list: https://github.com/cncf/sandbox/issues/137

Feel free to reach out with any questions you might have!