apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0

Discussion about clickhouse support in the OAP #11924

Closed sglztc closed 7 months ago

sglztc commented 7 months ago

Apache SkyWalking Component

OAP server (apache/skywalking)

What happened

Our company uses ClickHouse for storage. Can SkyWalking support ClickHouse? Its performance is relatively good. Over the past few days, converting the storage layer to ClickHouse has taken quite a lot of work. Does the project have any official recommendations for this conversion, or would it allow others to merge such work, even into a dedicated branch for reference? There is currently nothing available on using ClickHouse for storage.

What you expected to happen

I very much hope the project can officially support ClickHouse. Many people are asking for this, and most observability projects currently use ClickHouse. Alternatively, the project could allow others to merge an implementation into a dedicated branch for reference. Currently, there is nothing about using ClickHouse for storage.

Anything else

No response

wu-sheng commented 7 months ago

There is no official team to do this. This has been very clear for years. We have decided to move to our own database, BanyanDB.

If you want to do something, go ahead, and contribute to help others.

wu-sheng commented 7 months ago

We have never rejected a pull request for it. No one has taken on the maintenance responsibility, and no pull request has been submitted.

cxntsh commented 7 months ago

We have an initial implementation of some ideas. If you would like to discuss further, you can send me an email.

wu-sheng commented 7 months ago

@sglztc Your comments have been deleted. All discussions on GitHub need to be in English. The Slack CN channel is for Chinese.

sglztc commented 6 months ago

> We have an initial implementation of some ideas. If you would like to discuss further, you can send me an email.

I have already contacted them; the commercial version requires a fee.

sglztc commented 6 months ago

What is the specific reason why ClickHouse storage is not supported? Are there any pitfalls in this area that I should avoid in my design? I am very much looking forward to the author's answer.

sglztc commented 6 months ago

> We have an initial implementation of some ideas. If you would like to discuss further, you can send me an email.

Could you share the table design for this part? We can implement the code ourselves.

wu-sheng commented 6 months ago

> What is the specific reason why ClickHouse storage is not supported? Are there any pitfalls in this area that I should avoid in my design? I am very much looking forward to the author's answer.

I think I have said this very clearly. There is no particular reason why. No one is actually contributing and maintaining this upstream. Do you know anyone who wants to do this? You said we rejected something, but we didn't.

sglztc commented 6 months ago

Hello wu-sheng, you have misunderstood me. What I want to know is whether there are any pitfalls in the process of adapting to ClickHouse. I would like to know the specific reasons why ClickHouse is not supported, because we are currently doing research and implementation in this area and want to avoid detours. Do I only need to implement storage-clickhouse-plugin to use ClickHouse storage, or are deeper changes needed?

wu-sheng commented 6 months ago

> What I want to know is whether there are any pitfalls in the process of adapting to ClickHouse.

I don't know.

> I would like to know the specific reasons why ClickHouse is not supported, because we are currently doing research and implementation in this area and want to avoid detours.

The only reason is the one I mentioned. No one has come forward to say they will implement it and maintain the implementation to make sure it is production ready.

> Do I only need to implement storage-clickhouse-plugin to use ClickHouse storage, or are deeper changes needed?

I have no knowledge of ClickHouse. From the design perspective, SkyWalking exposes module APIs to make implementation easier. So, yes, we hope you only need to implement that. The reality is different: sometimes you will need annotations or some advanced flags at the kernel level to get good performance. You can see that Elasticsearch- and JDBC-related flags are in the core module code.

So, as you can see, ClickHouse-related discussions fall into the same hole every time. People want me or someone on the SkyWalking maintenance team to tell them how, but the truth is that we as a group neither know ClickHouse nor use it. So there is no magic and no shortcut. You have to understand both the SkyWalking OAP kernel details and the features of ClickHouse, then write the design, do the coding, and run performance tests to prove that this new storage option really is good.

wu-sheng commented 6 months ago

The existing offering is being kept private and commercial only, which also suggests that this would not be that easy. So, you have to make your own decision. I could only join the discussion if you are going to donate this upstream and have at least two committers taking responsibility for maintaining it. Once it lacks maintenance, it will be removed, like the IoTDB and InfluxDB storage options. That is how an open source community works.

sglztc commented 6 months ago

thanks

sglztc commented 6 months ago

By analyzing the JDBC and Elasticsearch storage source code, we found that using ClickHouse for storage is feasible. We can learn from the Elasticsearch storage and use a wide-table design in ClickHouse to take full advantage of its columnar strengths. This is still under active development. I hope it gives some inspiration to developers who are using ClickHouse as storage. Suggestions are welcome, and we can discuss them together.
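To make the wide-table idea concrete, here is a minimal sketch that generates a ClickHouse `CREATE TABLE` statement for a single table shared by several metrics. It is only an illustration under assumptions: the table name `metrics_all`, the column set, the chosen types, and the sort key are hypothetical and are not taken from SkyWalking or from any existing ClickHouse plugin.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Column:
    name: str
    ch_type: str  # ClickHouse column type, e.g. "Int64" or "String"


def wide_table_ddl(table: str, columns: List[Column], order_by: List[str]) -> str:
    """Build a CREATE TABLE statement for one wide MergeTree table."""
    known = {c.name for c in columns}
    missing = [k for k in order_by if k not in known]
    if missing:
        raise ValueError(f"ORDER BY references unknown columns: {missing}")
    cols = ",\n".join(f"    `{c.name}` {c.ch_type}" for c in columns)
    keys = ", ".join(f"`{k}`" for k in order_by)
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n{cols}\n) "
        f"ENGINE = MergeTree\nORDER BY ({keys})"
    )


# Hypothetical columns: many metrics share one table and are told apart
# by metric_name, instead of each metric getting its own table.
METRIC_COLUMNS = [
    Column("time_bucket", "Int64"),
    Column("entity_id", "String"),
    Column("metric_name", "LowCardinality(String)"),
    Column("value", "Int64"),
]

DDL = wide_table_ddl(
    "metrics_all", METRIC_COLUMNS, ["metric_name", "entity_id", "time_bucket"]
)
print(DDL)
```

Whether one shared table performs better than one table per metric would still have to be shown by performance tests against real OAP workloads.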

wu-sheng commented 6 months ago

Once you have a proposal with implementation details, please open a discussion to explain your design. If you then want to bring this upstream (if accepted), you need to prepare a SWIP.