cncf / sandbox

[Sandbox] StarRocks #59

Open alberttwong opened 10 months ago

alberttwong commented 10 months ago

Application contact emails

albert.wong@celerdata.com, li.kang@celerdata.com, andy.ye@celerdata.com

Project Summary

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.

Project Description

The StarRocks project is an open source, distributed, MPP (Massively Parallel Processing) OLAP database that is designed for high performance and scalability.

StarRocks is needed because there is a growing demand for high-performance OLAP databases that can handle large amounts of data. Traditional OLAP databases, such as Snowflake, Microsoft SQL Server, GCP BigQuery and AWS Redshift, are not designed to handle the massive amounts of data that today's applications generate in real time. StarRocks is designed to address this challenge by providing high performance and scalability for large-scale, real-time data analytics.

Here are some of the key features of StarRocks:

The StarRocks project is still under development, but it has already been adopted by a number of organizations, including Airbnb, Alibaba, Tencent, and JD.com. It is a promising new OLAP database that has the potential to revolutionize the way we analyze data.

Org repo URL (provide if all repos under the org are in scope of the application)

https://github.com/StarRocks/

Project repo URL in scope of application

https://github.com/StarRocks/starrocks

Additional repos in scope of the application

No response

Website URL

https://www.starrocks.io/

Roadmap

https://github.com/StarRocks/starrocks/issues/16445

Roadmap context

No response

Contributing Guide

https://github.com/StarRocks/starrocks/blob/main/CONTRIBUTING.md

Code of Conduct (CoC)

https://github.com/StarRocks/starrocks/blob/main/CODE_OF_CONDUCT.md

Adopters

No response

Contributing or Sponsoring Org

https://celerdata.com/

Maintainers file

https://github.com/StarRocks/starrocks/blob/main/community/membership.md

IP Policy

Trademark and accounts

Why CNCF?

StarRocks can contribute to the CNCF in a number of ways, including:

Being in the CNCF would provide the StarRocks project with a number of benefits, including:

Overall, StarRocks can contribute to the CNCF and benefit from being a member of the CNCF in a number of ways. This can help to make StarRocks a more successful project and make it more accessible to developers.

Benefit to the Landscape

This would provide another real-time analytics OLAP database option in the marketplace.

Cloud Native 'Fit'

We see ourselves as a cloud native database to be deployed on k8s and other cloud technologies.

Cloud Native 'Integration'

k8s, containers and all the other projects in the k8s community.

Cloud Native Overlap

No response

Similar projects

https://landscape.cncf.io/card-mode?category=database&grouping=category

Landscape

Yes at https://landscape.cncf.io/card-mode?category=database&grouping=category&selected=star-rocks

Business Product or Service to Project separation

As a Linux Foundation project, we have already done the work to separate the community products from the sponsoring company.

Project presentations

No response

Project champions

No response

Additional information

No response

jberkus commented 6 months ago

Given the name, can I assume that the base storage is RocksDB?

alberttwong commented 6 months ago

> Given the name, can I assume that the base storage is RocksDB?

No. It's our own storage, with no code from RocksDB.

jberkus commented 6 months ago

When I look at StarRocks, I see an analytics database that's definitely useful in a cloud-native context, but which isn't itself cloud-native, any more than, say, CitusDB or Snowflake would be. Can you explain what makes StarRocks cloud native? There's sections for that in the application, but you haven't filled them out.

alberttwong commented 6 months ago

Before I get to the explanation, how do you define cloud native?

TheFoxAtWork commented 6 months ago

The CNCF TOC maintains the definition of cloud native for the CNCF in our Repository: https://github.com/cncf/toc/blob/main/DEFINITION.md

alberttwong commented 6 months ago

> When I look at StarRocks, I see an analytics database that's definitely useful in a cloud-native context, but which isn't itself cloud-native, any more than, say, CitusDB or Snowflake would be. Can you explain what makes StarRocks cloud native? There's sections for that in the application, but you haven't filled them out.

Based on the cloud native definition, we have a k8s operator that lets you deploy our solution anywhere k8s and containers are supported.

alberttwong commented 6 months ago

@jberkus @TheFoxAtWork I was looking at the CNCF review schedule at https://github.com/orgs/cncf/projects/27 and https://github.com/orgs/cncf/projects/14. Do you have an estimate of when we will be reviewed, and is there any way for us to expedite the process?

amye commented 6 months ago

> I was looking at the CNCF review schedule at https://github.com/orgs/cncf/projects/27 and https://github.com/orgs/cncf/projects/14. Do you have an estimate on when will we be reviewed and is there any way for us to expedite the process?

The next review is January 23rd; we'll move projects that are coming up for review then. Since we pause all projects moving levels or coming in for the six weeks before KubeCon, April 9th is the following review meeting. That's truly as fast as we can do these!

nikhita commented 5 months ago

xref - https://github.com/cncf/toc/issues/889 (since this is already a LF project)

rochaporto commented 2 months ago

Was looking for a reference, did we get a presentation of the project to a TAG? TAG-Storage? @xing-yang

xing-yang commented 2 months ago

@rochaporto We have not seen a presentation from StarRocks at TAG-Storage yet. @alberttwong let us know when you want to present. TAG-Storage meets on the 2nd and 4th Wednesday of every month at 8AM PT. Thanks.

alberttwong commented 2 months ago

@xing-yang No one told us of this requirement. What is TAG-Storage, and what type of presentation do you need? Can you contact me at albert.wong@celerdata.com?

amye commented 2 months ago

Presenting to a TAG is optional for sandbox projects; it provides a better perspective for the TOC. Given that this is scheduled for review next week, it's best to have the TAG Storage folks ask questions in here instead of trying to schedule a presentation. @xing-yang @chira001 for awareness

xing-yang commented 2 months ago

@alberttwong I have sent out an email to you with more details, but I just saw the message from @amye.

alberttwong commented 2 months ago

@amye do you know what we need to prepare for our April 9th meeting? I haven't seen an invite (with the time and details for joining) or any guidance on what to prepare.

amye commented 2 months ago

The TOC meets in a closed meeting to review and discuss; the recording will be published on the CNCF's YouTube channel. Votes open after that meeting and close a week later, on April 16th. There is nothing the projects need to do.

kevin-wangzefeng commented 2 months ago

Curious about the trademark status of StarRocks: According to the LICENSE file, it is StarRocks, Inc. holding the copyright of this project. What is the status of StarRocks, Inc.? Though this is already a LF project, I can't find it in https://www.linuxfoundation.org/legal/trademarks

Ref: https://github.com/StarRocks/starrocks/blob/35f61f77c21a37dc8347c5923501918bb0170667/LICENSE.txt#L1C1-L2C1

alberttwong commented 2 months ago

> What is the status of StarRocks, Inc.?

I'll have to ask ktan@linuxfoundation.org, our contact at LF.

alberttwong commented 2 months ago

@kevin-wangzefeng I have had no response from Kristi (ktan@). Do you know anyone else at LF that we could contact?

jeefy commented 2 months ago

Is this just an artifact that needs to be removed from the license since this project is already under the LF?

Also Kristi no longer works for the LF, so your emails probably bounced. :(

alberttwong commented 2 months ago

@jeefy do you know who Kristi's replacement is? I also emailed tdolezal@linuxfoundation.org and dpalilonis@linuxfoundation.org, and I've gotten no reply.

alberttwong commented 2 months ago

@kevin-wangzefeng @jeefy I just got a reply.

Hi, Albert, you are correct. Just tell CNCF that the name is held by LF Projects, LLC and ask them to connect with me if they have any questions. Thank you,

alberttwong commented 2 months ago

@amye has the video of the meeting been published?

amye commented 2 months ago

https://youtu.be/h63Sg_qDQT8?si=hcknyYV7vSFyom8l - it's in the TOC playlist on the CNCF YouTube channel

alberttwong commented 2 months ago

I saw 2 concerns in the meeting.

  1. Licensing. I believe this is resolved through my cut and paste of my communication with Scott A. Nicholas @ LF.
  2. TAG review. I have that scheduled, but can you modify the application process so that future projects know that they have to go through a TAG review?

alberttwong commented 2 months ago

@mauilion Can you confirm receipt and resolution of the legal concern via my communication with Scott A. Nicholas @ LF?

jberkus commented 2 months ago

If StarRocks already belongs to the Linux Foundation, then what's the motive to move it into the CNCF? It's not really cloud native software.

creatstar commented 2 months ago

@jberkus Thank you for asking. StarRocks is a cloud-native analytical database.

  1. StarRocks' architecture of separating storage and computation supports storing data in various types of cloud service object storage.
  2. StarRocks supports direct queries on the most popular open data lakes today, and it is compatible with cloud catalog services like AWS Glue.
  3. StarRocks supports deployment and management using the K8S connector.

StarRocks' motivation for joining CNCF is to collaborate with other CNCF projects and better leverage the various features of the cloud, providing a superior data analytics experience for enterprise customers.

jberkus commented 2 months ago

@creatstar

I'm not sure that you understand what we mean when we say "cloud native". In particular, we're looking for integration into a stack of open source container-based microservice platforms, not compatibility with AWS. As far as I can tell, StarRocks doesn't support deployment on Kubernetes and it doesn't integrate with other CNCF projects.

StarRocks is a great database, and I'm really glad that it's under LF stewardship. But it's a very poor fit for the CNCF, and joining would just make both the CNCF folks and the StarRocks folks frustrated. (this is my personal opinion as a database geek; I am not a member of the TOC and do not speak for them)

alberttwong commented 2 months ago

@jberkus Maybe I didn't make it clear. When we say that we have an operator, we mean a Kubernetes operator. That means StarRocks supports deployment on Kubernetes: it relies on Kubernetes networking, the horizontal scaling of its containers is managed by Kubernetes, and it requests storage through Kubernetes PVs/PVCs.

TheFoxAtWork commented 2 weeks ago

@xing-yang Was the TAG able to discuss the project sufficiently to form a recommendation for the TOC?

chira001 commented 2 weeks ago

> @xing-yang Was the TAG able to discuss the project sufficiently to form a recommendation for the TOC?

StarRocks has presented to the TAG and, following the presentation, is assessing its roadmap and has put the application on pause for now.

TheFoxAtWork commented 2 weeks ago

Thank you @chira001 !

@mrbobbytables let's move this to postponed until the project is ready to re-engage.

@alberttwong please let us know when the project anticipates being ready to re-engage; this will let us keep track of it for reconsideration.

xing-yang commented 1 week ago

Here's a recording of the StarRocks presentation: https://www.youtube.com/watch?v=zI0mjxqFoCA

By the way, Sida Shen (sida.shen@celerdata.com) gave the presentation. Tyler Wishnoff (tyler.wishnoff@celerdata.com) and Sida have been communicating with TAG Storage.

ss892714028 commented 1 week ago

@TheFoxAtWork @xing-yang Thank you for uploading the recording! This is Sida Shen (sida.shen@celerdata.com) from CelerData. I will be the main contact from StarRocks' side going forward.