apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

Doris Roadmap 2022 #7502

Closed morningman closed 7 months ago

morningman commented 2 years ago

The following is the roadmap for the Doris community in 2022. The plan covers code features, documentation, community building, and more, including work that is yet to be developed, work already under development, and work that is complete but requires ongoing optimization.

The plan is currently under discussion, so if you have comments or suggestions on any aspect of the plan or beyond, please feel free to leave a comment or send an email to dev@doris.apache.org.

We will gradually create issues or Jira tickets for each direction of the plan to describe and track progress in detail. Developers who wish to contribute are also welcome to create issues directly and associate them with this one (just leave a comment).

The directions marked (Good First Issue) in the plan are relatively independent modules, which makes them suitable as starter tasks for developers who are new to Doris. If you are interested in one of these directions, please contact us at dev@doris.apache.org or comment under this issue, and we will provide detailed guidance, help, and discussion.

The directions marked with (Q1) are the current work to be completed in the first quarter of 2022. We will update the schedule and progress of other directions gradually.

The directions marked (Done & Optimizing) are complete but need continuous optimization, such as ease-of-use improvements, feature additions, and documentation additions.

We encourage developers to discuss anything on the dev mailing list. To subscribe, please refer to How to subscribe.

Features

Performance Optimization

Stability and Observability

Testing

Functional Optimization

Deployment and Maintenance

Peripheral Ecology

Community

yiguolei commented 2 years ago

For regression and performance testing, we could follow ClickHouse's testing approach. If that is acceptable, I could work on this.

yiguolei commented 2 years ago

Clang compilation is already in progress, see https://github.com/apache/incubator-doris/pull/7451

EmmyMiao87 commented 2 years ago

Could you please start an email thread to discuss the Doris Roadmap 2022?

yangzhg commented 2 years ago

Support for the Parquet file storage format should be added too, right?

wangshuo128 commented 2 years ago

Please consider supporting cross-version upgrades.

Henry2SS commented 2 years ago

What about supporting AVRO format in LOAD function?

zbtzbtzbt commented 2 years ago

Looking forward to the push-based pipeline engine @morningman-cmy @yiguolei

hf200012 commented 2 years ago

Doris Manager:

  1. Follow-up Doris Manager upgrade
  2. User UI interaction improvements
  3. Doris Manager support for automated Doris upgrades

924060929 commented 2 years ago

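Our company already has a regression test framework. It uses a Groovy DSL to test SQL, run stream load, install TPC-H, and so on. Its rough usage is shown in the image below. We can contribute it to the community later.

(image)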

jackwener commented 2 years ago

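Since there is so much planned work, it would be worthwhile to create an RFC directory for the community part and put the design docs of large PRs in it. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.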

morningman commented 2 years ago

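Since there is so much planned work, it would be worthwhile to create an RFC directory for the community part and put the design docs of large PRs in it. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.

Good idea. Do you have any RFC templates we could use as a reference?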

jackwener commented 2 years ago

This is cockroach's practice.

caiconghui commented 2 years ago

What about supporting AVRO format in LOAD function?

#7650

Henry2SS commented 2 years ago

What about supporting AVRO format in LOAD function?

#7650

Thx for opening an issue.

morningman commented 2 years ago

What about supporting AVRO format in LOAD function?

#7650

Thx for opening an issue.

Added to the roadmap

hf200012 commented 2 years ago

#7680 Data export function supports exporting to databases, Kafka, etc.

hf200012 commented 2 years ago

#7678 Support for the max_by and min_by aggregate functions
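For readers unfamiliar with these aggregates, here is a minimal Python sketch of the semantics they usually have in other engines: `max_by(x, y)` returns the `x` value from the row with the largest `y`, and `min_by(x, y)` the `x` value from the row with the smallest `y`. This is an illustration only, not Doris code. The function signatures, the dict-based rows, and the handling of empty input and ties are assumptions made for this example.

```python
# Illustrative sketch only: hypothetical helpers, not the Doris implementation.
# Rows are plain dicts; an empty input returns None (an assumption).

def max_by(rows, value_key, order_key):
    """Return value_key from the row whose order_key is largest."""
    if not rows:
        return None
    # max() keeps the first row it sees among ties (tie behavior is an assumption).
    return max(rows, key=lambda row: row[order_key])[value_key]


def min_by(rows, value_key, order_key):
    """Return value_key from the row whose order_key is smallest."""
    if not rows:
        return None
    return min(rows, key=lambda row: row[order_key])[value_key]


# Example data: which user has the highest / lowest score?
scores = [
    {"user": "a", "score": 3},
    {"user": "b", "score": 9},
    {"user": "c", "score": 1},
]
top_user = max_by(scores, "user", "score")
bottom_user = min_by(scores, "user", "score")
```

Here `top_user` is the `user` of the row with the largest `score`, and `bottom_user` the `user` of the row with the smallest.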

huligong1234 commented 2 years ago

Support the decimal data type in create table as select statements. (detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as select statement)

morningman commented 2 years ago

#7680 Data export function supports exporting to databases, Kafka, etc.

#7678 Support for the max_by and min_by aggregate functions

Added to the Roadmap

morningman commented 2 years ago

Support the decimal data type in create table as select statements. (detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as select statement)

Added to the roadmap

yiguolei commented 2 years ago

Could we use a vectorized method to optimize the load process?

i7xh commented 2 years ago

Why does Doris need a push-based query execution engine?

yiguolei commented 2 years ago

@i7xh Two examples:

  1. Currently Doris's concurrency control is based on tablets: one tablet maps to one exec fragment, so only one thread processes that data in the query engine. With a push-based engine, the number of computing threads could be expanded at run time.
  2. In a pull engine, if a fragment contains three or more nodes, such as scan --> filter --> agg, only one node executes at a time. In a push-based engine, nodes can execute asynchronously; for example, scan and agg could run at the same time.
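
The two points above can be sketched in Python. This is a toy illustration, not Doris code: every name here (`pull_sum`, `push_sum`, `num_workers`, `batch_size`) is hypothetical. The pull version is an iterator chain driven by one thread. The push version has a scan task push batches into a queue while a configurable number of worker threads filter and aggregate concurrently.

```python
import queue
import threading

# --- Pull model: the consumer pulls rows through an iterator chain. ---
# One thread drives scan -> filter -> agg, so only one node runs at a time.

def scan(rows):
    for row in rows:
        yield row

def filter_even(source):
    for row in source:
        if row % 2 == 0:
            yield row

def pull_sum(rows):
    return sum(filter_even(scan(rows)))

# --- Push model: the scan pushes batches downstream. ---
# The number of aggregation workers is chosen at call time.
# (Python threads only show the structure here; real parallel speedup
# would need a runtime without a global interpreter lock.)

_DONE = object()  # sentinel telling a worker that no more batches will arrive

def push_sum(rows, num_workers=2, batch_size=4):
    batches = queue.Queue()
    partials = []
    lock = threading.Lock()

    def scan_task():
        for start in range(0, len(rows), batch_size):
            batches.put(rows[start:start + batch_size])
        for _ in range(num_workers):
            batches.put(_DONE)  # one sentinel per worker

    def agg_task():
        local = 0
        while True:
            batch = batches.get()
            if batch is _DONE:
                break
            local += sum(row for row in batch if row % 2 == 0)
        with lock:
            partials.append(local)

    threads = [threading.Thread(target=scan_task)]
    threads += [threading.Thread(target=agg_task) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(partials)

data = list(range(10))
pull_result = pull_sum(data)
push_result = push_sum(data, num_workers=3)
```

Both functions compute the same sum of even values. The difference is who drives execution, and that the push version can run scan and aggregation at the same time with a thread count picked per call.
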
i7xh commented 2 years ago

Provides Schemaless semantics for fast analysis of semi-structured data

Json Parsing Optimization

Many use cases involve schema-less or semi-structured data, which calls for JSON optimization in particular. Is the main aim to decouple from schema changes?

i7xh commented 2 years ago

(image: screenshot)

Eagerly looking forward to this.

lordk911 commented 2 years ago

When will the vectorized query engine be released?

kuncle commented 2 years ago

When will the Decimal(38,18) data type be supported?

kpfly commented 2 years ago

When will the Decimal(38,18) data type be supported?

This feature will be released as an experimental feature in version 1.2.0, which will be released at the end of this month.

kuncle commented 2 years ago

When will the Decimal(38,18) data type be supported?

This feature will be released as an experimental feature in version 1.2.0, which will be released at the end of this month.

cool, thanks.

mengzhisy commented 1 year ago

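I'd like to ask whether the "Pipeline execution engine" will make it easier to implement multi-threaded execution and thereby greatly improve multi-core utilization. I tested Doris with TPC-DS, and multi-core utilization currently seems rather low.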