pingcap / community

TiDB community content
Apache License 2.0

Incubating program: ServerlessDB for HTAP #483

Open xuegang opened 3 years ago

xuegang commented 3 years ago

Describe the feature or project you want to incubate:

Summary

We want to provide serverless database services based on TiDB in the cloud, focusing on how to dynamically scale compute and storage nodes up and down as the business load changes, with zero impact perceived by users. The goal is for the database service to always keep background resources matched to the business load, helping users maximize cost savings.

Motivation

While TiDB offers cloud services, there are a number of issues.

  1. When ordering, users must select compute node and storage node specifications, and it is hard to choose well: they pick something too small or too large, or the business load simply cannot be estimated in advance, so they may never land on the right specifications.
  2. When the business load rises, someone must manually decide when to expand capacity, which resources to expand, and by how much. In practice it is difficult to respond in time when the load changes very rapidly, which causes business performance to fluctuate.
  3. When the business load drops, someone must manually decide when to shrink capacity, which resources to shrink, and by how much. A wrong judgment causes business performance to fluctuate.
  4. If the business load changes very frequently, scaling out and in by hand is a heavy burden. If the system is not scaled out, business performance degrades; if it is not scaled in, resources are wasted.
  5. It is difficult to scale up or down without users noticing, and connection pools or long-lived connections make it harder in both directions. When scaling out, existing pooled or long-lived connections stay on the old compute nodes, so the load cannot be spread onto the newly added ones. When scaling in, zero user awareness cannot be guaranteed, because killing a compute node that still holds a connection makes the client report an exception.
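
To make the manual decision in items 2 to 4 concrete (when to scale, and by how much), here is a minimal sketch of a proportional scaling rule with a tolerance band. It is not the design proposed in this issue and not a TiDB API: the function name `desired_nodes`, the 60% target, the 10-point tolerance, and the node limits are all hypothetical values chosen for illustration. Utilization is passed as an integer percentage so the arithmetic stays exact.

```python
def desired_nodes(current: int, utilization_pct: int, target_pct: int = 60,
                  tolerance_pct: int = 10, min_nodes: int = 1,
                  max_nodes: int = 16) -> int:
    """Suggest a compute-node count for the observed average utilization.

    All parameters are illustrative assumptions, not values from the proposal.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target_pct < 1:
        raise ValueError("target_pct must be at least 1")

    # Inside the tolerance band, keep the current size so that small
    # load fluctuations do not trigger repeated scale-out and scale-in.
    if abs(utilization_pct - target_pct) <= tolerance_pct:
        return current

    # Proportional rule: ceil(current * utilization / target),
    # computed with integer arithmetic to avoid float rounding.
    wanted = (current * utilization_pct + target_pct - 1) // target_pct

    # Clamp to the allowed cluster size.
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    print(desired_nodes(4, 90))  # load rose: suggests scaling out
    print(desired_nodes(4, 65))  # within the band: size unchanged
    print(desired_nodes(4, 15))  # load dropped: suggests scaling in
```

A rule like this only covers the sizing decision. The connection problem in item 5 (rebalancing pooled connections onto new nodes and draining them before a node is removed) needs separate handling.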

Estimated Time

180 days

sykp241095 commented 3 years ago

LGTM non-binding

winkyao commented 3 years ago

LGTM

ti-chi-bot commented 3 years ago

@winkyao: Please use the GitHub review feature instead of /lgtm [cancel] when you want to submit a review to the pull request. For how to use the GitHub review feature, see also this document provided by GitHub.

For the reason we dropped support for these commands, see also this page. This reply is being used as a temporary reply during the migration of the review process and will be removed on July 1st.

sunxiaoguang commented 3 years ago

LGTM