Closed TeslaCN closed 1 year ago
Looking forward to seeing this implemented soon.
Hello, this issue has not received a reply for several days, so it is going to be closed.
Since we have discussed the difficulty of developing Vert.x support in ShardingSphere, I'm going to remove the Vert.x driver from ShardingSphere-Proxy soon.
The discussion can be found at https://lists.apache.org/thread/0vd7h44bjjszc5fs2hpftktt4oh4hhw5
The following is the proposal from that discussion.
Split Vert.x code from ShardingSphere into a separate branch
Currently ShardingSphere integrates Vert.x as a database driver for ShardingSphere-Proxy. ShardingSphere-Proxy MySQL does show a certain performance improvement with Vert.x as the database driver compared to JDBC, but the improvement is not as large as expected. During the actual development and use of the Vert.x-based ShardingSphere-Proxy, we encountered many problems:
- Vert.x-based asynchronous code increases coding complexity and debugging costs.
- The existing metadata loading logic is developed on top of JDBC (a blocking I/O model), and the workload of refactoring it into asynchronous Vert.x code is very heavy. As a result, the Vert.x-driven ShardingSphere-Proxy cannot use cluster mode.
- The metadata code is coupled with JDBC, and requires some refactoring before working with Vert.x to decouple the code from JDBC.
- Vert.x does not have a mature solution for distributed transactions, and transactions have not reached a production-ready state.
- The ShardingSphere team does not have much capacity to invest in the Vert.x driver.
- JDBC is the Java standard, whereas Vert.x is not.
- Java 19 introduced virtual threads to improve performance without changing Java's multithreaded programming model. Although the performance of virtual threads has not yet reached an ideal state, they may help ShardingSphere improve performance in the future without a lot of code modification.

Therefore, we intend to separate the current Vert.x code in ShardingSphere into a separate branch for maintenance, to reduce the cost of understanding and maintaining the main codebase.
Concepts
We have worked hard on tuning the performance of ShardingSphere over the past few months, and we found that the synchronous threading model may be a major performance bottleneck. As the backend of ShardingSphere-Proxy, JDBC is inherently synchronous, which limits the Proxy's concurrency. The Proxy communicates with clients via the database protocol, which means users don't need to care how the Proxy interacts with the database, so we are free to look for another solution that avoids the performance bottleneck of JDBC.
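The difference between the two threading models can be sketched in plain Java, independent of any real driver (the `queryBlocking`/`queryAsync` helpers below are illustrative stand-ins, not JDBC or Vert.x API):

```java
import java.util.concurrent.CompletableFuture;

public class ThreadingModelSketch {
    // Stand-in for a blocking JDBC call: the calling thread is pinned
    // until the "database" responds.
    static String queryBlocking(String sql) {
        try {
            Thread.sleep(10); // simulated network round trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result of " + sql;
    }

    // Stand-in for an asynchronous driver call: the caller gets a future,
    // and the thread is free to serve other sessions while I/O is in flight.
    static CompletableFuture<String> queryAsync(String sql) {
        return CompletableFuture.supplyAsync(() -> queryBlocking(sql));
    }

    public static void main(String[] args) {
        // Blocking: one thread handles one in-flight query at a time.
        String r1 = queryBlocking("SELECT 1");
        // Async: the thread could issue further queries before r2 completes.
        String r2 = queryAsync("SELECT 2").join();
        System.out.println(r1);
        System.out.println(r2);
    }
}
```

With many concurrent sessions, the blocking model needs one thread per in-flight query, which is the bottleneck described above.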
Scheme comparison
I did some comparisons and made the following table:
The current backend of ShardingSphere-Proxy is JDBC, which is inherently synchronous and performs poorly in high-concurrency scenarios.
Why not implement the protocol on our own?
Implementing the databases' protocols and doing packet forwarding may minimize performance loss. But there is too much we would need to do, which may take a very long time. For example:
Why not MySQL X Protocol?
Why not R2DBC?
Why Vert.x?
But we would still need to control distributed transactions on our own.
Connection Pool
Vert.x's built-in connection pool has the following parameters:
Transaction
Local Transaction
We can control local transactions just like we did in `LocalTransactionManager`.
Distributed Transaction
An issue was opened 2 years ago, and the conclusion was that we need to manage distributed transactions on our own: https://github.com/eclipse-vertx/vert.x/issues/2939
XA
JTA and XA as used with JDBC cannot be used in Vert.x. Transaction managers such as Atomikos cannot be reused.
BASE
Seata may not be reused.
Integrating Vert.x into ShardingSphere
Phase 1 Vert.x coexists with JDBC (simple, but inelegant and inextensible)
This is how we did it in the preliminary performance research. The JDBC backend and the Vert.x backend coexist until the Vert.x backend becomes mature. For each DataSource, we maintain a corresponding Vert.x pool in the Proxy backend, which means each database has two connection pools (HikariCP and Vert.x Pool). When loading MetaData or using native privileges, we use the JDBC DataSource. When executing CRUD SQL, we use the Vert.x pool to work asynchronously.
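The dual-pool layout of Phase 1 can be sketched as follows. All names here are hypothetical stand-ins for illustration only; the real classes would wrap a HikariCP `DataSource` and a Vert.x `Pool`:

```java
public class DualPoolSketch {
    interface JdbcPool  { String borrow(); } // stand-in for a HikariCP DataSource
    interface VertxPool { String borrow(); } // stand-in for a Vert.x Pool

    // One logical database backed by two physical pools (the Phase 1 layout).
    static final class ProxyBackendDataSource {
        final JdbcPool jdbcPool;
        final VertxPool vertxPool;

        ProxyBackendDataSource(JdbcPool jdbcPool, VertxPool vertxPool) {
            this.jdbcPool = jdbcPool;
            this.vertxPool = vertxPool;
        }

        // Metadata loading and native privileges keep using JDBC.
        String metadataConnection() { return jdbcPool.borrow(); }

        // CRUD SQL goes through the asynchronous Vert.x pool.
        String crudConnection() { return vertxPool.borrow(); }
    }

    public static void main(String[] args) {
        ProxyBackendDataSource ds =
                new ProxyBackendDataSource(() -> "jdbc-conn", () -> "vertx-conn");
        System.out.println(ds.metadataConnection()); // jdbc-conn
        System.out.println(ds.crudConnection());     // vertx-conn
    }
}
```

The downside noted above is visible here: every database pays for two pools, which is why this phase is called simple but inelegant.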
- [x] Decouple `BackendConnection` in the Proxy backend. My idea is to extract an interface `ConnectionSession`. The current `BackendConnection` is renamed to `JDBCBackendConnection`, and a `VertxBackendConnection` is added.
- [ ] Implement the Vert.x executor and callbacks in `shardingsphere-infra-executor` and `shardingsphere-proxy-backend`
- [ ] Add new modules for reactive support in the Proxy. For example, those executors in frontend-mysql that don't interact with the database can be reused by `frontend-reactive-mysql`.
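The first checklist item could look roughly like this. The interface and class names come from the proposal above, but the method set is an assumption; the real interface would expose transaction and resource management:

```java
public class ConnectionSessionSketch {
    // Proposed common abstraction extracted from the current BackendConnection.
    interface ConnectionSession {
        String driverType();
    }

    // The current BackendConnection, renamed to make its JDBC nature explicit.
    static final class JDBCBackendConnection implements ConnectionSession {
        public String driverType() { return "JDBC"; }
    }

    // New implementation backed by the Vert.x SQL client.
    static final class VertxBackendConnection implements ConnectionSession {
        public String driverType() { return "Vert.x"; }
    }

    public static void main(String[] args) {
        // Proxy code would depend only on ConnectionSession, so the two
        // backends can coexist behind the same abstraction.
        for (ConnectionSession session : new ConnectionSession[] {
                new JDBCBackendConnection(), new VertxBackendConnection() }) {
            System.out.println(session.driverType());
        }
    }
}
```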
Phase 2 Decoupling JDBC from ShardingSphere (heavy work, but elegant and extensible)
1 Define Configuration API
A Vert.x connection pool can be created from a URI:
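The original snippet did not survive in this thread; for reference, a Vert.x reactive MySQL client connection URI generally takes the following shape (host, port, credentials, and database name are placeholders):

```
mysql://user:password@localhost:3306/ds_0
```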
2 Decoupling JDBC from ShardingSphere (heavy work)
Many modules are coupled with JDBC. We may need to decouple JDBC from all modules except ShardingSphere-JDBC.
Infra
For example, the class `ShardingSphereResource` holds a `DataSource` map. We may decouple `DataSource` from `ShardingSphereResource` by defining an interface `Resource`. The implementations may be `JDBCResource`, `VertxResource`, or `MySQLXResource` in the future.
MetaDataLoader
There is a lot of code in `MetaDataLoader` coupled with JDBC.
Mode
There is a lot of code in the mode modules coupled with JDBC, and we also need to consider (de)serialization.
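The `Resource` extraction described under Infra could be sketched in plain Java. The interface and implementation names come from the proposal; the `describe` method is a hypothetical placeholder for whatever the real interface would expose:

```java
import java.util.Map;

public class ResourceSketch {
    // Proposed abstraction so ShardingSphereResource no longer
    // depends on DataSource directly.
    interface Resource {
        String describe();
    }

    static final class JDBCResource implements Resource {
        public String describe() { return "JDBC DataSource"; }
    }

    static final class VertxResource implements Resource {
        public String describe() { return "Vert.x Pool"; }
    }

    // ShardingSphereResource would then hold Resource instances
    // instead of a DataSource map.
    static String describeAll(Map<String, Resource> resources) {
        StringBuilder sb = new StringBuilder();
        resources.forEach((name, r) ->
                sb.append(name).append(" -> ").append(r.describe()).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describeAll(Map.of("ds_0", new JDBCResource())));
    }
}
```

A future `MySQLXResource` would simply be another implementation of the same interface, leaving the infra code untouched.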
Phase 3 Removing JDBC from ShardingSphere Proxy
Other References