gatewayd-io / gatewayd

☁️ Cloud-native database gateway and framework for building data-driven applications ✨ Like API gateways, for databases ✨
https://gatewayd.io
GNU Affero General Public License v3.0

Enable multi-pool client connections #398

Open mostafa opened 6 months ago

mostafa commented 6 months ago

At the moment, GatewayD supports a pair of connection pools for managing available and busy connections. On the one hand, these pools enable many incoming clients, limited by the capacity of the pools, to be connected to a single database server. On the other hand, each client connection is mapped to a single server connection. This works, but is not ideal.

The idea here is to let GatewayD connect to multiple databases at the same time per configuration group, a.k.a. tenant. Each configuration group should have configuration for multiple clients, pools and proxies, and a single server. Note that clients, pools and proxies have named groups, yet there will be a single server listening for all of these databases. The distributionStrategy and splitStrategy config parameters of the server object will decide how connections are routed to their corresponding database connections (clients).

This should eventually enable a range of possibilities, including routing, switching and relaying, to name a few.

This is a possible task list:

This is what the global config might look like after this change, yet other ideas should be explored.

clients:
  default:
    active-writes: # <- This is the "named config group"
      network: tcp
      address: localhost:5432
      ...
    standby-reads:
      network: tcp
      address: localhost:5433
      ...

pools:
  default:
    active-writes:
      size: 10
    standby-reads:
      size: 10

proxies:
  default:
    active-writes:
      healthCheckPeriod: 60s
    standby-reads:
      healthCheckPeriod: 60s

servers:
  default:
    distributionStrategy: ab-testing # other candidates: canary, write-read-split (?), round-robin, murmur-hash, etc.
    splitStrategy:
      active-writes: 90 # percent
      standby-reads: 10 # percent (automatically deduced)
    ...
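
As a rough illustration of how a percentage-based splitStrategy could map onto named config groups, here is a minimal Go sketch. It is not GatewayD code: the function name `pickGroup`, the map-based representation of the weights, and the idea of passing in a precomputed roll are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// pickGroup (hypothetical helper) maps a roll in [0, 100) onto a named
// config group according to percentage weights, for example
// {"active-writes": 90, "standby-reads": 10}.
// Group names are sorted so that the mapping is deterministic.
func pickGroup(split map[string]int, roll int) (string, error) {
	names := make([]string, 0, len(split))
	total := 0
	for name, pct := range split {
		if pct < 0 {
			return "", fmt.Errorf("negative weight for %q", name)
		}
		names = append(names, name)
		total += pct
	}
	if total != 100 {
		return "", fmt.Errorf("weights sum to %d, want 100", total)
	}
	if roll < 0 || roll >= 100 {
		return "", fmt.Errorf("roll %d out of range [0, 100)", roll)
	}
	sort.Strings(names)
	upper := 0
	for _, name := range names {
		upper += split[name]
		if roll < upper {
			return name, nil
		}
	}
	return "", fmt.Errorf("no group found for roll %d", roll)
}

func main() {
	split := map[string]int{"active-writes": 90, "standby-reads": 10}
	// A real server might derive the roll from a random source
	// or from a hash of the client address (assumption).
	for _, roll := range []int{0, 89, 90, 99} {
		group, err := pickGroup(split, roll)
		fmt.Println(roll, group, err)
	}
}
```

Keeping the roll as an input makes the selection easy to test and leaves open whether the source of randomness is a random generator or a hash such as murmur.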

Related

Resources

mostafa commented 3 months ago

Hey @likecodingloveproblems,

Would you like to work on this feature? This enables a lot of possibilities.

likecodingloveproblems commented 3 months ago

Actually I am interested :)

mostafa commented 3 months ago

Awesome! :raised_hands:

likecodingloveproblems commented 3 months ago

@mostafa I think this schema is the correct one, not the one above:

clients:
    active-writes: # <- This is the "named config group"
      network: tcp
      address: localhost:5432
      ...
    standby-reads:
      network: tcp
      address: localhost:5433
      ...

pools:
    active-writes:
      size: 10
    standby-reads:
      size: 10

proxies:
    active-writes:
      healthCheckPeriod: 60s
    standby-reads:
      healthCheckPeriod: 60s

servers:
  default:
    distributionStrategy: ab-testing # other candidates: canary, write-read-split (?), round-robin, murmur-hash, etc.
    splitStrategy:
      active-writes: 90 # percent
      standby-reads: 10 # percent (automatically deduced)
    ...

mostafa commented 3 months ago

@likecodingloveproblems Now that I think about it again, the way you wrote it looks better, and it is also less nested. The run command, and subsequently the server object, should then be aware of mismatched sets of config groups, especially in count and label, and act accordingly based on the given parameters. You can find instances of [name], which refer to the config objects, in the cmd/run.go file; these will eventually become objects stored in these global variables.
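
The check for mismatched config groups could look roughly like the sketch below. It assumes the group labels of each section have already been extracted into string slices; `validateGroups` and `missingGroups` are hypothetical names and do not come from cmd/run.go.

```go
package main

import (
	"fmt"
	"sort"
)

// missingGroups returns the labels present in reference but absent from other.
func missingGroups(reference, other []string) []string {
	seen := make(map[string]bool, len(other))
	for _, name := range other {
		seen[name] = true
	}
	var missing []string
	for _, name := range reference {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

// validateGroups (hypothetical) reports an error unless clients, pools,
// and proxies declare exactly the same set of group labels.
func validateGroups(clients, pools, proxies []string) error {
	sections := []struct {
		name   string
		groups []string
	}{
		{"pools", pools},
		{"proxies", proxies},
	}
	for _, s := range sections {
		if m := missingGroups(clients, s.groups); len(m) > 0 {
			return fmt.Errorf("%s is missing groups %v defined in clients", s.name, m)
		}
		if m := missingGroups(s.groups, clients); len(m) > 0 {
			return fmt.Errorf("clients is missing groups %v defined in %s", m, s.name)
		}
	}
	return nil
}

func main() {
	clients := []string{"active-writes", "standby-reads"}
	pools := []string{"active-writes", "standby-reads"}
	proxies := []string{"active-writes"} // standby-reads is missing on purpose
	fmt.Println(validateGroups(clients, pools, proxies))
}
```

Whether a mismatch should abort startup or only log a warning is a design decision this sketch leaves open.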

likecodingloveproblems commented 3 months ago

I am trying to draw the ERD, and I have two questions:

  1. In the config file, we have a server object. We must relate proxies to servers, since we want a one-to-many relationship.
  2. Currently, when a connection is opened or closed, the OnOpen and OnClose functions on the Server get the server's Proxy and look up the Connection in a map in O(1). Once the relation between Server and Proxy becomes one-to-many, we would have to loop over all the proxies to find the connection's Proxy. Do you have any idea for a better design?

mostafa commented 3 months ago
  1. Correct.
  2. We should figure out either 1) which proxy the connection wants to connect to or 2) which proxy the server wants to assign the connection to. Then we can store this information in a pool object inside the server, just like how available and busy connections work. Since the pool object is generic, the key can be the connection and the value can be the proxy it was assigned to.
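
The connection-to-proxy bookkeeping described above could be sketched as a small generic, concurrency-safe map. This is an assumption-laden illustration rather than GatewayD's actual pool type: `Assignments` and its methods are hypothetical names, and strings stand in for the real connection and proxy objects.

```go
package main

import (
	"fmt"
	"sync"
)

// Assignments is a hypothetical stand-in for a generic pool that
// remembers which proxy each connection was assigned to.
type Assignments[K comparable, V any] struct {
	mu sync.RWMutex
	m  map[K]V
}

// NewAssignments returns an empty assignment table.
func NewAssignments[K comparable, V any]() *Assignments[K, V] {
	return &Assignments[K, V]{m: make(map[K]V)}
}

// Put records the proxy for a connection, e.g. when it is opened.
func (a *Assignments[K, V]) Put(conn K, proxy V) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.m[conn] = proxy
}

// Get looks up the proxy for a connection in O(1).
func (a *Assignments[K, V]) Get(conn K) (V, bool) {
	a.mu.RLock()
	defer a.mu.RUnlock()
	proxy, ok := a.m[conn]
	return proxy, ok
}

// Pop removes and returns the proxy for a connection, e.g. when it is closed.
func (a *Assignments[K, V]) Pop(conn K) (V, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	proxy, ok := a.m[conn]
	delete(a.m, conn)
	return proxy, ok
}

func main() {
	table := NewAssignments[string, string]()
	table.Put("conn-1", "active-writes")  // on open
	fmt.Println(table.Get("conn-1"))      // on traffic
	fmt.Println(table.Pop("conn-1"))      // on close
	fmt.Println(table.Get("conn-1"))      // no longer tracked
}
```

With such a table, the server no longer has to loop over all proxies to find the one that owns a connection.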

mostafa commented 2 months ago

Hey @likecodingloveproblems,

I saw the awesome work you did in this PR. Is this still in progress?

likecodingloveproblems commented 2 months ago

Hi @mostafa, thanks for your interest. Yes, I am working on it on weekends :)

likecodingloveproblems commented 2 months ago

I think the split strategy is not related to the server and should instead be a parameter of the distribution strategy. Also, some strategies, such as round-robin, don't need a split strategy at all.
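
One way to picture this is a strategy interface where each implementation owns whatever parameters it needs. The sketch below is only an assumption about how that might look: `Strategy`, `RoundRobin`, and `NewRoundRobin` are hypothetical names, and only round-robin is shown because it needs no split parameters.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Strategy (hypothetical) picks the config group for the next connection.
type Strategy interface {
	Next() string
}

// RoundRobin cycles through the groups in order and carries no weights.
type RoundRobin struct {
	counter uint64 // kept first so 64-bit atomic access stays aligned
	groups  []string
}

// NewRoundRobin rejects an empty group list so Next can never divide by zero.
func NewRoundRobin(groups []string) (*RoundRobin, error) {
	if len(groups) == 0 {
		return nil, errors.New("round-robin needs at least one group")
	}
	return &RoundRobin{groups: groups}, nil
}

// Next returns the groups in rotation and is safe for concurrent use.
func (r *RoundRobin) Next() string {
	n := atomic.AddUint64(&r.counter, 1) - 1
	return r.groups[n%uint64(len(r.groups))]
}

func main() {
	rr, err := NewRoundRobin([]string{"active-writes", "standby-reads"})
	if err != nil {
		panic(err)
	}
	var s Strategy = rr
	for i := 0; i < 3; i++ {
		fmt.Println(s.Next())
	}
}
```

A weighted strategy such as canary or A/B testing could implement the same interface and take the split percentages in its own constructor, which keeps them out of the server object.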

mostafa commented 1 week ago

Hey @likecodingloveproblems,

Any updates here?

sinadarbouy commented 3 days ago

@mostafa If I understand correctly, by removing nesting in clients, pools, and proxies, we can separate the server from the group of clients, pools, and proxies. This allows us to have two servers using the same proxy. Is that correct?

mostafa commented 3 days ago

@sinadarbouy No. The idea is to connect a single server instance to multiple proxies, each with different pools and a different set of clients.