superfly / litefs

FUSE-based file system for replicating SQLite databases across a cluster of machines
Apache License 2.0

Is a proxy needed? #96

Closed gedw99 closed 2 years ago

gedw99 commented 2 years ago

Web servers using LiteFS don't know which LiteFS server to read from or write to.

So I presume a proxy is needed, so that this is automatically managed for them?

Also, what about a SQL statement that has a write and a read in it? Again, I presume the proxy would handle this?

https://github.com/CECTC/dbpack is a Go DB proxy that does what I am suggesting.

gedw99 commented 2 years ago

https://github.com/superfly/litefs/issues/42 seems to suggest that the application code must direct which server is used?

benbjohnson commented 2 years ago

@gedw99 I'm writing up some docs right now that should hopefully make this all clearer. You'll need some kind of proxy for writes but not necessarily for reads. Only a single node can be a primary at any given time but all nodes in the cluster have a full copy of the database. If your application can conform to GET being read-only and POST/PUT/PATCH/DELETE/etc being read/write then it's fairly easy to implement an http.Handler as a proxy in front of your application.

With Fly.io, we have a fly-replay response header that can be set so that requests can be replayed on a different node. However, you can use any kind of proxying layer you want, such as the httpproxy library that's in golang.org/x.

I'm not sure dbpack would be the best solution as it adds a network communication layer into the transaction. One of the best parts of SQLite is there is such little transaction overhead since the database is in-process.

> also what about a SQL statement that has a write and read in it?

If you are running a write query on a node that is not the primary then you'll get an error back from SQLite. There is an issue open for Write Forwarding (#56) but that's meant to simplify infrequent write transactions such as migrations. It has a performance hit to it.

gedw99 commented 2 years ago

Thanks @ben, yeah, I remember the excellent blog writeup about how the proxy does replay, and also the "delayed sticky" LB that sticks you to the write DB until your mutation has hit all the read-only DBs. Nice, practical stuff. I like it.

fly-replay: https://fly.io/docs/getting-started/multi-region-databases/ is not open source, I think? I don't blame Fly for not opening up their proxy layer. But a proxy for LiteFS is going to be needed to realise its potential.

Anyway, looking forward to the docs! I would be happy to dogfood them.

benbjohnson commented 2 years ago

> fly-replay: https://fly.io/docs/getting-started/multi-region-databases/ is not open source, I think? I don't blame Fly for not opening up their proxy layer. But a proxy for LiteFS is going to be needed to realise its potential.

That's correct, the Fly proxy isn't open source but it's just one way to proxy. AFAIK, though, it's not too complicated of a proxy mechanism. The proxy sends the request to a node and if that node returns a response with the fly-replay header then the proxy will resend the request to another node. The request is a stream of bytes so it just needs to buffer the header in order to replay to another node.

However, you could also run a proxy at the application level instead of in the reverse proxy.

> Anyways, looking forward to the docs !!! would be happy to dog food the docs.

I put up a rough cut of 3 pages of docs yesterday on the new docs site: https://fly.io/docs/litefs/

It has a Getting Started guide, a How it Works guide, and a reference on the config settings.

gedw99 commented 2 years ago

Thanks. Good points. The docs look great, btw.

For a LiteFS DB, it would normally have to be an L4 proxy since it's TCP?

benbjohnson commented 2 years ago

The proxying is entirely up to the application so it'd be L7. The application needs to make a decision on whether a request is read-only or read/write since any node can service read-only requests. If your application handles GET requests as read-only and everything else as read-write then HTTP methods are an easy way to generically proxy HTTP.

gedw99 commented 2 years ago

OK, thanks. Will try this out.