canonical / dqlite

Embeddable, replicated and fault-tolerant SQL engine.
https://dqlite.io
Other
3.86k stars 216 forks source link

C client implementation #501

Closed cole-miller closed 1 year ago

cole-miller commented 1 year ago

I'm sending this up as one PR, adding commits incrementally. The first commit is just the new header file dqlite/client.h, so that any tweaks to that interface that affect the implementation can get ironed out sooner rather than later (I do have a lot of the implementation written locally).

Signed-off-by: Cole Miller cole.miller@canonical.com

codecov[bot] commented 1 year ago

Codecov Report

Merging #501 (06ddfe8) into master (c3ac4d8) will decrease coverage by 4.15%. The diff coverage is 0.64%.

@@            Coverage Diff             @@
##           master     #501      +/-   ##
==========================================
- Coverage   60.97%   56.82%   -4.15%     
==========================================
  Files          34       35       +1     
  Lines        6340     6801     +461     
  Branches     1884     2013     +129     
==========================================
- Hits         3866     3865       -1     
- Misses       1293     1755     +462     
  Partials     1181     1181              
Impacted Files Coverage Δ
src/client/easy.c 0.00% <0.00%> (ø)
src/client/protocol.c 42.31% <37.50%> (-0.36%) :arrow_down:

... and 1 file with indirect coverage changes

cole-miller commented 1 year ago

Sample usage of the client API:

#include <dqlite/client.h>

int main(int argc, char **argv) {
    int rv;
    if (argc != 2 && argc != 3) { abort(); }
    char *my_addr = argv[1];
    // We only need the boostrap address when starting up for the first time.
    // If we've already initialized and refreshed the local node store at least
    // once, the client implementation will use that information to decide who
    // to contact.
    char *bootstrap_addr = NULL;
    if (argc == 3) { bootstrap_addr = argv[2]; }
    dqlite_cluster *cluster;
    rv = dqlite_cluster_create("./dqlite-stuff/", &cluster);
    if (rv != 0) { abort(); }
    // (dqlite_cluster_start will fail if no bootstrap node was declared and
    // no local node store was found.)
    rv = dqlite_cluster_declare_node(cluster, my_addr, DQLITE_CLUSTER_ME);
    if (rv != 0) { abort(); }
    if (bootstrap_addr) {
        rv = dqlite_cluster_declare_node(cluster, bootstrap_addr, DQLITE_CLUSTER_BOOTSTRAP);
        if (rv != 0) { abort(); }
    }
    rv = dqlite_cluster_start(cluster);
    if (rv != 0) { abort(); }

    dqlite *db;
    rv = dqlite_open(cluster, "mydb", &db);
    if (rv != 0) { abort(); }
    // From this point on, usage looks exactly like SQLite.
    rv = dqlite_exec(db, "CREATE TABLE IF NOT EXISTS foo (n INTEGER)", NULL, NULL, NULL);
    if (rv != 0) { abort(); }
    rv = dqlite_exec(db, "INSERT INTO foo (n) VALUES (1)", NULL, NULL, NULL);
    if (rv != 0) { abort(); }
    dqlite_stmt *stmt;
    rv = dqlite_prepare(db, "SELECT * FROM foo WHERE n = ?", -1, &stmt, NULL);
    if (rv != 0) { abort(); }
    rv = dqlite_bind_int64(stmt, 1, 1);
    if (rv != 0) { abort(); }
    rv = dqlite_step(stmt);
    assert(rv == SQLITE_ROW);
    assert(dqlite_column_int64(stmt, 0) == 1);
    rv = dqlite_step(stmt);
    assert(rv == SQLITE_DONE);
    dqlite_finalize(stmt);
    dqlite_close(db);

    return 0;
}
freeekanayaka commented 1 year ago

The core client part (the functions equivalent to sqlite3) looks good to me. I'd like to understand the cluster part better.

EDIT: I think I have now a better grasp of the cluster part, looks good too to me, modulo minor details can be discussed individually.

freeekanayaka commented 1 year ago

I guess in practice we'll decide what parts of dqlite_node makes sense to expose also to dqlite_cluster and what don't, like we do for the Go app helper.

I don't see many ways to avoid this, since dqlite_cluster is basically a higher-level construct around the lower level dqlite_node.

freeekanayaka commented 1 year ago

After some more thought, I believe I now realize where the the friction between dqlite_cluster and dqlite_node comes from (the fact that dqlite_cluster would need to play "catch-up" with dqlite_node features).

The current APIs both in dqlite and in go-dqlite grew somehow organically after we figured out how dqlite should be consumed. For example, the app.App API of go-dqlite was born after we integrated dqlite into LXD and gathered more experience about how projects would consume dqlite and what their needs are.

Arguably the app.App Go package from go-dqlite doesn't have anything which is strictly Go-specific, and all the functionality/logic it implements is actually required by every project consuming dqlite, independently from the language (Go, C, Python, whatever). That was probably sort of clear even back then when app.App was implemented, but it was still written in Go as there wasn't much time available for develpment (app.App was just a by-product of the implementation of the Kubernetes kine/dqlite backend, which had a tight delivery deadline), and it was also a good way to prototype the core abstractions of the API . We're now starting to fix that: role management is being moved to the C side (and later down the road it can probably be dropped from app.App) and dqlite_cluster will basically move the rest of app.App, implementing equivalent logic in C, and making it easy to reuse it in any language via C bindings. After everything will be implemented in C, app.App will probably won't have much reasons to exist (at least in principle, we'll keep it around for backward compatibility), and new Go bindings could be created that leverage dqlite_cluster. So far so good, and I believe we're all on the same page up to here.

I would go one step forward, and point out that dqlite_cluster is implementing the API that dqlite_node should ideally have had from the very beginning (I'm talking about the functionality, the naming of structs and functions is just a detail). In other words, if we were writing dqlite from scratch now, we would expose the API that dqlite_cluster offers, not the current one of dqlite_node. After dqlite_cluster is in place, there won't be much reason for any dqlite consumer to use dqlite_node, in the same way that there is now little reason for Go consumers to use dqlite.Node instead of dqlite.app.App.

It'd be nice to take this chance to fix our core abstractions (namely dqlite_node).

What I propose is:

Below is a reference sketch, basically just a rename of dqlite_cluster to dqlite_server.

It has also a small adjustment, breaking up dqlite_cluster_declare_node into dqlite_server_set_address and dqlite_server_set_peers. I'm not quite sure here of the best API of this, in terms of naming and semantic, just wanted to stress the fact that this is a local config not a cluster one, this is a bit of a detail and suggestions are welcome (perhaps we can keep the API same as dqlite_cluster_declare_node and just change the name, but I could not come up with a good one).

typedef struct dqlite_server dqlite_server;
typedef int (*dqlite_connect_func)(void *, const char *, int *);

/**
 * Create a dqlite server that will be part of a dqlite cluster and begin
 * configuring it.
 *
 * @path is the path to a directory where the dqlite server will store
 * persistent state. The directory must exist before calling this function.
 * On success, a pointer to the created object is stored in @server.
 *
 * No reference to the string pointed to by @path is kept after this
 * function returns.
 */
int dqlite_server_create(const char *path, dqlite_server **server);

/**
 * Set the network address of the given dqlite server.
 *
 * Other dqlite servers must be able to connect to this server using the
 * given address.
 *
 * This function should be called when starting a server for the very first
 * time. After that first start, calling this function is optional, since the
 * address will be stored locally and will be automatically used during
 * subsequent restarts. If called again, this function should be passed the
 * same value as the first time.
 * 
 * @address is the (abstract) address of the server.
 *
 * No reference to the string pointed to by @address is kept after this function
 * returns.
 */
int dqlite_server_set_address(dqlite_server *server,
                const char *address);

/**
 * Inform this server about existing servers already part of the cluster.
 * 
 * This function should be called when starting for the very first time a
 * server that is meant to join an existing cluster of dqlite servers .
 * 
 * If the server is meant to be the first server of a new cluster (i.e. it's
 * the "bootstrap" server), then this function should not be called or
 * it should be passed an empty list of peers.
 *
 * After that first start, calling this function is optional, and it will be a
 * no-op (information about the cluster members is stored locally and
 * refreshed at regular intervals).
 *
 * @peers is an array of (abstract) addresses of existing servers of the
 * cluster that this server should join. The list does not need to be
 * exhaustive and contain all the existing servers of the cluster, even just
 * one existing server is enough, as long as it's currently running.
 * @n_peers is the length of the @peers array.
 *
 * No reference to the strings contained in @peers or to @peers itself is
 * kept after this function returns.
 */
int dqlite_server_set_peers(dqlite_server *server,
                const char **peers,
                unsigned n_peers);

/**
 * Set the address on which the server will listen for requests
 * from clients and messages from other servers.
 *
 * The format of the @address argument must be one of
 *
 * 1. "<HOST>"
 * 2. "<HOST>:<PORT>"
 * 3. "@<PATH>"
 *
 * Where <HOST> is a numeric IPv4/IPv6 address, <PORT> is a port number, and
 * <PATH> is an abstract Unix socket path. The port number defaults to 8080 if
 * not specified. In the second form, if <HOST> is an IPv6 address, it must be
 * enclosed in square brackets "[]". In the third form, if <PATH> is empty, the
 * implementation will automatically select an available abstract Unix socket
 * path, which can then be retrieved with dqlite_server_get_bind_address().
 *
 * If an abstract Unix socket is used the dqlite server will accept only
 * connections originating from the same process.
 *
 * No reference to the memory pointed to by @address is kept, so any memory
 * associated with them can be released after the function returns.
 */
int dqlite_server_set_bind_address(dqlite_server *server,
                       const char *address);

/**
 * Customize the function that the server will use to connect to other servers.
 *
 * The same callback will also be used by the client returned by dqlite_server_client().
 * 
 * The callback may be invoked concurrently on multiple threads, and is responsible for handling
 * any required synchronization internally.
 */
int dqlite_server_set_connect_func(dqlite_server *server,
                    dqlite_connect_func connect,
                    void *connect_arg);

/**
 * Configure the target number of voting nodes for the server's cluster.
 *
 * This must be an odd number and should be set to the same value
 * on all servers of the cluster.
 */
int dqlite_server_set_voters(dqlite_server *servers, unsigned n);

/**
 * Configure the target number of standby nodes for server's cluster.
 *
 * This must be set to the same value on all servers of the cluster.
 */
int dqlite_server_set_standbys(dqlite_server *server, unsigned n);

/**
 * Launch the server.
 * 
 * If it the first time the server is started, and it's not the bootstrap
 * server, it will attempt to join the target cluster configured using
 * dqlite_server_set_peers().
 */
int dqlite_server_start(dqlite_server *server);

/**
 * Create a client for connected to server's cluster.
 *
 * This a helper around dqlite_open(), that automatically passes it the members
 *  of server's cluster.
 */
int dqlite_server_client(dqlite_server *server, const char *name, dqlite **db);
cole-miller commented 1 year ago

I personally don't think duplicating API surface between dqlite_node and dqlite_cluster (or whatever we end up calling it) is a problem (it's not very much code and exposing a new option on dqlite_cluster is a very small diff) -- and I'd rather not block work on the client interface on making a 2.0 of the server interface. Maybe it's possible to work on the client independently for now, but in a way that keeps the option of updating the server as you describe later on?

cole-miller commented 1 year ago
int dqlite_server_set_peers(dqlite_server *server,
              const char **peers,
              unsigned n_peers);

I think this interface is not as good as dqlite_cluster_declare_node (happy to bikeshed the name):

freeekanayaka commented 1 year ago

I personally don't think duplicating API surface between dqlite_node and dqlite_cluster (or whatever we end up calling it) is a problem (it's not very much code and exposing a new option on dqlite_cluster is a very small diff) -- and I'd rather not block work on the client interface on making a 2.0 of the server interface. Maybe it's possible to work on the client independently for now, but in a way that keeps the option of updating the server as you describe later on?

It's not about the diff or about duplication. It's about the fact that dqlite_node should not be used and keeping both API sets just promotes confusion.

What I'm proposing would not block client work in any way, I'm not sure why you think so. It's a very small name change on top of what you are suggesting, to help users understand what API they should use and not end up with a confusing set of two APIs, one of which is effectively legacy (dqlite_node) and the other one with a name that feels to be managing clusters (dqlite_cluster) but it's actually managing nodes and is effectively the successor of dqlite_node. So, in practice what I'm suggesting is s/dqlite_cluster/dqlite_server/ moving it to dqlite.h + deprecating dqlite_node. That's it. YMMV, it's your work..

freeekanayaka commented 1 year ago
int dqlite_server_set_peers(dqlite_server *server,
            const char **peers,
            unsigned n_peers);

I think this interface is not as good as dqlite_cluster_declare_node (happy to bikeshed the name):

  • it requires an additional allocation for the array of peers
  • it does not support declaring the node's own abstract address with DQLITE_CLUSTER_ME

That'd done with dqlite_server_set_address.

or the bootstrap node's abstract address with DQLITE_CLUSTER_BOOTSTRAP. The first is necessary so each instance (I've been using this term in my head for the pair of "server + associated client") can add itself to the cluster on startup.

Right, that be:

dqlite_server_set_address(s, "1.2.3.4")
const char *peers[1] = {"3.4.5.6"} 
dqlite_server_set_peers(s, peers, 1);

Assuming 3.4.5.6 is the address of an existing server, then the server in the code above will start with address 1.2.3.4 and join the cluster.

and the second is useful to optimize the first startup and also to avoid this caveat in the new docstring:

If the server is meant to be the first server of a new cluster (i.e. it's the "bootstrap" server), then this function should not be called or it should be passed an empty list of peers.

The thing is that you don't generally need to pass DQLITE_CLUSTER_BOOTSTRAP. Any address of a running existing server in the cluster is sufficient. Actually, after some days of weeks the original DQLITE_CLUSTER_BOOTSTRAP node might have even be removed from the cluster. What I'm trying to say is that in order for a new node to start and join the cluster all it needs is its own address and the address of at least one other node in the cluster. It does not need all nodes and it does not need to know what of those nodes is the bootstrap node. So requiring or encouraging passing that information seems an extra burden. Again, YMMV.

cole-miller commented 1 year ago

Sorry, it's my fault for responding glibly while I was feeling frustrated and not taking the time to understand what you were suggesting. I appreciate your feedback and will try to be more measured.

The thing is that you don't generally need to pass DQLITE_CLUSTER_BOOTSTRAP. Any address of an running existing server in the cluster is sufficient. Actually, after some days of weeks the original DQLITE_CLUSTER_BOOTSTRAP node might have even be removed from the cluster. What I'm trying to say is that in order for a new node to start and join the cluster all it needs is its own address and the address of at least one other node in the cluster. It does not need all nodes and it does not need to know what of those nodes is the bootstrap node.

My thinking -- which I didn't make sufficiently clear, sorry -- was that it's useful to know which node is the bootstrap node, because that node will be the leader when the cluster is starting for the first time, so that's where we should send the initial ADD request for each starting node. But that might not be important.

I think your proposed split of dqlite_server_set_address + dqlite_server_set_peers and my dqlite_cluster_declare_node accomplish the same goals, and I'd be fine with either of them.

As for the dqlite_node -> dqlite_server proposal, again, I have to apologize for being glib and responsing without taking the time to digest your comment. I designed dqlite/client.h to be completely independent from dqlite.h, with the idea that everything you'd need to e.g. replace SQLite with dqlite in your application would be declared in dqlite/client.h, leaving dqlite.h as the basis for consumers like go-dqlite that want to take the client side into their own hands. But it would also be perfectly fine to do what you propose and have dqlite_server as a type that's shared between those two headers, and I'll defer to you on which is better.

freeekanayaka commented 1 year ago

Sorry, it's my fault for responding glibly while I was feeling frustrated and not taking the time to understand what you were suggesting. I appreciate your feedback and will try to be more measured.

Ok, no worries!

The thing is that you don't generally need to pass DQLITE_CLUSTER_BOOTSTRAP. Any address of an running existing server in the cluster is sufficient. Actually, after some days of weeks the original DQLITE_CLUSTER_BOOTSTRAP node might have even be removed from the cluster. What I'm trying to say is that in order for a new node to start and join the cluster all it needs is its own address and the address of at least one other node in the cluster. It does not need all nodes and it does not need to know what of those nodes is the bootstrap node.

My thinking -- which I didn't make sufficiently clear, sorry -- was that it's useful to know which node is the bootstrap node, because that node will be the leader when the cluster is starting for the first time, so that's where we should send the initial ADD request for each starting node. But that might not be important.

It's important, but there's a solution, because if you don't know who the leader is but you know the address of any other existing node, than that node can tell you who the leader is, and the ADD can be directed to it. You'll likely going to get that behavior for free, if you use the client machinery to send the ADD request: the client will have code to find the leader anyways.

So the problem of not knowing what the bootstrap node is, has a solution.

However requiring to know what the bootstrap node is, has no easy solution I think, for example in the case you're adding a node much later in the lifecycle of the cluster, and the original bootstrap node is not there anymore. You can surely make the proposed version of dqlite_cluster_declare_node workaround that, but I'm raising a flag here to point that there are some rough edges that might end up being confusing.

I think your proposed split of dqlite_server_set_address + dqlite_server_set_peers and my dqlite_cluster_declare_node accomplish the same goals, and I'd be fine with either of them.

Indeed they are mostly equivalent, and there are drawbacks in each case. We can also think about other options. I feel that talking about these aspects is important, it's easy to get public APIs out in the wild, but once they are out you can't take them back. So in my mind it's worth spending time designing an aspect that will be around for years, to make sure most of the ramifications have been taken into account. I personally don't feel completely set on either of the two proposals yet, and would be happy to think about them more and looking at them from various angles.

Note that this shouldn't block the implementation, since these type of API-level changes are usually not very laborious once the core implementation is there.

As for the dqlite_node -> dqlite_server proposal, again, I have to apologize for being glib and responsing without taking the time to digest your comment. I designed dqlite/client.h to be completely independent from dqlite.h, with the idea that everything you'd need to e.g. replace SQLite with dqlite in your application would be declared in dqlite/client.h, leaving dqlite.h as the basis for consumers like go-dqlite that want to take the client side into their own hands. But it would also be perfectly fine to do what you propose and have dqlite_server as a type that's shared between those two headers, and I'll defer to you on which is better.

We should take everything into account, and see how everything fits in the broader picture.

I think we're mostly saying the same thing: for what I can see the dqlite/client.h you are describing is what every dqlite consumer will want to use, including go-dqlite which implements its own client (or any other language binding that implements its own client). The reason is that the dqlite/client.h in this PR is not only a client, but also has the dqlite_cluster features which are essential for all dqlite consumers. If dqlite_cluster had been there before go-dqlite existed, go-dqlite would have used that instead of re-inventing the wheel with the app.App package.

I don't have an opinion about whether dqlite.h should be separate from dqlite/client.h. We could also put everything in dqlite.h, since we're most probably not going to ship 2 different shared libraries (one for the client and one for the server), so having separate headers is somehow arbitrary. Unless I'm missing something.

If we put everything in dqlite.h, the goal of having a single file that has everything you need to replace SQLite in your application is preserved. And the goal of exposing the newer and richer dqlite_cluster/dqlite_server API more properly would be preserved too: it would supersede the deprecated dqlite_node API, and non-C consumers like go-dqlite that implement their own native client would use the dqlite_cluster/dqlite_server C bindings for the server, instead of dqlite_node which is weaker and puts more burden on them.

cole-miller commented 1 year ago

Okay, putting everything in dqlite.h and making dqlite_cluster (or whatever we're calling it) a full replacement for dqlite_node sounds good to me. Thanks for the feedback!

freeekanayaka commented 1 year ago

Okay, putting everything in dqlite.h and making dqlite_cluster (or whatever we're calling it) a full replacement for dqlite_node sounds good to me. Thanks for the feedback!

Sounds good thanks! I'd suggest dqlite_server to make it clear that its functions act on a single node and not on the cluster as a whole.

cole-miller commented 1 year ago

@freeekanayaka If we're going to overhaul dqlite.h, should we get rid of or deprecate the dqlitevfs* declarations while we're at it? I don't know who is using them (not go-dqlite) and they feel to me like they're exposing implementation details that should really be private.

freeekanayaka commented 1 year ago

@freeekanayaka If we're going to overhaul dqlite.h, should we get rid of or deprecate the dqlitevfs* declarations while we're at it? I don't know who is using them (not go-dqlite) and they feel to me like they're exposing implementation details that should really be private.

Those were meant to be part of a longer term effort to have a sort of "plumbing" layer for dqlite useful for testing and customization. We're not there yet, and in any case we might want those APIs to live in a separate header file, not in the main dqlite.h "porcelain" header.

We could start by adding a DEPRECATED prefix to their docstring, as for all dqlite_node_xxx functions and indicate that they are being kept in dqlite.h just for backward compatibility with the 1.x series, but there is no guarantee about them in the 2.x series. Or alternatively, since we don't expect them to have real-world consumers, they could already be moved to some other place, e.g. a new ./include/dqlite/internal.h header, not meant for public, or something like that.

cole-miller commented 1 year ago

Oh, it seems GCC doesn't correctly implement #pragma GCC diagnostic ignored in all cases, that's quite annoying.

freeekanayaka commented 1 year ago

I agree that kind of extensibility could be useful in the future, but it seems like it would be simple enough to do what SQLite does by adding a dqlite_open_v2 with the signature you suggest in a future release, while offering dqlite_open with the simple semantics. Does that make sense to you?

I'm personally okay with the _v2 approach for changing/extending existing APIs. There was some push back on this approach from other people in the past, but if everyone in the current dqlite team is fine with that we could decide that the _v2-type approach is what we'll do in general in these cases and so we don't need to future-proof new APIs too hard.

cole-miller commented 1 year ago

(Sorry for the previous botched comment, that was GitHub's attempt to render my reply to an email thread...)

cole-miller commented 1 year ago

I'm just pushing up implementation commits now, one or two functions at a time. Feel free to review, but I might still rewrite history as I notice mistakes/points of incompleteness

freeekanayaka commented 1 year ago

I'm just pushing up implementation commits now, one or two functions at a time. Feel free to review, but I might still rewrite history as I notice mistakes/points of incompleteness

Thanks.

Would it be possible to break this down into several PRs? Ideally each PR would implement a coherent bit of work towards the final goal, with its own set of commits.

For example:

Note that these PRs don't need to be the "final" version of the code they contain, they can be just minimum viable versions that might be subject to further changes and incremental PRs (either to extend them or to change them a bit).

All of this would make reviewing (our side) and addressing review comments (your side) much easier and more manageable.

freeekanayaka commented 1 year ago

PS: you can also wait a bit longer before start pushing the first PRs, if the associated code (for example, say, struct node_store_cache) is still highly unstable and you need more time to stabilize it and rewrite history, as you mentioned yourself.

But it'd be still be beneficial to have that work broken down into several PRs and sub-commits, which, again, by no means need to be final. Even in case of change it will be easier to review incrementally and frequently vs having a very big PR that takes a lot of time to become final.

cole-miller commented 1 year ago

Sure, I'll split it up.

cole-miller commented 1 year ago

Closing in favor of #507 and future small PRs