canonical / dqlite

Embeddable, replicated and fault-tolerant SQL engine.
https://dqlite.io

Cross platform support #593

Open AshfordN opened 4 years ago

AshfordN commented 4 years ago

In the readme it says that the Linux AIO API is used for disk I/O, which prevents this library from being cross platform. However, if the library already uses libuv, is there a reason why libuv isn't used as an abstraction for disk I/O?

freeekanayaka commented 4 years ago

The reason is that libuv's disk I/O uses a threadpool and hence is not fully asynchronous.

AshfordN commented 4 years ago

But what implications does that have exactly, apart from (maybe) performance? I'm interested in adding support for Windows, but from what I'm seeing the AIO API is tightly coupled with the rest of the API without much abstraction. This makes it difficult to build out support for other platforms, despite the invitation to do so. I'm still new to the code base and I haven't fully learned the internals yet, but my first instinct would be to leverage libuv's disk I/O abstraction for Windows support. However, I'm not sure if this is practical.

freeekanayaka commented 4 years ago

Yes, the only implication is performance.

And yeah, you're right, unfortunately the implementation is currently a little bit coupled. I have planned to introduce an abstraction (also to move to the new io_uring Linux API, which is better than AIO), but haven't had time yet.

I think practically speaking the easiest way would be to fork and modify uv_writer.c to use libuv's disk I/O facilities instead of kernel AIO. That file is where most of the AIO-dependent code is.

After you get it working, we can see how to best abstract the two implementations and merge it.

freeekanayaka commented 4 years ago

Actually, now that I think of it, we don't even need a new interface. Basically the abstraction that is already in place is the raft_io interface, for which the project provides a stock implementation based on libuv. The work to do would be to detect the build host at compile time and, if it's Windows, fall back to libuv's support for disk I/O. Those #ifdef's should live in uv_writer.c.
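A minimal sketch of the compile-time selection described above, assuming the check is done with the compiler's predefined `_WIN32` macro (defined by both MSVC and MinGW). The enum, the `WRITER_BACKEND` macro and both functions are hypothetical names used only for illustration; they are not taken from uv_writer.c or from the raft_io interface.

```c
/* Hypothetical backend identifiers, for illustration only. */
enum writer_backend {
    WRITER_BACKEND_KERNEL_AIO, /* Linux kernel AIO */
    WRITER_BACKEND_UV_FS       /* libuv's threadpool-based disk I/O */
};

/* Select the backend at compile time. _WIN32 is predefined by compilers
 * targeting Windows, so no configure-time check is needed for it. */
#if defined(_WIN32)
#define WRITER_BACKEND WRITER_BACKEND_UV_FS
#else
#define WRITER_BACKEND WRITER_BACKEND_KERNEL_AIO
#endif

/* Return the backend that was compiled in. */
enum writer_backend writer_backend(void)
{
    return WRITER_BACKEND;
}

/* Human-readable name of a backend, e.g. for log messages. */
const char *writer_backend_name(enum writer_backend backend)
{
    switch (backend) {
        case WRITER_BACKEND_KERNEL_AIO:
            return "kernel-aio";
        case WRITER_BACKEND_UV_FS:
            return "uv-fs";
    }
    return "unknown";
}
```

In a real port the two branches of the `#if` would presumably wrap the actual submission code rather than an identifier, but the shape of the conditional would be the same.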

AshfordN commented 4 years ago

Introduction

These are a few of my thoughts/observations based on my analysis of the code.

Naming conflict

The following function, defined in heap.h, conflicts with a function of the same name, defined in Windows' heapapi.h:

void HeapFree(void *ptr);

Obviously, this problem only occurs when targeting Windows, but the naming scheme of the functions defined in heap.h should probably be reviewed. For now, I've renamed them to MyHeapxxx() to avoid conflict, but I won't PR that change. I'll leave it up to you to decide on the best strategy here.
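One possible way to sidestep the clash, sketched under the assumption that the helpers are thin wrappers over the C allocator: give them a library-specific prefix so they can never collide with a platform header. The `LibHeap` prefix and both function bodies are hypothetical and only illustrate the idea; they are not the project's actual naming scheme or implementation.

```c
#include <stdlib.h>

/* Windows' heapapi.h already declares
 *     BOOL HeapFree(HANDLE hHeap, DWORD dwFlags, LPVOID lpMem);
 * so an unprefixed HeapFree(void *) cannot coexist with it.
 * A hypothetical library-specific prefix avoids the collision. */

/* Allocate size bytes; returns NULL on failure. */
void *LibHeapMalloc(size_t size)
{
    return malloc(size);
}

/* Release memory obtained from LibHeapMalloc; NULL is accepted. */
void LibHeapFree(void *ptr)
{
    free(ptr);
}
```

Callers would then use `LibHeapMalloc()` and `LibHeapFree()` on every platform, with no Windows-specific renaming needed.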

Progress

I haven't had much free time lately, but I've tried doing some work on implementing Windows support - mostly in uv_writer.c as you said. However, there were some other things that needed to be addressed, like the naming conflict above and appropriate includes for uv_ip.h. So far, I've been able to successfully complete the build process using:

autoreconf -i
./configure --host=x86_64-w64-mingw32
make

But the examples still need some work. Also, I haven't tested the resulting libraries, so my changes are likely buggy. All changes are on my branch.

rabits commented 3 years ago

Hello folks, I'm working on https://github.com/canonical/raft/pull/173 to continue this effort in the macOS direction, and found that during raft_add, applyChange runs before leader_state.change is actually set, which is needed for raftChangeCb to run and return something - that causes the second node to hang.

This part in particular is bugging me: https://github.com/canonical/raft/blob/v0.9.25/src/client.c#L201-L207 - on Linux it works just fine, but on macOS applyChange is actually executed before the r->leader_state.change = req; line. I tried to move that line before the clientChangeConfiguration call (like here: https://github.com/canonical/raft/blob/v0.9.25/src/client.c#L277-L288 ) - but ended up with a segfault... Maybe someone could suggest the proper way to fix that?

rabits commented 3 years ago

Ok, I finally found the actual issue and the PR now contains the working patch.