axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License
2.81k stars 399 forks source link

Header-only C++ compliant fork #957

Open the-moisrex opened 11 months ago

the-moisrex commented 11 months ago

For my own web framework, I needed to have header-only C++ compliant version of liburing, so I've forked it and did that.

There are still some UBs and Compiler-Specific stuff that I need to sort out, but right now it's a drop-in replacement.

I would like to have liburing's repo itself to host the header-only version of this as well so both header-only and future versions of liburing can be packages in one repo and grow together; but there are non-compatible changes in the header-only version which CAN be managed but not merged directly.

So I'm here to see what @axboe and other people here think about growing liburing itself to provide other options as well, like:

Some of the changes so far:

There are more small changes as well, and I have probably created a few problems as well, but I hope I can fix them.

Here's the link to the fork containing the changes said above.

cmazakas commented 10 months ago

I'm really not sure what the motivation is to make liburing header-only.

In my experience, C++ devs typically only want this because it "solves" distribution of their library but this is a poor solution to the problem.

What's the actual motivation for such a change?

Also, liburing's main API is header-only, which is what we want. I don't see much point in duplicating the definitions for such things as io_uring_submit() across multiple TUs. And I say this as a C++ dev writing my own I/O runtime.

the-moisrex commented 10 months ago

because it "solves" distribution of their library, that's the motivation, and no, it's not a poor solution to the problem.

Also, what do you mean liburing's main API IS header-only? How am I gonna use a non-header-only library in a header-only way?

cmazakas commented 10 months ago

Making a library arbitrarily header-only to solve the issue of distribution is a poor solution to the problem.

The proper thing to do is ship your project as something like a CMake project and then use find_package() which is idiomatic.

To me, there's no benefit here to making liburing header-only.

One thing you could argue is that maybe liburing should also install a Findliburing.cmake module when it's installed as a debian or installed to the configured directory during ./configure. I think that'd be a good "fix".

Also, what do you mean liburing's main API IS header-only?

Every io_uring_prep_* function is a static inline in the liburing.h header file.

Here, this is a working Findliburing.cmake file you can use: https://github.com/cmazakas/fiona/blob/3b40e456626363dd521348c2d270cd58e0c8dfad/cmake/Findliburing.cmake

You use it by having something like this in your CMake toolchain file:

list(APPEND CMAKE_PREFIX_PATH /home/exbigboss/cpp/__install__)

where my cpp/__install__ path is the one I used in ./configure --prefix=... when building liburing.

the-moisrex commented 10 months ago

I feel like you're confusing the word "header-only" with just a project with a couple of .h files; you're underestimating the scope of a possible header-only library, look at boost libraries, or even mine here.

And I feel like you're deliberately ignoring all other changes that I have made and plan to do:

liburing is so reliant on compiler-specific features and undefined behaviors that you CAN'T use it in a STANDARD C, let alone C++ (the emphasis is on the word "standard").

The -pedantic friendly part that I mentioned refers to liburing RELYING on UBs; and that reliant is in the header files; which means you can't use this library in a project that restricts the compiler to use standard c/c++ (e.g. passing -pedantic -std=c18)

Also, one-liner static inline functions are not the main part of this library, you can't use anything from this library without linking to the dynamic parts of the library; I'm confused why you even mentioned that.

cmazakas commented 10 months ago

I can't really speak to the UB you claim to have found in liburing's impl. But on some level, everything is UB deep down but you just say, "I know the system well enough, it's gonna work" and we just move on with life. Considering liburing is Linux-only and is going to stay Linux only for a long time, I don't necessarily see any issues here because everything is clean with ubsan + asan for me.

What I meant by header-only was, the important parts of liburing are the io_uring_prep_* functions. But those are header-only so you don't have to pay the cost of an opaque function call to use something well-named. Imo, this is good enough and gives users exactly what they want. io_uring_submit() can stay compiled, for example.

I guess I just don't see the issue with liburing not being header-only. If anything, a non-compiled version of liburing is most likely actively worse.

As for compiling liburing's source by eschewing the autotools dance, I don't recommend this. liburing knows how it should be built, let it be built how it wants to be built.

If you're having trouble with packaging and distribution, look into vcpkg which, surprisingly, doesn't require maintaining a literal transliteration of an entire library.

Also, just include liburing as a system header and your issues about -pedantic or w/e will go away.

the-moisrex commented 10 months ago

UB means: just because it works, doesn't mean it should.

liburing is a good library, if it wasn't, I wouldn't have tried to improve/fix it; but that doesn't mean just because it's useful for many project, you can use everywhere; the attempt to make it C/C++ compliant, and make it header-only, only expands the places where this library could be used.

Just because it's been good enough for your project, doesn't mean it's good enough for mine.

There are still a few things I need to fix, though.

I did open up issues regarding the compiler-specific features and UBs and -pedantic here, but those seem to be ignored for a long time, so I had to do them myself and if I have to do them myself, I'm gonna do them my way.

cmazakas commented 10 months ago

You haven't answered the question.

A library being header-only does nothing to expand the scope of users.

Why does liburing need to have a header-only fork? What's the purpose here?

the-moisrex commented 10 months ago

Why does liburing need to have a header-only fork? What's the purpose here?

Because that's how open source works!

cmazakas commented 10 months ago

Right.

I think you're in over your head here. I'll explain more when I can actually type for real and not on my phone.

the-moisrex commented 10 months ago

I think you're in over your head here.

Maybe, or maybe not! 🤔

One can only know by trying.

cmazakas commented 10 months ago

Right, so here's the thing.

liburing is an io_uring wrapper full of syscalls. It's also mainly written by a kernel dev. Kernel devs are aware of -pedantic and what UB is.

The authors understand what flags are required to build liburing and how to build it reliably.

You're pushing very hard for a "UB-free" -pedantic compliant version of liburing's source code. This is an anti-pattern for multiple reasons.

It means you're assuming that all software packages can be built under the same set of flags as your project's when the reality is, this doesn't work in industry or practice.

This is why you should respect the autotool workflow.

For reference, building liburing is easy:

./configure --prefix=/my/install/prefix && make -j20 && make install

Second, making liburing header-only doesn't help with distribution because liburing is licensed however it's licensed. Source or compiled source doesn't really matter, you're limited here entirely by the licensing.

Third, if you put something as large as io_uring_submit() into headers, all you're doing is bloating the end-user's binaries.

Imagine the user has 100 translation units and they use your header-only fork, each unit including the same liburing header-only headers. This would be an absurd amount of duplication.

In short:

the-moisrex commented 10 months ago

Kernel devs are aware of -pedantic and what UB is.

Yes, he knows; but I've waited long enough for a simple response from him, let alone addressing these issues; so I'm just resolving them the way that I know.

It means you're assuming that all software packages can be built under the same set of flags as your project's when the reality is, this doesn't work in industry or practice.

The opposite, the UBs and -pedantic errors ARE IN THE HEADERS; that means you can't include the library headers if you specify certain flags.

This is why you should respect the autotool workflow.

CMake is pretty de-facto too; I've got nothing against autotools either, but I NEED CMAKE TOO.

Second, making liburing header-only doesn't help with distribution because liburing is licensed however it's licensed. Source or compiled source doesn't really matter, you're limited here entirely by the licensing.

I'm not a lawyer, I don't know much about licenses, but I don't see why licenses would be an issue here since it's MIT, GPL, and LGPL; (I think I accidentally removed a couple of them that I should add back)

Imagine the user has 100 translation units and they use your header-only fork, each unit including the same liburing header-only headers. This would be an absurd amount of duplication.

  1. If user wants a single TU, then they can use the library in a TU and provide wrappers themselves.
  2. io_uring_submit and others are not that big, liburing.h is already big enough, it won't affect the compile time that much.
  3. I'm aiming for C++ as well.
  4. I have to consider C++20 modules as well because they're coming.

you haven't presented a valid use-case for liburing being header-only

I do need it header-only because my own library is header only for now; I'm not against a non-header-only version, you can open up an issue in there and let's discuss that; we can easily provide both versions if required without code duplication.

I believe you're having trouble accepting this is how systems libs are distributed

This is a library, other libraries have already started implementing io_uring themselves and they won't rely on liburing for that; and also C++ with its P2300 Signals and Receivers and its Networking TS, might need to use io_uring in its implementations soon enough, so you can't expect the standard library to rely on liburing for God sake.

So no, I consider liburing as a library, not a system library and certainly not a MUST-HAVE library.

you haven't thought enough about the ramifications of making everything inline

Open up an issue about this and let's talk about it.

stop trying to build liburing's source alongside your own

You have to build the header files at least, even that doesn't work in mine.

BenBE commented 10 months ago

Why does liburing need to have a header-only fork? What's the purpose here?

Because that's how open source works!

The freedom to be able to make a fork does not imply that this makes any sense or helps with the overall situation. To the contrary, most forks just made the situation worse; or were forgotten about quick enough to avoid any impacts …

After Heartbleed people were eager to fork OpenSSL. About 10 years in, there are 2 forks, that are either abandoned or barely maintained at all. If you stuck with the original OpenSSL back then you are far better of now even though the project took some time to clean up the accumulated heaps of its tirefire.

And other forks usually don't end well either. Success stories are rare.

If you want to build a local fork for yourself: Go ahead. If you want changes upstream: Convince upstream WHY it should take them. But just proclaiming "But mine is better and shinier" ain't gonna cut it.

Personal note: I once had someone try to fork a project of mine. First thing that person did was reformat all source and after that badly integrate some pending PRs I didn't yet came around to work on. And all that fork did was cause trouble upstream because of this reformatting work I had to redo all the work of integrating the other PRs, that I could have used otherwise (AFAIR there were even some bugs in how those PRs were integrated, i.e. they didn't even do a clean job).

Kernel devs are aware of -pedantic and what UB is.

Yes, he knows; but I've waited long enough for a simple response from him, let alone addressing these issues; so I'm just resolving them the way that I know.

Which issues are you referring to? Are there PRs/patches included with the reports? Were concerns and feedback you received on these reports addressed?

It means you're assuming that all software packages can be built under the same set of flags as your project's when the reality is, this doesn't work in industry or practice.

The opposite, the UBs and -pedantic errors ARE IN THE HEADERS; that means you can't include the library headers if you specify certain flags.

Can you point to issues/PR with reports of these? Can you show a concrete instance, where the UB actually misbehaves on a supported target platform?

That said: I usually aim for a quite strict set of compilation flags for my projects myself, thus having a library I use do similar is much appreciated. Can you point to specific issues you noticed that haven't been addressed yet?

This is why you should respect the autotool workflow.

CMake is pretty de-facto too; I've got nothing against autotools either, but I NEED CMAKE TOO.

Make a PR with a license-clean Findliburing.cmake file for distribution with this project and I think this can be added to its distribution.

NB: AFAIK there's a config file for pkgconf included, which makes use of this library a no-brainer in cmake anyway … ;-)

Second, making liburing header-only doesn't help with distribution because liburing is licensed however it's licensed. Source or compiled source doesn't really matter, you're limited here entirely by the licensing.

I'm not a lawyer, I don't know much about licenses, but I don't see why licenses would be an issue here since it's MIT, GPL, and LGPL; (I think I accidentally removed a couple of them that I should add back)

Which might make your fork non-compliant of the licenses …

Imagine the user has 100 translation units and they use your header-only fork, each unit including the same liburing header-only headers. This would be an absurd amount of duplication.

There are options with most modern compilers that will usually consolidate most duplicated functions into only one instance. Not as bad as implied, but given potential inlining not entirely impossible.

  1. If user wants a single TU, then they can use the library in a TU and provide wrappers themselves.

Or just use a statically built version of the library and be done with it.

  1. io_uring_submit and others are not that big, liburing.h is already big enough, it won't affect the compile time that much.

The great thing about static libraries is you only pull in what you need. There are even some compiler flags that can make including only part of a TU possible …

  1. I'm aiming for C++ as well.

I'm not sure, but as long as the extern "C" {…} guards are there, there's not too much trouble with C++ compat (apart from some syntax restrictions that apply for the common subset). No need for an entirely separate fork of the library.

  1. I have to consider C++20 modules as well because they're coming.

One of the few features I'm sure I'll not be using for a long time … PCH with C++ just don't work and C++20 modules won't change much in that regard.

I believe you're having trouble accepting this is how systems libs are distributed

This is a library, other libraries have already started implementing io_uring themselves and they won't rely on liburing for that; and also C++ with its P2300 Signals and Receivers and its Networking TS, might need to use io_uring in its implementations soon enough, so you can't expect the standard library to rely on liburing for God sake.

I don't see anyone here having trouble with this being the case. This implementation is a reference implementation that anyone is free to use, but nobody is forced to. If you feel the itch to run your own implementation, go ahead.


With that much said: I've seen enough header-only libs in both C and C++ (stb, Boost, …) that just aiming at header-only usually raises a red flag for me. And even single-file libraries (e.g. PH7) don't leave a good impression either …

I think, apart from the compliance to modern C/C++ standards that would be desirable, the current state of liburing is good as is; it's not perfect by any means, but there are far more pressing issues here than trying to fit everything into one header.

2ct

the-moisrex commented 10 months ago

But just proclaiming "But mine is better and shinier" ain't gonna cut it.

I'm trying to address issues that this library doesn't seem to address, that's all.

First thing that person did was reformat all source

I did a lot more than just reformat :), the whole file structure have changed.

Can you show a concrete instance, where the UB actually misbehaves on a supported target platform?

UB is UB for a reason, you don't need to SHOW its misbehave.

Can you point to issues/PR with reports of these?

There are others as well by other people who mentioned these things, they're closed now; but more importantly that these are the libraries like Nvidia's stdexec and others that instead of using this library, implemented their own; this is only going to spiral out of control if this library doesn't address the needs of C++ developers.

One of the few features I'm sure I'll not be using for a long time … PCH with C++ just don't work and C++20 modules won't change much in that regard.

They better get the job done because the amount of C++ templates used in the libraries are out of control and only a good C++20 modules can save it (I hope)

I don't see anyone here having trouble with this being the case. This implementation is a reference implementation that anyone is free to use, but nobody is forced to. If you feel the itch to run your own implementation, go ahead.

That's not the point I was making. The point was that this library is just a library, and the only reason why it's not being used, is because of it not being a good choice for C++ projects, specially those which need to be standard compliant.

With that much said: I've seen enough header-only libs in both C and C++ (stb, Boost, …) that just aiming at header-only usually raises a red flag for me. And even single-file libraries (e.g. PH7) don't leave a good impression either …

Header only is definitely a red flag for C libraries, but for C++, a lot of time it's just a necessity because of the templates and what not.

but there are far more pressing issues here than trying to fit everything into one header.

Header-only != single-header