Closed aubi1kenobi closed 1 year ago
The log "iouring:register_files is enabled" indicates that io_uring is probably initialized ok, and it supports registered files. But the following "free(): invalid pointer" doesn't give us a clue. Can you show us its call stack?
> Maybe a simple doc with all the 'gotchas' of using iouring (or better, a specific iouring sample, client socket please (not files and not server))
Using io_uring is simply a matter of selecting it.
> iouring 'settings' facilities/methods (sqes, poll, callbacks/eventfd, etc.)
They are all encapsulated as implementation details.
> More importantly: low-latency setting possibilities (setting logging per default hurts latency, not everyone is throughput oriented).
What do you mean by "logging per default"?
And we do use Photon for low-latency scenarios already.
@aubi1kenobi There are some minor changes in the echo server example (`examples/perf/net-perf.cpp`). It looks like you are not using the latest one. Would you update the code once more? I re-ran it on Ubuntu 22 and it was OK.
To use the io_uring socket, you can uncomment the lines `auto cli = photon::net::new_iouring_tcp_client();` and `auto server = photon::net::new_iouring_tcp_server();` in `examples/perf/net-perf.cpp`.
The default code uses non-blocking fd + io_uring poll + libc send/recv. The iouring client/server you switch to, however, uses io_uring native send/recv + blocking fd.
The event engine should always be io_uring.
You may have noticed that there are two types of networking tests, i.e. streaming and ping-pong. The io_uring server performs well in ping-pong mode, but the default type of server has better results in streaming mode. That is a known issue that we have reported to the io_uring and kernel community.
Hi,
No logs generated other than these console msgs:
```
[100%] Linking CXX executable xPhotonos
[100%] Built target xPhotonos
~/xPhotonos/build$ sudo ./xPhotonos
2023/04/24 21:27:05|INFO |th=00005617C4AE75F0|/home/s200330/arc6/shops/PHV/1/photon/io/iouring-wrapper.cpp:653|new_iouring:Init event engine: iouring [is_master=1]
2023/04/24 21:27:05|INFO |th=00005617C4AE75F0|/home/s200330/arc6/shops/PHV/1/photon/io/iouring-wrapper.cpp:486|check_register_file_support:iouring: register_files is enabled
free(): invalid pointer
Aborted
```
- Register files: I'm using sockets, so I was expecting to register buffers, not files.
- Settings: There are several things that can be done to configure the io_uring rings, just as you provide the `setsockopt` method to configure the sockets.
- Logging: my bad, you are not logging per default.
- io_uring settings: It's good to provide a simplified facade to the API, as you have done. However, just like with `setsockopt`, one can use multishot CQEs if desired, register buffers instead of files, set IOPOLL, etc. There is no one-size-fits-all way to abstract that away. If you do abstract it, then at least have a Chinese wall implementation for sockets, low latency, throughput, etc. As you know, throughput and low latency are often mutually exclusive.
I like your product, and it's probably the best API for me to use to get to production in a low-latency environment. The network is a latency killer, and that's why io_uring is key.
The not so good part is that your own sample did not run with iouring, and that causes concerns as you can imagine. If the site said 'experimental' then I wouldn't even post about this, and just go look elsewhere. The selling point for me was/is (if it works) iouring and the comparison you did to libunifex and the likes.
Thanks. And sorry for the long post.
Obi
Just reposting part of the previous message, as the formatting got a bit scrambled.
Register files: I'm using sockets, so I was expecting to register buffers, not files.
Settings: There are several things that can be done to configure the io_uring rings, just as you provide the `setsockopt` method to configure the sockets.
Logging: my bad, you are not logging per default.
io_uring settings: It's good to provide a simplified facade to the API, as you have done. However, just like with `setsockopt`, one can use multishot CQEs, for example, if desired, register buffers instead of files, set IOPOLL, etc.
There is no one-size-fits-all way to abstract that away. If you do abstract it, then at least have a Chinese wall implementation for sockets, low latency, throughput, etc. As you know, throughput and low latency are often mutually exclusive.
I like your product, and it's probably the best API for me to use to get to production in a low-latency environment (at least, I hope). The network is a latency killer, and that's why io_uring is key.
The not so good part is that your own sample did not run with iouring, and that causes concerns as you can imagine. If the site said 'experimental' then I wouldn't even post about this, and just go look elsewhere. The selling point for me was/is (if it works) iouring and the comparison you did to libunifex and the likes.
Thanks. And sorry for the long post.
Obi
Of course it's not experimental. Alibaba is one of the core contributors to the io_uring community. Even though we belong to the storage team rather than the kernel team, we have a long-standing cooperation with the io_uring-related kernel teams.
We have also reported many bugs to the kernel, and some of the fixes have been merged into mainline. To get them into production earlier, we back-port these fixes into our own kernel. We are quite confident in the quality of the project.
https://github.com/axboe/liburing/issues/825#issuecomment-1468527870
You may use the `add_interest` and `rm_interest` methods of the `CascadingEventEngine`; see the test code in `test-iouring.cpp`, `TEST(event_engine, cascading_one_shot)`.
> @aubi1kenobi There are some minor changes in the echo server example (`examples/perf/net-perf.cpp`). Looks like you are not using the latest one. Would you update the code once more? I re-run it in Ubuntu 22 and it was OK.
@beef9999, Hi, thanks. I believe I cloned the repo a week or two ago, and I ran the C++ coroutine echo server, not net-perf. Does that make a difference? Anyway, I will try and get back to you.
> The default code will use non-blocking fd + io_uring poll + libc send/recv. However the iouring client/server you triggered on will use io_uring native send/recv + blocking fd.
> The event engine should always be io_uring.
> You may have noticed that there are two types of networking tests, i.e. streaming and ping-pong. The io_uring server performs well in the ping-pong mode, but the default type of server has better results in streaming mode. That is a known issues we have reported to the io_uring and kernel community.
I didn't get this.
You may go to the front page and click this button. It describes the differences between streaming and ping-pong.
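As a rough illustration of the two traffic patterns (this is not Photon's net-perf code), the sketch below assumes that ping-pong means one message in flight, with the client waiting for each echo before sending the next, and that streaming means sending and receiving run concurrently. It uses plain blocking sockets over a `socketpair` and `std::thread`; the names `echo_peer`, `run_ping_pong` and `run_streaming` are hypothetical.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <thread>
#include <vector>

// Read exactly n bytes from a blocking fd; false on EOF or error.
static bool read_full(int fd, char* buf, size_t n) {
    size_t got = 0;
    while (got < n) {
        ssize_t r = ::recv(fd, buf + got, n - got, 0);
        if (r <= 0) return false;
        got += static_cast<size_t>(r);
    }
    return true;
}

// Write exactly n bytes to a blocking fd; false on error.
static bool write_full(int fd, const char* buf, size_t n) {
    size_t sent = 0;
    while (sent < n) {
        ssize_t r = ::send(fd, buf + sent, n - sent, MSG_NOSIGNAL);
        if (r <= 0) return false;
        sent += static_cast<size_t>(r);
    }
    return true;
}

// Echo peer: sends back every byte it receives until the other side
// stops writing, then closes its end.
static void echo_peer(int fd) {
    char buf[4096];
    for (;;) {
        ssize_t r = ::recv(fd, buf, sizeof buf, 0);
        if (r <= 0) break;
        if (!write_full(fd, buf, static_cast<size_t>(r))) break;
    }
    ::close(fd);
}

// Ping-pong: send one message, wait for its echo, repeat.
// Returns the number of completed round trips.
size_t run_ping_pong(size_t rounds, size_t msg_size) {
    int sv[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return 0;
    std::thread server(echo_peer, sv[1]);
    std::vector<char> out(msg_size, 'x'), in(msg_size);
    size_t done = 0;
    for (; done < rounds; ++done) {
        if (!write_full(sv[0], out.data(), msg_size)) break;
        if (!read_full(sv[0], in.data(), msg_size)) break;
    }
    ::close(sv[0]);
    server.join();
    return done;
}

// Streaming: the sender never waits for an echo; a separate receiver
// thread drains the echoes. Returns the number of bytes echoed back.
size_t run_streaming(size_t rounds, size_t msg_size) {
    int sv[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return 0;
    std::thread server(echo_peer, sv[1]);
    size_t received = 0;
    std::thread receiver([&] {
        std::vector<char> in(msg_size);
        for (size_t i = 0; i < rounds; ++i) {
            if (!read_full(sv[0], in.data(), msg_size)) break;
            received += msg_size;
        }
    });
    std::vector<char> out(msg_size, 'x');
    for (size_t i = 0; i < rounds; ++i)
        if (!write_full(sv[0], out.data(), msg_size)) break;
    ::shutdown(sv[0], SHUT_WR);  // signal EOF so the echo peer can exit
    receiver.join();
    ::close(sv[0]);
    server.join();
    return received;
}
```

Timing each call with `std::chrono::steady_clock` gives a crude per-pattern comparison; on older toolchains the build may need `-pthread`.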
The event engine allows the coroutine to sleep, to block, to schedule, or to poll file descriptors. As long as your kernel supports it, you should always use the io_uring event engine, which is selected in `photon::init`. The alternative is epoll, which can poll any fd as well.
In terms of io_uring, there are two types of TCP socket implementations. Taking the client as an example, the first one is `photon::net::new_tcp_socket_client` (the default), and the other is `photon::net::new_iouring_tcp_client`.
The former is equivalent to `non-blocking fd + io_uring/epoll poll + libc send/recv`. The latter is equivalent to `io_uring send/recv + blocking fd`. Because io_uring in newer kernels enables the FAST_POLL feature by default, we don't need to poll any more.
The former works well for streaming traffic. The latter is the best choice for ping-pong traffic and a huge number of connections.
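The "non-blocking fd + poll + libc send/recv" pattern can be sketched without Photon as follows. This is only an illustration under stated assumptions: plain `poll(2)` stands in for the io_uring/epoll event engine (where Photon would park the coroutine instead of blocking the thread), and `wait_ready`, `recv_nonblocking`, `send_nonblocking` and `echo_once` are hypothetical helper names, not Photon APIs.

```cpp
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <string>

// Put the fd into non-blocking mode.
static bool set_nonblocking(int fd) {
    int flags = ::fcntl(fd, F_GETFL, 0);
    return flags >= 0 && ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Wait until the fd reports an event or the timeout expires.
// poll(2) is a stand-in for the event engine here.
static bool wait_ready(int fd, short events, int timeout_ms) {
    pollfd p{fd, events, 0};
    int r;
    do { r = ::poll(&p, 1, timeout_ms); } while (r < 0 && errno == EINTR);
    return r > 0;
}

// Try libc recv first; only wait on the poller when it would block.
ssize_t recv_nonblocking(int fd, char* buf, size_t n, int timeout_ms) {
    for (;;) {
        ssize_t r = ::recv(fd, buf, n, 0);
        if (r >= 0) return r;
        if (errno == EINTR) continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK) return -1;
        if (!wait_ready(fd, POLLIN, timeout_ms)) return -1;
    }
}

// Send all n bytes, waiting for writability whenever send would block.
ssize_t send_nonblocking(int fd, const char* buf, size_t n, int timeout_ms) {
    size_t sent = 0;
    while (sent < n) {
        ssize_t r = ::send(fd, buf + sent, n - sent, MSG_NOSIGNAL);
        if (r > 0) { sent += static_cast<size_t>(r); continue; }
        if (r < 0 && errno == EINTR) continue;
        if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            if (!wait_ready(fd, POLLOUT, timeout_ms)) return -1;
            continue;
        }
        return -1;
    }
    return static_cast<ssize_t>(sent);
}

// Demo: push a message through a non-blocking socketpair and read it back.
std::string echo_once(const std::string& msg) {
    int sv[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return {};
    std::string out;
    if (set_nonblocking(sv[0]) && set_nonblocking(sv[1]) &&
        send_nonblocking(sv[0], msg.data(), msg.size(), 1000) ==
            static_cast<ssize_t>(msg.size())) {
        std::string buf(msg.size(), '\0');
        ssize_t r = recv_nonblocking(sv[1], &buf[0], buf.size(), 1000);
        if (r > 0) out.assign(buf.data(), static_cast<size_t>(r));
    }
    ::close(sv[0]);
    ::close(sv[1]);
    return out;
}
```

The native io_uring variant differs in that the send/recv itself is submitted to the ring, so no separate readiness wait is needed; that path requires liburing and is not shown here.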
Remember to add `-D ENABLE_URING=1` when building with CMake.
https://github.com/alibaba/PhotonLibOS#3-examples--testing
When using new_tcp_socket_client (the default one), io_uring is used as an event poller (similar to epoll).
When using new_iouring_tcp_client, io_uring is used to perform async sending and receiving.
Hi guys,
I recloned the photon and liburing yesterday.
Every io_uring 'test-x' reported that I did not have enough mlock resources and should run 'ulimit -l unlimited', but that doesn't work for a normal user.
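The limit that `ulimit -l` reports is `RLIMIT_MEMLOCK`. A minimal sketch for inspecting it from C++ is shown below, using only the standard `getrlimit`/`setrlimit` calls; the helper names are hypothetical and this is not what the Photon tests do internally. An unprivileged process may raise its soft limit up to the hard limit, but raising the hard limit itself needs privileges, which would explain why `ulimit -l unlimited` fails for a normal user.

```cpp
#include <sys/resource.h>
#include <string>

// Render a limit the way `ulimit -l` does (KiB, or "unlimited").
std::string describe_limit(rlim_t v) {
    if (v == RLIM_INFINITY) return "unlimited";
    return std::to_string(v / 1024) + " KiB";
}

// Fetch the current soft and hard RLIMIT_MEMLOCK values.
bool memlock_limits(rlim_t& soft, rlim_t& hard) {
    rlimit lim{};
    if (::getrlimit(RLIMIT_MEMLOCK, &lim) != 0) return false;
    soft = lim.rlim_cur;
    hard = lim.rlim_max;
    return true;
}

// Raise the soft limit to the hard limit; this needs no privileges.
bool raise_memlock_to_hard() {
    rlimit lim{};
    if (::getrlimit(RLIMIT_MEMLOCK, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;
    return ::setrlimit(RLIMIT_MEMLOCK, &lim) == 0;
}
```

If the hard limit itself is too low, it has to be raised by an administrator (for example via a `memlock` entry in `/etc/security/limits.conf` on systems using pam_limits) rather than from the unprivileged process.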
Running the tests as 'root' fails with:
**test-iouring/client/server/x: liburing.so.2: version LIBURING_2.2 not found**
After the latest update, my liburing is now at 2.4. I can't imagine that downgrading is the solution. Could you please advise? I seem to be stuck at using iouring.
Thanks. Obi
The current CMake logic will not download the liburing source code if you have installed it system-wide. You can try removing the installation temporarily.
Hi guys, you seem to have lots of experience in the TCP area. In your opinion, will the latency gains, if any, from using iouring be considerable in comparison to libaio? Or are they marginal? I know the Linux kernel (or any other software kernel) will always be the bottleneck. Any suggestions with your package for achieving the lowest possible latency? I'm stuck with software on this project, no FPGA, and iouring seems difficult to get right, as all the packages I've used show, and now, unfortunately, no luck yet with yours either.
Obi
Could you please quantify your problem, and specify the code?
We don't know how to proceed with this conversation without them.
> any suggestions with your package for achieving the lowest possible latency?
You may want to check out SMC-R. It is RDMA wrapped as a stream socket, with a set of APIs similar to TCP. We can release a wrapper for it very soon, if you are interested.
@beef9999 Cheers, uninstalling liburing and recompiling your package seems to have solved all the issues I've had so far with iouring. All tests ran OK.
@lihuiba Cheers, had a quick look at the provided link and it really does sound good. Yes I'm interested, quite so indeed. What do you reckon the ETA is for your wrapper?
@aubi1kenobi
Photon 0.6 is still under evaluation. This is a pre-release Pull Request and you may check out its code.
As long as your kernel version is greater than 4.x and you set up the Photon SMC-R socket wrapper with `new_smc_socket_client` and `new_smc_socket_server`, your code will simply work.
Of course, an RDMA NIC is required.
@beef9999 Is the Solarflare/Xilinx/AMD X255 an RDMA NIC? Or do you have a suggestion? Btw, once I solved the initial issue, thanks to you and @lihuiba, your package has been satisfactory. I'm leaving the latency improvement for last; I need to get to the end first. I have one or two more questions, but I will close this issue and open a new one for them. Thanks a bunch for your support.
System: HP, ubuntu 22, kernel 6.2, photon: latest git clone.
Running your echo_server.cpp coroutine sample and replacing 'INIT_EVENT_EPOLL' with 'INIT_EVENT_IOURING' fails with the following error:
[INFO] .../io/iouring-wrapper.cpp:486|check_register_file_support:iouring:register_files is enabled
free(): invalid pointer
I need this quite urgently. I thought this code was production ready, as per your website.
General comments
Thank you, I need to get to production.
Obi