Open juanazam opened 1 year ago
Any updates here? How do we get this merged? Anxiously awaiting this since it's the last thing stopping use from going to OTP 26.
@Billzabob I'm not sure to be 100% honest.
@lexmag Sorry to ping you directly (I know you haven't been active on this project for a bit), but is there any chance we can get this reviewed, merged, and released?
@lexmag Bump
If @lexmag has abandoned this project, does anybody want to take it over, either by getting the repo transferred or creating a canonical fork?
Any updates on this @lexmag? This is a blocker for us moving to OTP 26 as well
This is blocking us. How can we get this published or migrated to somebody who will publish?
Thanks
Since this is blocking many of us, the company I'm working on, Cabify, could try to take the temporal ownership of this repo under this fork: https://github.com/cabify/statix. We've been running this solution in production for some weeks with success.
If you are ok with this, please, react to this message and open the same PR in the fork and we will be glad to merge it and publish the library under a temporal name cabify_statix
.
We want to make explicit that, in case the owner is back, we will just get rid of the fork and sync it with this repo. So if you agree, let's continue and adopt OTP 26! If you are not ok with this proposal, just let us know also 😁
@carrascoacd Thanks for testing the fix and taking ownership of the package!
Before continuing, please @whatyouhide since you appear as the maintainer of the library. Could you share your thoughts?
Update here, I have reached Andrea and he has not been involved with the project for a while, he suggested reaching for @lexmag, now I'm trying to get a hold of him.
Hi folks, I'm sorry to have kept you waiting on this PR.
@juanazam thank you again for reaching out.
First of all, I must say that Statix does not indeed get a high priority among all activities I have to address, but I still do want it to be fixed and improved, thus, during next two weeks I'm going to merge this PR, look into other issues, and level the project up in general.
That's said, if anyone is willing to push the work beyond what I'm able to dedicate this time, nothing stops you from doing that considering the license is extremely permissive.
As for this PR, I first need to read up on what specifically has changed in OPT 26. From a quick glance, the PR unfortunately cannot be merged in its current shape as it drastically affects the performance of OTP pre-26. We need to have a conditional compilation to preserve the previous approach for older OTP versions.
Hi @lexmag,
Thanks for taking a look at this. Regarding older versions of OTP, I didn't want to spend a lot of time on it without knowing what's the plan on maintenance for them. I saw some checks for OTP 18 compatibility, which was released in June 2015. Maybe it's work considering doing a major release and remove support for older OTP versions to keep the code simpler (this is just an idea).
If you know what's the oldest version you would want to support and have ideas on how to support it, I can help with updating this PR.
Hey there, kindly asking what the ETA on a release of this might be? 😃
Hey @lexmag do you need any help with this?
From a quick glance, the PR unfortunately cannot be merged in its current shape as it drastically affects the performance of OTP pre-26. We need to have a conditional compilation to preserve the previous approach for older OTP versions.
Does anyone know specifically what @lexmag is referring to here? I don't immediately see anything in this change that sets up alarm bells performance wise...
@mtrudel I can give some context there. It seems that using :gen_udp.send
instead of Port.command
was less performant on older versions of OTP.
From the related OTP thread https://github.com/erlang/otp/issues/7130#issuecomment-1513005012
In epgsql direct port call was implemented (the idea was borrowed from RabbitMQ) because epgsql tends to accumulate a large message queue (due to active=true by default). And pre-OTP26 gen_tcp:send had a performance issues when the process calling it has a long message queue.
@mtrudel I can give some context there. It seems that using :gen_udp.send instead of Port.command was less performant on older versions of OTP.
Thanks for the context @juanazam!
Unless I'm reading this wrong, the way statix currently uses Port.command
nets to the exact same thing that RabbitMQ's hack seeks to avoid. To wit:
Port.command
specifically to get around the fact that :gen_tcp.send
selectively receives on a {:inet_reply,...}
tuple to hear back from the port. {:inet_reply,...}
tuple at the time you want to send data to a port. With this in hand, let's look at how statix does this today:
Port.command
instead of gen_udp.send
. Cool. We've avoided a selective receive so far!gen_udp.send
, but it loses all of gen_udp
's careful handling of edge cases, and even more importantly, it uses private APIs to do so.Again, unless I'm misunderstanding this, statix's implementation here amounts to an 'adhoc implementation of half of gen_tcp.send' (with apologies to Robert Virdig). There's no point in trying to worry about preserving what's here for older OTPs because it's been slow this whole time.
What am I missing here?
@mtrudel Everything you mention here makes sense! I tried digging up discussions on old issues but couldn't find anything meaningful. The Port.command(sock, packet)
call was added as part of the initial commit, so it has always been there.
I remember seeing a discussion about this somewhere, but I can't find it (this was months ago so my memory is failing me), sorry!
If the performance problems only apply to older versions, we could add a conditional compilation as suggested, ie OTP < 26. It should work. It should unblock this while we find how to improve it.
Decided to implement backwards compat as @carrascoacd mentioned to make this even less of a concern. It's not the cleanest implementation but IMO this path will be eventually removed, so no need to keep it too fancy.
@mtrudel while implementing it, I confirmed one thing! I had to keep this
Because I was getting this error while running the tests:
Unexpectedly received message {:inet_reply, #Port<0.13>, :ok} (which matched _any)
I think that back then I read that Port.command
immediately replied to the caller process with this type of message, but I can't seem to find the thread were I read that :(.
Anyway I hope this makes it easier for @lexmag to get it merged
One more finding on why this snippet was needed:
It seems it was happening because of an internal detail on the prim_inet
implementation that was removed at the same time as the braking change was added.
There is a flow where the process running Port.command
receives a reply, and that's why the error happens and the receive block is needed.
There is a flow where the process running
Port.command
receives a reply, and that's why the error happens and the receive block is needed.
Yes, but my point stands: this implementation as it exists today is just an adhoc copy of what :gen_udp.send
does; we're not actually saving any effort here. The whole point of the RabbitMQ hack is to avoid a receive
; if we put it back then we're no better off than we are just calling :gen_udp.send
and doing it the right way.
I think we are ready to merge, right? cc/ @lexmag
Hey all, any movement here?
same here, some fix?
im using docker
FROM hexpm/elixir:1.16.3-erlang-26.2.5.2-ubuntu-jammy-20240808
Why is this still in progress????
Closes #69
Update to Elixir 1.15.2 and OTP 26.0.2 The main issue here is that
Port.comand
is erroring out because of a behavior change on OTP that's not going to be updated.Related issue on Statix: https://github.com/lexmag/statix/issues/69 Related issue on top: https://github.com/erlang/otp/issues/7130
In summary, Statix was using
Port.command
to avoid a performance issue with using:gen_udp.send
. This approach no longer works because of OTP updates happening at the networking layer.In order to avoid this, this PR uses
gen_udp.send
(which has been optimized), but with this change, a lot of the existing code is no longer useful.Port.command
expects a packet containing the IP and port of the destination to which the packet will be sent. This is no longer required with the new approach, but we need to keep the address and port on theConn
module to be able to do the:gen_udp.send
calls to send Statds servers.This patch has been tested in production, see: https://github.com/knocklabs/statix/pull/1