jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty
Other
3.83k stars 1.91k forks source link

[Regression] 12.0.7 breaks junixsocket-jetty due to Selector logic change #11500

Open kohlschuetter opened 6 months ago

kohlschuetter commented 6 months ago

Jetty version(s) 12.0.7

Jetty Environment any

Java version/vendor any, e.g. openjdk version "21"

OS type/version any, e.g. macOS

Description commit 24c1140917d32222d5d6eb022f9d7e6096d485e0 ("Fixes #8979 - Jetty 12 - HttpClientTransport network "modes". (#11368)") introduced a breaking change compared to 12.0.6.

junixsocket-jetty-12 now has failing tests (a junixsocket Selector tries to register a key on a vanilla Java / non-junixsocket SocketChannel, which is an unexpected situation).

In issue #8979, @sbordet writes:

The combo [ TCP|UDP / IP|UNIX] can share a SelectorManager, while LOCAL does not need it.

I guess that assumption does not hold true for custom Socket implementations like junixsocket's ...

How to reproduce? check out https://github.com/kohlschutter/junixsocket/tree/main/junixsocket-jetty-12 modify its POM (change to 12.0.7 or 12.0.7-SNAPSHOT) mvn clean install

[INFO] Running org.eclipse.jetty.unixdomain.server.UnixDomainTest
[ERROR] Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.410 s <<< FAILURE! -- in org.eclipse.jetty.unixdomain.server.UnixDomainTest
[ERROR] org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomainWithProxyProtocol -- Time elapsed: 0.312 s <<< ERROR!
java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:80]
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
    at org.eclipse.jetty.client.transport.HttpRequest.send(HttpRequest.java:707)
    at org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomainWithProxyProtocol(UnixDomainTest.java:256)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:80]
    at org.newsclub.net.unix.AFSelectionKey.<init>(AFSelectionKey.java:48)
    at org.newsclub.net.unix.AFSelector.register(AFSelector.java:68)
    at java.base/java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:236)
    at org.eclipse.jetty.io.ManagedSelector$Connect.update(ManagedSelector.java:946)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:575)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:542)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:512)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:195)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:971)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1201)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1156)
    at java.base/java.lang.Thread.run(Thread.java:1583)

[ERROR] org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomain -- Time elapsed: 0.013 s <<< ERROR!
java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:1234]
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
    at org.eclipse.jetty.client.transport.HttpRequest.send(HttpRequest.java:707)
    at org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomain(UnixDomainTest.java:163)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:1234]
    at org.newsclub.net.unix.AFSelectionKey.<init>(AFSelectionKey.java:48)
    at org.newsclub.net.unix.AFSelector.register(AFSelector.java:68)
    at java.base/java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:236)
    at org.eclipse.jetty.io.ManagedSelector$Connect.update(ManagedSelector.java:946)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:575)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:542)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:512)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:195)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:971)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1201)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1156)
    at java.base/java.lang.Thread.run(Thread.java:1583)

[ERROR] org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomainWithHTTPProxy -- Time elapsed: 0.013 s <<< ERROR!
java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:4567]
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
    at org.eclipse.jetty.client.transport.HttpRequest.send(HttpRequest.java:707)
    at org.eclipse.jetty.unixdomain.server.UnixDomainTest.testHTTPOverUnixDomainWithHTTPProxy(UnixDomainTest.java:199)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:4567]
    at org.newsclub.net.unix.AFSelectionKey.<init>(AFSelectionKey.java:48)
    at org.newsclub.net.unix.AFSelector.register(AFSelector.java:68)
    at java.base/java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:236)
    at org.eclipse.jetty.io.ManagedSelector$Connect.update(ManagedSelector.java:946)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:575)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:542)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:512)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:195)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:971)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1201)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1156)
    at java.base/java.lang.Thread.run(Thread.java:1583)

[ERROR] org.eclipse.jetty.unixdomain.server.UnixDomainTest.testLargeBody -- Time elapsed: 0.015 s <<< ERROR!
java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:1234]
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
    at org.eclipse.jetty.client.transport.HttpRequest.send(HttpRequest.java:707)
    at org.eclipse.jetty.unixdomain.server.UnixDomainTest.testLargeBody(UnixDomainTest.java:347)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.lang.UnsupportedOperationException: Unsupported channel: java.nio.channels.SocketChannel[connection-pending remote=localhost/[fe80:0:0:0:0:0:0:1%1]:1234]
    at org.newsclub.net.unix.AFSelectionKey.<init>(AFSelectionKey.java:48)
    at org.newsclub.net.unix.AFSelector.register(AFSelector.java:68)
    at java.base/java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:236)
    at org.eclipse.jetty.io.ManagedSelector$Connect.update(ManagedSelector.java:946)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:575)
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:542)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:512)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:195)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:971)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1201)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1156)
    at java.base/java.lang.Thread.run(Thread.java:1583)
sbordet commented 6 months ago

@kohlschuetter the problem is caused by the fact that before 12.0.7, the SelectableChannel was created via

ClientConnector.connect()
  Configurator.newChannelWithAddress()

You have overridden newChannelWithAddress() in AFSocketClientConnector, but this approach had the same issues that Jetty had, namely that an HttpClient configured so could only connect to 1 Unix-Domain server.

In Jetty 12.0.7, the SelectableChannel is now created via

ClientConnector.connect()
  Transport.newSelectableChannel()

The Transport instance is created when the Destination instance is created; if the Transport is not explicitly configured in the Request, then it is created by asking the ClientConnector which in turn asks the Configurator, for backwards compatibility.

Configurator.newChannelWithAddress() is not called anymore.

You can Configurator.newTransport() in AFSocketClientConnector and from there return the appropriate instance of SocketChannel. However, you will still suffer from the fact that HttpClient could only connect to 1 Unix-Domain server.

Or, you can embrace the new model and create your own Transport implementation, similar to Transport.TCPUnix.

sbordet commented 6 months ago

Out of curiosity, did you try to implement a custom SelectorProvider, rather than explicitly call your APIs everywhere?

In this way, a call to SocketChannel.open() would return an instance of your implementation automatically, rather than having to explicitly call your APIs to replace every call site where SocketChannel.open() and the like are called.

I'd imagine that by using a custom SelectorProvider the code would just run without any modifications, using your implementation when necessary.

kohlschuetter commented 6 months ago

Thanks for your quick reply, @sbordet!

Out of curiosity, did you try to implement a custom SelectorProvider, rather than explicitly call your APIs everywhere?

Yes, junixsocket has its own SelectorProvider, and I think there's some jetty code that assumes it does not a have custom one.

I have a working workaround now, indeed using a custom Transport subclass (my ClientConnector now returns a custom transport upon ClientConnector.newTransport().

I'm working on integrating the fix into junixsocket as we speak.

The problem I see here, however, is that the above commit in question introduced an unexpected breaking change in a patch-level release (12.0.x). ClientConnector.newTransport() is a method newly introduced in 12.0.7., and without overriding this method, null is returned, which seems to mean "use the vanilla SelectorProvider", not "use the provider from the SocketChannel in ChannelWithAddress".

I wonder if such changes should automatically trigger a minor version bump (12.0.6 -> 12.1.0)?

If the breakage was undesired, can we find a fix that is backwards compatible?

sbordet commented 6 months ago

Jetty does not directly use the default SelectorProvider, it always calls the JDK APIs that leverage it, so I don't think your assumption that Jetty bypasses the SelectorProvider holds.

The problem is that you integrated with a very low level part of Jetty that is somehow internal, but it cannot really be for technical reasons (Java visibility etc.). Your use case is pretty unique.

Unfortunately, to fix the Unix-Domain issue we had to change these internals, and you have been affected.

We believe all other use cases where a SelectorProvider is properly used are not affected.

My point being that the exception is thrown by your code, not Jetty's, because it receives an unexpected SocketChannel implementation. But Jetty uses the standard SocketChannel.open() API to create them, so if you replaced the SelectorProvider it should be your classes, not the JDK ones.

Perhaps that test runs without the custom SelectorProvider, and just happened to work before?

For example, you should not need to do this: https://github.com/kohlschutter/junixsocket/blob/main/junixsocket-jetty/src/main/java/org/newsclub/net/unix/jetty/AFSocketClientConnector.java#L66

This also looks suspicious? https://github.com/kohlschutter/junixsocket/blob/main/junixsocket-jetty/src/main/java/org/newsclub/net/unix/jetty/AFSocketClientConnector.java#L77

kohlschuetter commented 6 months ago

The two lines you referred two are indeed how I made jetty compatible with junixsocket.

Now with 12.0.7 only one of them is called (the first one, responsible for the SelectorProvider), which means that there is a mismatch.

The culprit is here (if (transport == null)) https://github.com/jetty/jetty.project/blob/369d9f7e2f35720634d07d757149771821293cf5/jetty-core/jetty-client/src/main/java/org/eclipse/jetty/client/transport/HttpClientTransportDynamic.java#L216-L225

in combination with https://github.com/jetty/jetty.project/blob/369d9f7e2f35720634d07d757149771821293cf5/jetty-core/jetty-io/src/main/java/org/eclipse/jetty/io/ClientConnector.java#L642-L650

sbordet commented 6 months ago

I do not understand.

The snippets you refer to are just creating a Transport instance.

The Transport, in turn, uses the JDK APIs to create a SocketChannel, using the SelectorProvider.

If you have correctly replaced the SelectorProvider with your implementation, then your custom SocketChannel will be created, and the exception you reported won't happen.

kohlschuetter commented 6 months ago

See my changes to junixsocket-jetty above how I restored compatibility with all Jetty versions. It was definitely not straightforward but it works now.

The problem with the SelectorProvider becomes clear: Even with the new Transport API, the SelectorManager is configured elsewhere (in ClientConnector), without having custom Transport implementations in mind: as soon as you have an incompatible pair of Transport and ClientConnector, you will get the error.

It would be great if I could use my custom Transport class without having to specify a custom ClientConnector on top of it. Currently, this is impossible because the default ClientSelectorManager will use the default SelectorProvider.

sbordet commented 6 months ago

@kohlschuetter I don't understand.

Did you replace the default SelectorProvider with your custom one? I mean by specifying a system property or service file? If so, there should be no problems.

If you have not replaced the default SelectorProvider, then you have to change every call site that uses the JDK Channel APIs. If you have not, any reason not to?

kohlschuetter commented 6 months ago

No, junixsocket does not replace the default SelectorProvider singleton that can be retrieved via SelectorProvider.provider().

junixsocket provides its own implementation that only works with its own socket implementations, that support protocol families such as AF_UNIX, AF_TIPC, AF_VSOCK, etc. See AFUNIXSelectorProvider, for example.

The main reasons we don't override the default SelectorProvider at the system-level (via the system property java.nio.channels.spi.SelectorProvider or via SPI) are simplicity, stability, and portability. We don't change other defaults for an entire VM for, say, file encodings, just because we want to read a file with a custom text encoding. The same applies for sockets.

The standard Java API makes no guarantees that there is only a singleton SelectorProvider. In fact, the opposite is true: SelectableChannel has a method to retrieve the corresponding SelectorProvider (#provider()). Why? Because even the Java Standard library authors assume that there is a better way than replacing the default SelectorProvider...

sbordet commented 6 months ago

Ok, so you don't replace the SelectorProvider.

Then, how can Jetty know how to create your own SocketChannel instances?

Either you have to give Jetty a custom ClientConnector, or you give a custom Transport.

IIUC, before it was working because you had a custom ClientConnector.

Note that the custom ClientConnector for Unix-Domain was (and is) the wrong solution, as it restricts the client to connect only to 1 server (in your tests, the whole ClientConnector is configured with only 1 Unix-Domain address).

My understanding is that you must have a custom ClientConnector (to replace the Selector implementation). And you must have custom Transport for Unix-Domain -- looking at your AFSocketTransport it always requires an AFSocketAddress so you must explicitly set it for every request.

Do you have cases where you use your custom ClientConnector, but do not specify a Transport for the Request?

kohlschuetter commented 6 months ago

Either you have to give Jetty a custom ClientConnector, or you give a custom Transport.

Yes, with 12.0.7, I currently need both a custom ClientConnector and Transport, otherwise we'll get the aforementioned exceptions.

With the new Transport concept, I wonder what purpose ClientConnector now still has? Shouldn't it be tightly connected to Transport? (especially that we would need to be able to specify a custom SelectorManager per Transport)

That would allow me to share the same client for both TCP-related requests and junixsocket-related ones.

Do you have cases where you use your custom ClientConnector, but do not specify a Transport for the Request?

Only for jetty < 12.0.7 compatibility

And you must have custom Transport for Unix-Domain -- looking at your AFSocketTransport it always requires an AFSocketAddress so you must explicitly set it for every request.

Having just one SocketAddress is not a big deal for my use case. I'm serving HTTP over (for example) AF_UNIX, AF_TIPC and AF_VSOCK, at the protocol level replacing TCP/IP, but still using regular hostnames to identify the target. We can resolve the target host at the HTTP level (Host:).

Currently, thanks to ClientConnector#newTransport I'm still able to forego specifying the AFSocketTransport for each request. That method is marked @Deprecated(forRemoval) now, which makes me wonder where else I would be able to specify a default transport for my HttpClient?

sbordet commented 6 months ago

Just to be clear: you need a custom ClientConnector and custom Transport classes, but you should be able to give to the HttpClientTransport only the custom ClientConnector -- there is no need to explicitly specify a custom Transport for every request. This is because you will override ClientConnector.getTransport().

I wonder what purpose ClientConnector now still has?

ClientConnector manages the network related resources. Before Jetty 12.0.7, you had to have 1 ClientConnector for SocketChannels, 1 for Unix-Domain SocketChannels, and 1 for DatagramChannel, separated. With Jetty 12.0.7, you can have a single ClientConnector that manages all of the above. For example, before 12.0.7 you could not use the same HttpClient instance make requests over HTTP/2 and HTTP/3. With 12.0.7 this is now possible.

especially that we would need to be able to specify a custom SelectorManager per Transport

See above, we don't want to do that. We want to be able to specify a SelectorManager that is able to handle all network Transports.

Having just one SocketAddress is not a big deal for my use case. I'm serving HTTP over (for example) AF_UNIX, AF_TIPC and AF_VSOCK, at the protocol level replacing TCP/IP, but still using regular hostnames to identify the target. We can resolve the target host at the HTTP level (Host:).

Not sure I follow. If you resolve server1 and that server1 has a Unix-Domain address at /var/run/server1.sock, and your HttpClient is configured with just that Unix-Domain address, then you need a different HttpClient instance if you want to communicate with server2 with Unix-Domain address /var/run/server2.sock. If that's fine for you, great. What Jetty 12.0.7 allows you to do now is to use the same HttpClient instance for both Unix-Domain addresses, provided you explicitly specify a Transport in the Request.

where else I would be able to specify a default transport for my HttpClient?

This is a good point. Right now it is hardcoded using Transport.TCP_IP, but obviously for you this does not work, as you need your custom default Transport.

I may add ClientConnector.getDefaultTransport(Transport candidate) or similar. The idea would be to pass the currently hardcoded instance, so that a ClientConnector subclass would be able to convert it to a custom one. For example, HTTP/2 would pass in Transport.TCP_IP and you would return your AFSocketTransport.WithSocketChannel() -- ah, no you apparently always need an AFSocketAddress, but no idea where to pull that from.

Can you please open a new issue about this? I would like also to hear your proposal about solving this "default Transport" issue, since you know your code (and what it would need) much better than I do. Thanks!

kohlschuetter commented 6 months ago

especially that we would need to be able to specify a custom SelectorManager per Transport

See above, we don't want to do that. We want to be able to specify a SelectorManager that is able to handle all network Transports.

I see a problem with that because (as seen) you cannot guarantee that any SelectorProvider can work with any SelectableChannel.

Perhaps you could change ManagedSelector[] _selectors; such that it is grouped by SelectorProvider, i.e., something like Map<SelectorProvider, ManagedSelector[]> _selectorsByProvider), and change the corresponding logic in accept/connect etc. from chooseSelector() to chooseSelector(channel.provider());