helidon-io / helidon

Java libraries for writing microservices
https://helidon.io
Apache License 2.0
3.51k stars 565 forks source link

4.0.6: Chunked transfer no longer sends data over network on Flush (MP) #9183

Open JasonShattu opened 2 months ago

JasonShattu commented 2 months ago

Environment Details


Problem Description

Creating a Rest endpoint in java that attempts to chunk back results to a client, doesn't come back chunked but comes back with one single result. Using Helidion 4.0.6 this problem is easily reproducible.

Steps to reproduce

Step:

Compile Programs like the following and run in Helidon Mp contained 4.0.6:-

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;
import jakarta.ws.rs.core.StreamingOutput;

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.io.Writer;

@Path("/test/")
@ApplicationScoped
public class StreamResource {

    @GET
    @Path(value = "/stream")
    @Produces(MediaType.TEXT_PLAIN)
    public Response stream() {

            StreamingOutput stream = output -> {
                Writer writer = new BufferedWriter(new OutputStreamWriter(output));
                for (int i = 0; i < 10; i++) {
                    writer.write("This is the " + i + "th input for the result stream." + "\n");
                    writer.flush();
                    try {
                        Thread.sleep(2000);
                    } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                }

            };
            return Response.ok(stream).build();
        }

}

call endpoint from client, like the following: curl -N -v "http://localhost:8080/api/v1/test/stream"

pre 4.0.6, results will comeback in chunks, one line at a time, 4.0.6 and above results come back in one single chunk

JasonShattu commented 2 months ago

Maybe related to this change according to @barchetta

tomas-langer commented 2 months ago

Just to make this clear - chunked transfer is not broken. Jersey just uses a buffering algorithm that cannot be overridden using flush().

Looking into the problem.

tomas-langer commented 2 months ago

So I have identified that the problem is actually within Jersey. It's concept of buffering ignores flush operations. When I try to implement buffering in Helidon (that honors flush), Jersey still causes the same problem, as flush is called always before close through org.glassfish.jersey.message.internal.OutboundMessageContext#close

final OutputStream es = getEntityStream();
es.flush();
es.close();

So we have a choice:

JasonShattu commented 2 months ago

@tomas-langer what are the advantages and disadvantages of each?

tomas-langer commented 2 months ago

"ignore flush operations" - we can buffer any response from Jersey, and for smaller responses (that fit into the buffer, which is 4KB by default) create a Content-Length header, and send the data as a single buffer. This improves speed for smaller entities, and uses less resources to send the data (and to parse it on the other side), as there is no chunked transfer encoding.

"honor flush operations" - we can flush over the network based on user's choice, but it will always use chunked transfer encoding, unless content-length is specified by the user

There is a third option - change the code in Jersey, to allow us to handle caching as Helidon wants (discussing this with Jersey authors right now)

tomas-langer commented 2 months ago

Jersey PR: https://github.com/eclipse-ee4j/jersey/pull/5736 Once Jersey is released, I will re-implement this to support proper buffering and flushing at the same time.

tomas-langer commented 3 weeks ago

Requires Jersey 3.1.9 which should be now released.

tomas-langer commented 2 weeks ago

So after trying with latest Jersey:

In OutboundMessgeContext the entityStream is always set to CommittingOutputStream because of this code in OutboundMessageContext:

    /**
     * Set a stream provider callback.
     * <p/>
     * This method must be called before first bytes are written to the {@link #getEntityStream() entity stream}.
     *
     * @param streamProvider non-{@code null} output stream provider.
     */
    public void setStreamProvider(StreamProvider streamProvider) {
        committingOutputStream.setStreamProvider(streamProvider);
    }

As a result, the instanceof can never be true, as the instance type is always CommittingOutputStream and not our implementation...