perwendel / spark

A simple, expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin
Apache License 2.0
9.64k stars · 1.56k forks

Add support for asynchronous request handling #983

Open joatmon opened 6 years ago

joatmon commented 6 years ago

See comment in: Support of event based (non-blocking) request processing. #549

Duplicated here: Hi,

My team has been using Spark as a platform to build microservices for the last year. We have one service that is still running on Spring Boot that we would like to port to Spark, but it makes heavy use of asynchronous requests. We've reviewed the PR from mj1618 and don't think it would work for our use case, because we also need the ability to stream back very large responses.

So we have come up with an alternative PR that we have tested with our application (at transaction rates of up to 3000 requests per second).

TL;DR

This PR:

USE CASE

I'll use our service as an example of why asynchronous request handling is needed. Our service basically exposes a set of REST APIs that can be used to issue queries against any of several backend databases. The user of the REST API can submit any query they wish, so we have no idea how long the query will run or how much data it will return. It could run for 1ms and return 1 row of data, or it could run for 2 hours and return a million rows of data.

In typical synchronous request processing, these queries would be running in the Jetty request thread. So if enough long-running queries are submitted, we will run out of request threads. 🙁

So instead we use asynchronous requests, which free up the Jetty request thread once we have put the request on an internal queue. Then we have our own thread pool that executes database queries from the internal queue and streams the results back to the client.
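The queue-plus-worker-pool arrangement described above can be sketched with plain `java.util.concurrent` primitives. This is a minimal illustration of the pattern, not code from the PR; the class and method names are invented:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the request thread only enqueues the work,
// then returns; a dedicated pool runs the query and completes the response.
public class AsyncQueryDispatcher {

    // A fixed-size pool backed by an internal queue, so long-running
    // queries cannot exhaust the Jetty request threads.
    // (Pool size here is illustrative.)
    private final ExecutorService queryPool = Executors.newFixedThreadPool(8);

    // Called from the request thread; returns immediately.
    public Future<String> submit(Callable<String> query) {
        return queryPool.submit(query);
    }

    public void shutdown() {
        queryPool.shutdown();
    }
}
```

In the real service the submitted task would run the database query and stream results back through the `AsyncContext` before calling `complete()`.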

Because we are potentially returning so much data, we cannot keep all of the response body in memory at one time. So we need to be able to stream the response directly to the output stream of the HTTP response. This is the part that doesn't work well with the existing Spark APIs, because they store the response body in a string.
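The streaming requirement can be shown in isolation: write each row to the response's output stream as it is produced, so memory use stays bounded regardless of result size. A hedged sketch; `RowStreamer` and `streamRows` are hypothetical names, not part of Spark or of this PR:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Illustrative only: stream rows one chunk at a time instead of
// accumulating the whole body in a String.
public class RowStreamer {

    // Writes each row (newline-terminated) directly to the given stream
    // and returns the total number of bytes written.
    public static long streamRows(Iterator<String> rows, OutputStream out)
            throws IOException {
        long written = 0;
        while (rows.hasNext()) {
            byte[] chunk = (rows.next() + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(chunk); // each chunk goes straight to the client
            written += chunk.length;
        }
        out.flush();
        return written;
    }
}
```

In the async handler, `out` would be the servlet response's output stream obtained from the `AsyncContext`, and `rows` would be a cursor over the database result set.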

EXAMPLE: Spark APIs

Here is a simple example of a Spark controller method that creates its own thread to process the request asynchronously:

        get("/async", (req, res) -> {
            // Switch the request into async mode; this frees the Jetty
            // request thread as soon as the handler returns.
            final AsyncContext ctx = req.startAsync();
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(500);             // simulate a slow backend query
                    res.body("Hello from Async!"); // set the body from the worker thread
                    ctx.complete();                // signal that the response is done
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            });
            t.start();
            return ""; // the return value is unused once async mode has started
        });

EXAMPLE: Servlet 3.0 APIs

Here is the same example implemented with the Servlet 3.0 APIs.

        get("/traditional", (req, res) -> {
            // Start async processing on the raw HttpServletRequest.
            final AsyncContext ctx = req.raw().startAsync();
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(500); // simulate a slow backend query
                    // Write directly to the servlet response's writer,
                    // bypassing Spark's in-memory String body.
                    PrintWriter writer = ctx.getResponse().getWriter();
                    writer.write("lots of data");
                    writer.close();
                    ctx.complete();
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            });
            t.start();
            return "";
        });

Both of these approaches are supported by this PR.

joatmon commented 6 years ago

I put together a sample async benchmark application based on this PR. The application can be found here: https://github.com/joatmon/spark-async-benchmark

I've included some sample scripts that demonstrate that, under heavy load, using async request processing for long requests improves the throughput of short synchronous requests by 750x.

When the short queries are competing for Jetty threads with the long queries, the application is able to process 250 short requests in ten seconds. With the long queries handled by the async thread pool, the application processes 188,454 short requests in ten seconds.

joatmon commented 6 years ago

In the benchmark above I implemented an even cleaner way of presenting async request handling to the sparkjava application developer. With this approach, developers write async request handlers exactly the same way they write normal request handlers; they just use a different API to register the handler. Here is example code from the benchmark. Note that I am able to use the exact same method to process requests synchronously or asynchronously.

public class DataController {
    DataService service;

    public DataController(DataService service) {
        this.service = service;
        // The same handler method is registered twice: once for synchronous
        // processing, and once for async processing via the proposed API.
        Spark.get("/data", this::getData);
        SparkAsync.getAsync("/asyncdata", this::getData);
    }

    private String getData(Request req, Response resp) throws Exception {
        // Optional "delay" query parameter simulates a slow backend query.
        long delay = 0;
        String delayParam = req.queryParams("delay");
        if (delayParam != null) {
            delay = Long.parseLong(delayParam);
        }
        return service.getData(delay);
    }
}

danforbes commented 6 years ago

@perwendel - can you please let us know your thoughts on this pull request? We believe this is a very useful feature and it has been thoroughly tested by my team as we move our Spring Boot microservice onto the excellent Spark framework. We are happy to work with you to iron out any issues you see.

perwendel commented 6 years ago

@danforbes sorry for not addressing this earlier. We have limited bandwidth. I will review this soon!