javaee / ejb-spec

See javax.ejb project for API. Contains legacy issues only.
https://github.com/javaee/javax.ejb

Introduction of @MaxConcurrency annotation #9

Open glassfishrobot opened 13 years ago

glassfishrobot commented 13 years ago

Currently, proprietary max-pool-size settings are used to control the maximum concurrency of an application and so to prevent overloading the application server.

We could abstract the intended behavior into an annotation:

```java
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface MaxConcurrency {
    int value() default 30;
}
```

We could provide a more generic annotation like:

```java
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface Concurrency {
    int max() default 30;
}
```

to support future extensions.

This functionality should be available for EJB 3.2 and CDI.

Affected Versions

[3.2]

glassfishrobot commented 13 years ago

Reported by abien

glassfishrobot commented 12 years ago

abien said: Proposal:

The @MaxConcurrency annotation, as well as the corresponding element in ejb-jar.xml, specifies the maximum number of requests handled by a given bean class. Once the configured number of concurrent requests is reached, the container throws a javax.ejb.MaxConcurrencyExceededException. A @MaxConcurrency value of -1 indicates unbounded concurrency. Max Concurrency applies to classes, to specify per-bean concurrency, as well as to @Asynchronous methods. Max Concurrency is neither applicable to synchronous methods [I guess without this restriction, the feature would be harder for the container provider to implement], nor to @Singleton beans.

```java
public class MaxConcurrencyExceededException extends EJBException {

    public MaxConcurrencyExceededException(String message, Exception ex) {
        super(message, ex);
    }

    public MaxConcurrencyExceededException(Exception ex) {
        super(ex);
    }

    public MaxConcurrencyExceededException(String message) {
        super(message);
    }

    public MaxConcurrencyExceededException() {
    }
}
```

```java
@Documented
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface MaxConcurrency {
    /* Specifies the maximum concurrency of an EJB or asynchronous method.
       -1 is the default. A negative value means an infinite number of requests. */
    int value() default -1;
}
```

Default: -1 (unbounded).
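The proposed container behavior can be sketched in plain Java with a Semaphore. This is a minimal illustration of the semantics only: the guard class, its names, and the fallback exception are hypothetical, and a real container would throw the proposed javax.ejb.MaxConcurrencyExceededException instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the proposed @MaxConcurrency semantics:
// at most 'max' concurrent invocations, -1 meaning unbounded.
public class MaxConcurrencyGuard {
    private final Semaphore permits;
    private final int max;

    public MaxConcurrencyGuard(int max) {
        this.max = max;
        // A negative value means unbounded, matching the proposal's default of -1.
        this.permits = max < 0 ? null : new Semaphore(max);
    }

    public <T> T invoke(Callable<T> beanMethod) throws Exception {
        if (permits == null) {
            return beanMethod.call();           // unbounded: no gating at all
        }
        if (!permits.tryAcquire()) {
            // A container would throw javax.ejb.MaxConcurrencyExceededException here.
            throw new IllegalStateException("max concurrency of " + max + " exceeded");
        }
        try {
            return beanMethod.call();
        } finally {
            permits.release();                  // free the slot for the next request
        }
    }
}
```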

glassfishrobot commented 12 years ago

manuel_b said: We are heavily using the @Asynchronous annotation for long-running processes, e.g. sending out thousands of emails. While sending these emails the server is very busy and can become unresponsive for other, more important things like answering HTTP requests. It would be great if a thread priority could be set somewhere, e.g. by adding it to the @Asynchronous annotation, to tell the JVM to only run the asynchronous jobs when there is nothing better to do.

```java
@Asynchronous(ThreadPriority=Thread.MAX_PRIORITY)
@Asynchronous(ThreadPriority=Thread.NORM_PRIORITY)
@Asynchronous(ThreadPriority=Thread.MIN_PRIORITY)
```

Currently even a proprietary solution in GlassFish, by changing the thread pool priority, is not possible because these values are hard-coded:

http://java.net/projects/glassfish/sources/svn/content/tags/3.1-b43/ejb/ejb-container/src/main/java/com/sun/ejb/containers/EjbAsyncInvocationManager.java?rev=46843 Line: 394

glassfishrobot commented 12 years ago

mvatkina said: The GlassFish impl was changed to use the default thread pool.

I understand a low-priority task (though we might need to add a timeout so it does not wait forever), but several high-priority tasks can be dangerous, as they can block all other processes in the app server.

glassfishrobot commented 12 years ago

@arjantijms said:

as they can block all other processes in the app server.

Slightly related to blocking tasks: what about having the ability to specify that different thread pools should be used for beans with a @MaxConcurrency annotation?

The spec is currently silent about which thread pools are used. This leads to potential deadlocks with some algorithms: e.g., if from within an @Asynchronous call another @Asynchronous method is called and the first bean waits on the Future to return results, it will block a thread-pool thread in many implementations. If this happens a few times, all threads are waiting for results and the 'workers' cannot make any progress.

If the waiting threads are in a different thread pool than the ones they are waiting for, this particular kind of deadlock cannot happen.
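The two-pool idea can be demonstrated with plain java.util.concurrent executors. This is a sketch under stated assumptions, not container behavior: the "callers" and "workers" pools stand in for hypothetical container-managed pools, with M1 and M2 playing the roles from the comment above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TwoPools {
    public static void main(String[] args) throws Exception {
        // Hypothetical split: the "callers" pool is allowed to block on
        // Futures; the "workers" pool executes the actual asynchronous work.
        ExecutorService callers = Executors.newFixedThreadPool(2);
        ExecutorService workers = Executors.newFixedThreadPool(2);

        Future<Integer> outer = callers.submit(() -> {
            // M1: dispatches M2 to the *other* pool, then waits on it.
            Future<Integer> inner = workers.submit(() -> 21); // M2
            // Safe: this wait occupies a 'callers' thread and therefore
            // cannot starve the 'workers' pool that must run M2.
            return inner.get() * 2;
        });

        System.out.println(outer.get()); // 42
        callers.shutdown();
        workers.shutdown();
    }
}
```

With a single shared pool of size one, the same nesting would hang: M1 would hold the only thread while waiting for an M2 that can never be scheduled.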

glassfishrobot commented 12 years ago

manuel_b said: I agree that MAX_PRIORITY is problematic, so the better way would be to only allow ThreadPriority.BELOW_NORMAL, which would downgrade the asynchronous thread.

In our current case this would be completely sufficient. The last time I used the asynchronous function with a bigger chunk, it ran for about 12 hours (generating thousands of personalized emails), so a timeout is a good idea but should be disabled by default.

We are currently running batch and "real-time" workloads, e.g. HTTP requests, on the same machines so both can leverage the local cache.
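Under the hood, the requested @Asynchronous(ThreadPriority=...) would likely amount to a pool whose ThreadFactory hands out low-priority threads, so batch work yields to request-handling threads. The class and its names below are purely illustrative; thread priority is only a hint to the OS scheduler, not a guarantee.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical sketch: an executor whose threads run at minimum priority,
// approximating @Asynchronous(ThreadPriority=Thread.MIN_PRIORITY).
public class LowPriorityPool {
    public static ExecutorService create(int threads) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable);
            t.setPriority(Thread.MIN_PRIORITY); // scheduler hint only
            t.setDaemon(true);                  // don't keep the JVM alive for batch work
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }
}
```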

glassfishrobot commented 12 years ago

mvatkina said:

bean waits on the Future to return results, it will block a thread pool thread

The whole purpose of asynchronous invocations is not to block the thread. So I would say that your scenario is a user error and needs to be fixed in their code...

glassfishrobot commented 12 years ago

@arjantijms said:

The whole purpose of asynchronous invocations is not to block the thread. So I would say that your scenario is a user error and needs to be fixed in their code...

Well, imagine a background task kicked off by a call to an @Asynchronous method. This task might participate in a parallel bounded producer/consumer setup that's implemented via @Asynchronous as well.

In such a scenario, the task thread doesn't immediately block after every asynchronous invocation. That would indeed defeat the purpose and would be a user error. Instead, such a task would for example be kicking off several producers using asynchronous dispatches and feeding the produced data into several consumers using asynchronous dispatches as well. Those consumers might be slower than the producers, so in order to maintain a bound on the amount of outstanding work there more or less has to be* a wait, and thus a block, somewhere.

There aren't really any other concurrency primitives in Java EE or EJB, and the ones that are there (Timers, JMS) would not only feel hackish, but actually come with no guarantee that the implementation isn't scheduling them on the same thread pool that's used for @Asynchronous. There too, the spec is simply silent.

So, I'm not completely sure how this should be fixed in code and what the user error exactly is, unless you're saying that we shouldn't be doing any real parallel work in Java EE at all.

In practice a solution that works is to create a Java SE ExecutorService that runs the task, or to use something like Quartz, which has its own thread pool as well. Code in such a task can bootstrap itself by obtaining an EJB from JNDI, from which the rest of the process can then run. This is hackish too, but it works on every container that I know of.

*Yes, it's possible to avoid the block and, instead of waiting, schedule a polling timer with some kind of context object that lets us resume where we left off, but for any non-trivial algorithm this is highly complex and rather error-prone.

glassfishrobot commented 12 years ago

@arjantijms said: p.s. an alternative solution to the problem outlined above would be to have a join() operation, where the thread that called @Asynchronous methods can wait until one or more of those are done.

Instead of being implemented by a block or wait of any kind, the join() operation turns the thread into a worker thread for asynchronous work, i.e. the thread tries to pick up tasks from the container's internal task queue with the call to join() still on its call stack. This is basically how Java SE 7's fork/join threads work as well, and it prevents the deadlock without needing a second thread pool.

I guess though that such a feature is more in scope for JSR 236.
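The fork/join behavior referred to above can be seen directly in Java SE's ForkJoinPool: a worker thread that calls join() on a pending subtask does not just park, it executes other queued tasks in the meantime. A minimal, standard example (the task itself is illustrative, not part of any EJB proposal):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums the range [from, to] by splitting it in half recursively.
public class SumTask extends RecursiveTask<Long> {
    private final long from, to;
    SumTask(long from, long to) { this.from = from; this.to = to; }

    @Override
    protected Long compute() {
        if (to - from <= 1_000) {               // small enough: sum directly
            long sum = 0;
            for (long i = from; i <= to; i++) sum += i;
            return sum;
        }
        long mid = (from + to) / 2;
        SumTask left = new SumTask(from, mid);
        SumTask right = new SumTask(mid + 1, to);
        left.fork();                    // schedule the left half asynchronously
        long r = right.compute();       // work on the right half in this thread
        // join() may run other queued tasks instead of blocking, which is
        // exactly the deadlock-avoidance behavior described above.
        return left.join() + r;
    }

    public static void main(String[] args) {
        long sum = new ForkJoinPool().invoke(new SumTask(1, 100_000));
        System.out.println(sum); // 5000050000
    }
}
```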

glassfishrobot commented 12 years ago

manuel_b said:

p.s. an alternative solution to the problem outlined above would be to have a join() operation, where the thread that called @Asynchronous methods can wait until one or more of those are done.

Hi Arjan, you already have the possibility to wait for an asynchronous method. Just return a Future.

So waiting for a pool of tasks would look similar to this:

```java
Collection<Future<Void>> taskStates = new ArrayList<Future<Void>>();
List<Task> list = getBunchOfTask();
for (Task task : list) {            // iterate the tasks, not the (still empty) futures
    taskStates.add(processTaskAsynchronously(task));
}
for (Future<Void> done : taskStates) {
    try {
        done.get();
    } catch (InterruptedException e) {
        log.log(Level.WARNING, "Could not wait for task", e);
    } catch (ExecutionException e) {
        log.log(Level.WARNING, "Could not wait for task", e);
    }
}
```
glassfishrobot commented 12 years ago

dblevins said: Hi All! Been following this excellent thread – very interesting.

Let me recap the goals (someone jump in and clarify if I missed a point):

These are good goals. We may need to find alternate ways to achieve them than what has been proposed.

The `@MaxConcurrency` concept is compelling and I had proposed something similar on Adam's blog years ago. When it comes to `@Asynchronous` calls there is a fundamental problem with allowing the pool size to be specified on a per-method basis. If there are 100 `@Asynchronous` methods and each sets its own pool size, that essentially requires 100 ThreadPoolExecutor instances. This could be thousands of threads. I don't think a thread pool per method is the way to go.

We're currently struggling with this in OpenEJB, as we have one ThreadPoolExecutor per application, but really even that's not great. Ultimately, the number of cores is the largest indicator of what the thread pool size should be. The question of where and how you set that has a direct bearing on how many pools you really want (as few as possible).

In terms of what the behavior is when an `@Asynchronous` pool is full, that is currently unspecified. I think OpenEJB will throw an EJBException after making your thread wait 30 seconds for an opening. That duration is configurable, as is the actual Queue implementation – some queues have a size, while some, like SynchronousQueue, are entirely wait-based and leverage only the thread pool size.

As I say, very interesting topic! I'm not exactly sure what the right answers are, but I'd love to hear thoughts with the above "guts" in mind. Always good to think "how would I implement this?"
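In that spirit, the behavior described two comments up can be sketched with a plain ThreadPoolExecutor: a bounded queue plus a rejection handler that makes the caller wait up to a timeout for an opening before failing. This is an illustration of the pattern only; the class name, numbers, and handler are not OpenEJB's actual implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: bounded work queue, and on overflow the submitting
// thread blocks for up to 'waitSeconds' (cf. the 30-second wait described
// above) before the submission is rejected.
public class BoundedAsyncExecutor {
    public static ThreadPoolExecutor create(int threads, int queueSize, long waitSeconds) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(queueSize);
        RejectedExecutionHandler waitThenFail = (task, executor) -> {
            try {
                // Caller blocks, waiting for an opening in the queue.
                if (!executor.getQueue().offer(task, waitSeconds, TimeUnit.SECONDS)) {
                    throw new RejectedExecutionException(
                            "no opening after " + waitSeconds + "s");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        };
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                queue, waitThenFail);
    }
}
```

Swapping the LinkedBlockingQueue for a SynchronousQueue gives the other flavor mentioned above: no queued backlog at all, with hand-off gated purely by the pool size.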

glassfishrobot commented 12 years ago

@arjantijms said:

you have already the possibility to wail for an asynchronous method. Just return a Future.

That's correct, and this is the recommended way to wait for such methods from say an HTTP request processing thread.

However, when doing this kind of wait from within a thread that is itself one of the threads from the pool handling asynchronous methods, you run the risk of getting into a deadlock situation.

To illustrate this, assume we have a thread pool with two threads, t1 and t2. We have a bean B1 that, within an asynchronous method M1, calls asynchronous method M2 on bean B2 three times and waits for the results. We also have a task queue Q that holds the not-yet-executed tasks.

After an initial call to M1, the situation may look as follows:

```
thread    execution stack
t1        M1 (blocked, waiting)
t2        M2 (executing)

queue     content
Q         M2, M2
```

Before the other two calls to M2 are done, another (asynchronous) call to M1 comes in, making the situation as follows:

```
thread    execution stack
t1        M1 (blocked, waiting)
t2        M2 (executing)

queue     content
Q         M2, M2, M1
```

When M2 is done executing, the system selects a new task from the queue. Suppose this happens to be M1. The situation is now as follows:

```
thread    execution stack
t1        M1 (blocked, waiting)
t2        M1 (blocked, waiting)

queue     content
Q         M2, M2, M2, M2, M2
```

The system is now in a deadlock.

Both threads are waiting for an M2 to get done, but because they are waiting they are occupying a thread from the pool, and thus no M2 can ever be executed.

Strict queue ordering would not really solve this problem, as the queue could be empty when M1 initially started to run, and then just before it calls M2 asynchronously the second call to M1 could happen, which would again deadlock the system.

The example is not really contrived, as I have unfortunately encountered this problem a couple of times in the wild when investigating a Java EE application that 'mysteriously' froze.

A join() like in fork/join prevents the deadlock by utilizing the wait to run other tasks in the same thread. The last example would then look like the following:

```
thread    execution stack
t1        M2
          join
          M1
t2        M2
          join
          M1

queue     content
Q         M2, M2, M2
```

A second thread pool would also solve the problem, since no matter how full the queue is and how slow things get, an M1 would not block an M2 from making progress.

Suppose we now have two thread pools, each with only one thread: t1 belonging to pool 1, and t2 to pool 2. The last example would then look like this:

```
thread    execution stack
t1        M1 (blocked, waiting)
t2        M2

queue     content
Q1        M1
Q2        M2, M2
```
glassfishrobot commented 11 years ago

darious3 said: Any progress here?

glassfishrobot commented 11 years ago

mvatkina said: Unfortunately not.

To move forward, somebody needs to write a very clear proposal that addresses all concerns expressed in this issue.

glassfishrobot commented 11 years ago

darious3 said: If I look at the calendar, I see it's already November. If Java EE 7 is to be released around April 2013, then I guess there really isn't a lot of time left to do anything, is there?

As an external observer of the EJB spec, I have to say that it looks like not much is happening, and a lot of issues fail to move forward because of a lack of response. I might be totally wrong, and maybe the EG is doing a lot of work that I simply missed. But to a community member like me it just looks like very few people are actually involved or interested in really working on the spec. If I compare this to, say, JSF or CDI, some issues there don't make it because the people working on those specs have a lot of other issues to work on, but with EJB it looks like there's almost nobody around.

Again, maybe I'm totally wrong here and not seeing the whole picture.

glassfishrobot commented 12 years ago

File: MaxConcurrency.zip Attached By: abien

glassfishrobot commented 13 years ago

Was assigned to mvatkina

glassfishrobot commented 7 years ago

This issue was imported from java.net JIRA EJB_SPEC-9