eclipse-ee4j / angus-mail

Angus Mail
https://eclipse-ee4j.github.io/angus-mail
Other
49 stars 16 forks source link

Support for virtual threads #128

Open jurgis-sipols opened 5 months ago

jurgis-sipols commented 5 months ago

Is your feature request related to a problem? Please describe. Since Java LTS version 21 Virtual Threads are becoming widely accepted and tests show huge benefits they give both by gaining performance and by not using hard to read/maintain reactive libraries. Are there plans to support Virtual Threads? If so, is there ETA available?

Describe the solution you'd like Support for virtual threads. There is a lot of synchronised logic in the codebase using synchronised keyword which has alternatives, e.g., ReentrantLock. Synchronised block makes Virtual Thread pinned to the carrier (platform/OS) thread if blocking activities are being done and they are -> I/O operations. Also, e.g., when I add jakarta.mail.Folder#addMessageCountListener and my listener is triggered, this takes place on a platform (OS) thread.

Describe alternatives you've considered There are no clear alternatives. Library currently doesn't give such options as far as I know.

Additional context https://docs.oracle.com/en/java/javase/21/core/virtual-threads.html#GUID-704A716D-0662-4BC7-8C7F-66EE74B1EDAD

A virtual thread cannot be unmounted during blocking operations when it is pinned to its carrier. A virtual thread is pinned in the following situations:

jbescos commented 5 months ago

I think there is some work in JDK 21 to not pin platform threads with synchronized, but we can add ReentrantLock for earlier JDK 21 versions.

jurgis-sipols commented 5 months ago

And what about internally created threads? E.g. I can't control thread executor or thread factory of the lib, and thread executing jakarta.mail.event.MessageCountListener#messagesAdded is always platform/OS thread.

jurgis-sipols commented 5 months ago

I did a little digging and jakarta.mail.EventQueue#enqueue allows to set executor in case of scopes store or folder, see jakarta.mail.Service#Service :

Executor executor = (Executor) session.getProperties().get("mail.event.executor");

I could pass Store propeties with this entry:

properties.put("mail.event.executor", Executors.newVirtualThreadPerTaskExecutor());

and it looks like I have incoming virtual thread. Is such approach ok?

jbescos commented 5 months ago

I did a little digging and jakarta.mail.EventQueue#enqueue allows to set executor in case of scopes store or folder, see jakarta.mail.Service#Service :

Executor executor = (Executor) session.getProperties().get("mail.event.executor");

I could pass Store propeties with this entry:

properties.put("mail.event.executor", Executors.newVirtualThreadPerTaskExecutor());

and it looks like I have incoming virtual thread. Is such approach ok?

Yes, that should be correct. Nobody knew during that time that virtual threads were going to exist, but fortunately it was designed to run in web containers like Weblogic, Tomcat, etc, where they are managing their own thread pools. So, it seems it is possible to pass that to the angus-mail.

jmehrens commented 5 months ago

Synchronised block makes Virtual Thread pinned to the carrier (platform/OS) thread if blocking activities are being done and they are -> I/O operations. Also, e.g., when I add jakarta.mail.Folder#addMessageCountListener and my listener is triggered, this takes place on a platform (OS) thread.

It is an old code base with lots of synchronization, classes that can/must be extended, 3rd party listeners that can be notified, and API split from the implementation. Using the synchronized keyword between parent and child classes and between API and implementation is easy. Sharing secret lock between API and implementation doesn't really work. For compatibility sake we must check if this class is subclass and resort back to synchronization because an existing subclass is not aware of the internal ReentrantLock. Those checks are going to bloat the code only to resort back to the existing behavior. Exposing such a lock protected or private not desired either.

For instance, take a look at jakarta.mail.Service which is a grandparent of the Store/Transport extensions in Angus Mail. That is an example of parent->child split, API->implementation split, and notification of listeners. Getting rid of the locking at the parent/grandparent level would help here gets rid of the API->implementation boundary crossing. Leaving all the locking in the implementation which enables some tricks for sharing.

Recommend that when this ticket is attempted, for classes that can be extended we should first look at safely dropping synchronized keyword, followed by local private use of volatile/atomics, before falling back to using locks. For "mail.event.executor" we should consider a new mail.event.executor.class or mail.event.executor.factory key that takes a string and can load Executors.newVirtualThreadPerTaskExecutor(). Perhaps even fix the default thread creation.

I can take on the changes for the logging package. I've been avoiding them for the logging package because I think of that code as low through put anyway. I suppose there is reason to do this to reduce overall system resource usage.

jbescos commented 5 months ago

Synchronised block makes Virtual Thread pinned to the carrier (platform/OS) thread if blocking activities are being done and they are -> I/O operations. Also, e.g., when I add jakarta.mail.Folder#addMessageCountListener and my listener is triggered, this takes place on a platform (OS) thread.

It is an old code base with lots of synchronization, classes that can/must be extended, 3rd party listeners that can be notified, and API split from the implementation. Using the synchronized keyword between parent and child classes and between API and implementation is easy. Sharing secret lock between API and implementation doesn't really work. For compatibility sake we must check if this class is subclass and resort back to synchronization because an existing subclass is not aware of the internal ReentrantLock. Those checks are going to bloat the code only to resort back to the existing behavior. Exposing such a lock protected or private not desired either.

For instance, take a look at jakarta.mail.Service which is a grandparent of the Store/Transport extensions in Angus Mail. That is an example of parent->child split, API->implementation split, and notification of listeners. Getting rid of the locking at the parent/grandparent level would help here gets rid of the API->implementation boundary crossing. Leaving all the locking in the implementation which enables some tricks for sharing.

Recommend that when this ticket is attempted, for classes that can be extended we should first look at safely dropping synchronized keyword, followed by local private use of volatile/atomics, before falling back to using locks. For "mail.event.executor" we should consider a new mail.event.executor.class or mail.event.executor.factory key that takes a string and can load Executors.newVirtualThreadPerTaskExecutor(). Perhaps even fix the default thread creation.

I can take on the changes for the logging package. I've been avoiding them for the logging package because I think of that code as low through put anyway. I suppose there is reason to do this to reduce overall system resource usage.

I started with this yesterday and I noticed it is going to be a big effort, and also it is going to break other Mail implementations. For instance the class of your example: jakarta.mail.Service. If we replace the synchronized by a protected reentrant lock, subclasses of other implementations will need to also remove the synchronized and use the reentrant lock.

safely dropping synchronized keyword is not always possible. At some point we are going to need to share the reentrant lock with other classes/subclasses. If we are fine with this, I can continue working on it.

We will need to create a new major version for Mail API and Angus Mail because this is going to break API compatibility. I already did a similar work for HK2, but it was easier there because there are no other implementations that requires to hold a lock: https://github.com/eclipse-ee4j/glassfish-hk2/pull/963

I am not sure it is worth the effort because as I pointed out here there are plans to not pin threads in synchronized blocks. We can wait to see what happens.

jmehrens commented 5 months ago

I am not sure it is worth the effort because as I pointed out https://github.com/eclipse-ee4j/angus-mail/issues/128#issuecomment-1945759686 there are plans to not pin threads in synchronized blocks. We can wait to see what happens.

That is great news! Lets see what comes of that.

safely dropping synchronized keyword is not always possible. At some point we are going to need to share the reentrant lock with other classes/subclasses. If we are fine with this, I can continue working on it.

It brings me pause to think about exposing a lock. Shared secret in and or between mail and angus would be okay. However, having 3rd parties see such a lock or have to modify their code to opt-in to use that lock is problematic.

FYI I'm reminded that Bill covered Threadsafety in Javamail on SO.