jakartaee / mail-api

Jakarta Mail Specification project
https://jakartaee.github.io/mail-api
Other
240 stars 100 forks source link

Multipart performs blocking call in every instantiation #699

Open jonathannaguin opened 8 months ago

jonathannaguin commented 8 months ago

Describe the bug When a Multipart object is created, the field streamProvider is also instantiated which relies ultimately on FactoryFinder#find (https://github.com/jakartaee/mail-api/blob/master/api/src/main/java/jakarta/mail/util/FactoryFinder.java#L35) that performs blocking operations.

To Reproduce Create a new MimeMultipart object, for example

new MimeMultipart("mixed");

on a project with https://github.com/reactor/BlockHound installed.

Expected behavior Either the FactoryFinder needs to be rewritten to not use non-blocking operations or Multipart could use a static reference so only it's blocked at boot up.

jmehrens commented 4 months ago

Related: https://github.com/jakartaee/mail-api/issues/101

jbescos commented 4 months ago

Hi @jonathannaguin, where is the blocking operation thing in FactoryFinder?, I cannot find it.

jmehrens commented 4 months ago

@jbescos I assumed that the blocking was from Class::forName, ServiceLoader::load, etc.

My initial understanding of the fix is to follow advice from ServiceProvider "... Users are recommended to cache the result of this method." I don't think we are doing this everywhere but, I'm not sure if it is intended or not.

jmehrens commented 4 months ago

The super class Multipart has a protected instance field that most likely causing this issue.

jonathannaguin commented 4 months ago

That's right @jmehrens , https://github.com/jakartaee/mail-api/blob/master/api/src/main/java/jakarta/mail/Multipart.java#L69 will instantiate a StreamProvider (https://github.com/jakartaee/mail-api/blob/master/api/src/main/java/jakarta/mail/util/StreamProvider.java#L195) which will use FactoryFinder to find and instantiate given provider.

I was able to workaround some of this by setting some system properties so the Factory finds the classNames from the properties (avoiding factoryFromServiceLoader).

jmehrens commented 4 months ago

I think that field should be deprecated and removed when move to next major release. If we added a protected method instead of a field we could do searching of the provider of parent part which should have a session and or other child parts.

That said, the easy solution is to create a private static final field and use that to assign the protected instance field. I just have not done the research to figure out if we lose anything by not running this code for every instance.

jbescos commented 4 months ago

The reason we depend on the StreamProvider is because we need an implementation of LineOutputStream and LineInputStream. Ironically we allow the implementations to provide its own, so they can build a better and more efficient code, meanwhile we make it less efficient adding this overhead of the StreamProvider.

I remember @lukasj and me were discussing this when we were splitting mail-api and angus-mail. There was a dependency in the API with the implementation, and we had to solve that introducing the StreamProvider. Initially it was added as a static field for performance reasons, but this makes problems with web containers like Weblogic or Tomcat, because once it is set, all the different web apps running there will have to use the same implementation.

Is it really worth this overhead of trying to find a StreamProvider somewhere to obtain those instances of only two interfaces?. In my opinion, it is better idea to have those implementations of LineOutputStream/LineInputStream within the Mail-API.

In case someone is able to create a more efficient LineOutputStream/LineInputStream implementation, we can just include it here.

CC @lukasj

jmehrens commented 4 months ago

That makes sense. I lean towards always getting the streamprovider from the session as that is a good spot to cache the lookup. In this case I think we can:

  1. Walk up the parents of the part to find a session to get provider. Meaning we have to delay lookup as long as possible so lookup is done in writeTo which in most cases should have an attached parent.
  2. If we can't find session, use default session with system properties and null authenticator.
  3. Last resort of cache and look up.

In any case, we would have to consider removing the instance field at some point to gain the benefit of this approach. I think this is a lower priority ticket.

jbescos commented 4 months ago

The problem of using the default session is that its SteamProvider is static. I imagine 2 different wep-apps running in Tomcat, one with angus-mail and the other with other implementation of SteamProvider. This second app could use the angus-mail StreamProvider loaded by the first app, and I think that is undesired.

If it does not find the SteamProvider from its parents, I think it needs to create a new instance of it. Or we can just remove SteamProvider as I said before.