jakartaee / mail-api

Jakarta Mail Specification project
https://jakartaee.github.io/mail-api
Other
245 stars 102 forks source link

MimeBodyPart RFC2231 Outlook issue: Allow to override internal static fields encodeFileName in constructor instead of SystemProperties #513

Open chrisrueger opened 3 years ago

chrisrueger commented 3 years ago

Is your feature request related to a problem? Please describe. Some background: The reason behind this issue is our attempt to make it possible to send an Attachment to "Outlook client which has problems displaying filenames with german umlauts (ä, ü, ö etc.) correctly. Search the internet for "outlook rfc2231 rfc2047".

In the past we were using JavaMail 1.4.3 which worked out of the box with outlook. But now we are on 1.6.2 (yes, JavaMail, but we consider switching to JakartaMail. But JakartaMail still has the same problem as JavaMail 1.6.2)

Currently MimeBodyPart uses static variables for certain customisations like this:

private static final boolean encodeFileName =
    PropUtil.getBooleanSystemProperty("mail.mime.encodefilename", false);

The problem:

SystemProperties are global for the whole application. We would like to use this only in some very specific cases where a customer can e.g. pass a flag to decide wether or not to encode the filename etc.

The magic currently happens in #setFileName(part, name) which unfortunately is package private so it hard / impossible to customize it. You can see it in the comments like:

/*
     * Also attempt to set the Content-Type "name" parameter,
     * to satisfy ancient MUAs.  XXX - This is not RFC compliant.
     */

Describe the solution you'd like

Is it possible to make the private static fields such as encodeFileName, setContentTypeFileName actual class members add a new constructor where you can pass the fields?

public class MimeBodyPartCustomizable extends MimeBodyPart{

    private boolean encodeFileName = PropUtil.getBooleanSystemProperty("mail.mime.encodefilename", false);
    private boolean setContentTypeFileName = PropUtil.getBooleanSystemProperty("mail.mime.setcontenttypefilename", true);

    public MimeBodyPartCustomizable(boolean encodeFileName, boolean setContentTypeFileName) {
        this.encodeFileName = encodeFileName;
        this.setContentTypeFileName = setContentTypeFileName;
    }
...

This would allow us to use this constructor if we want to have full control over all parameters. Otherwise we just use the default constructors which work as it is.

e.g. example pseudo code:

// special handling for outlook which cannot handle RFC2231 which leads to broken attachment names
boolean encodeAttachmentsWithRFC2047 = isSpecialHackForOutlookRFC2047Enabled();
MimeBodyPart attachFilePart = encodeAttachmentsWithRFC2047 ? new MimeBodyPart(encodeFileName, setContentTypeFileName) : new MimeBodyPart();

I assume this would also be backwards compatible when this is a new constructor.

Describe alternatives you've considered

An alternative was to create a customized version in our own package jakarta.mail.internet; and override javax.mail.internet.MimeBodyPartSynesty.setFileName(String) Unfortunatelly then we have to copy lots of code from #setFileName(part, name)

We have not used this alternative yet, it was just a POC.

e.g.

package jakarta.mail.internet;

import java.io.UnsupportedEncodingException;

import jakarta.mail.MessagingException;
import jakarta.mail.Part;

import com.sun.mail.util.MimeUtil;

public class MimeBodyPartCustom extends MimeBodyPart{

    private boolean encodeFileName;
    private boolean setContentTypeFileName;

    public MimeBodyPartCustom(boolean encodeFileName, boolean setContentTypeFileName) {
        this.encodeFileName = encodeFileName;
        this.setContentTypeFileName = setContentTypeFileName;
    }

    public void setFileName(String name) throws javax.mail.MessagingException {

            // all code from https://github.com/eclipse-ee4j/mail/blob/db4f348b8de82670c90d921f26b66ccad6610673/mail/src/main/java/jakarta/mail/internet/MimeBodyPart.java#L1294  which uses the new class members from the constructor
        }
} 

Additional context Add any other context or screenshots about the feature request here.

jmehrens commented 3 years ago

I think there is a more generalized solution that should be explored when dealing with system properties vs session properties. One of the reasons system properties are using is that it can be tricky to gain access to the session.

However, for this case JakartaMail has an interface MessageAware that is designed to gain access to the session. For instance, MimeMultipart use this to set the parent.

I think a similar approach should be applied to the MimeBodyPart by adding a single new constructor public MimeBodyPart(DataSource ds) throws MessagingException that reads the stream from the datasource and inpects if it is a MessageAware datasource. Ideally, it would be great if we could set the parent of the BodyPart and this point as it allows getting the session from the parent. However, it may not be possible to determine the correct parent at this point and it may not matter either as all we need is a path to the root to gain access to the session and addPart will reset the parent anyway. Alternatively, then all we need is a refence to the MessageAware object.

Next step would be to then add a private void initializeProperties() to init the properties from the session assigned to the parent by new MessageContext(this).getSession() and or the session from the datasource ds.getMessageContext().getSession(). If neither is present fallback to system properties. Another alternative would be to lookup the properties

For testing we would then have to check that in the JakartaMail itself if we need to move from MimeBodyPart(InputStream) MimeBodyPart(DataSource) to ensure the right settings are applied for other uses.

There are more details to workout but this outlines the general approach.

chrisrueger commented 3 years ago

Thanks for the detailed suggestion.

to gain access to the session

Let me try to repeat in my words to see if I got that correctly:

So is the idea to put the configuration like mail.mime.encodefilename and mail.mime.setcontenttypefilename into Properties of the Session instead of system properties?

as in: https://github.com/eclipse-ee4j/mail/blob/a9daed6d62a139fb89a10c25632c3ed306daea38/mail/src/main/java/jakarta/mail/Session.java#L297

And then let MimeBodyPart somehow find out if the property is set inside the Session's properties (or the other thing you mentioned) .... and if not fallback to system properties. So the missing piece at the moment is that MimeBodyPart currently has no way to get access to the session. But passing a MessageAware DataSource (like FileDataSource in my case) would allow this.

Question:

Why do you prefer putting MimeBodyPart-specific options in session properties - as opposed to set the directly on MimeBodyPart? Is this for consistency, because the system properties already exist? I mean the scope is smaller than system properties, but still larger. I am not doubting your approach, just trying to understand.

jmehrens commented 3 years ago

You have captured the idea of my intent. I'm sure once it is coded up, the flaws of my hand waving will come to light :).

The topic of session vs. system properties as come up before in the past. Unfortunately, I didn't draft up this approach and explain to @bshannon but were are his thoughts on a related topic:

In a mutli-threaded application, querying the property dynamically doesn't really help.

In most cases these static properties were not expected to change over the lifetime of an application.

In some cases, the use of static was an unfortunate compromise because the code that needed it didn't have access to the Session. Changing all the APIs to pass the Session through explicitly would've made the code more complex.

Thread local storage was considered, but that depends too heavily on the threading model used by the application.

Wherever possible, Session properties were used, but in some cases there was no good way to do that so System properties were used.

When I read that I see two trade-offs at play when the original code was developed:

  1. Session properties are preferred over system properties.
  2. Passing the session through explicitly makes the code more complex.

My thought is that going with "MessageAware DataSource" is actually inline with what has already been done in the past. E.G. MimeMultipart(DataSource). If needed my approach could work without adding new methods as getParent and setDataHandler already exist on the MimeBodyPart. My initial thoughts are that my solution would satisfy both trade offs above and make the behavior more like what has been done before already in this codebase.

Adding the setters has merit but, not sure we need to be able to change the properties at that scope. The setter approach also means that we have to split methods in to classes in the Jakarta namespace and the com.sun namespace based on if they are spec or reference implementation.

jbescos commented 3 years ago

I have done another commit to get the properties from session.

I found that we cannot get the InputStream from the DataSource because, although originally it comes from the DataSource, it changes during the MimeMultipart#parse. For this reason I created a new constructor having MimeBodyPart(Session session, InputStream is) similarly as the constructor that MimeMessage has.

jmehrens commented 3 years ago

@jbescos Reviewing this is still on my todo list. I'll get you a proper review once I can give this a test drive.