bbottema / simple-java-mail

Simple API, Complex Emails (Jakarta Mail smtp wrapper)
http://www.simplejavamail.org
Apache License 2.0

Unable to parse .msg file - Block 20736 not found #537

Closed by StuntsKhumza 2 weeks ago

StuntsKhumza commented 2 weeks ago

Hi all, hoping to get some assistance on the above issue.

We are currently experiencing failures when parsing .msg files using this library.

Sample code:

    URL resource = Main.class.getClassLoader().getResource(filePath);
    File inputFile = new File(resource.toURI());
    Email email = EmailConverter.outlookMsgToEmail(inputFile); // error occurs here

Please see stacktrace below:

    Exception in thread "main" java.lang.IndexOutOfBoundsException: Block 15616 not found
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.getBlockAt(POIFSFileSystem.java:474)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.readCoreContents(POIFSFileSystem.java:407)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:361)
        at org.simplejavamail.outlookmessageparser.OutlookMessageParser.parseMsg(OutlookMessageParser.java:138)
        at org.simplejavamail.outlookmessageparser.OutlookMessageParser.parseMsg(OutlookMessageParser.java:107)
        at org.simplejavamail.internal.outlooksupport.converter.OutlookEmailConverter.parseOutlookMsg(OutlookEmailConverter.java:188)
        at org.simplejavamail.internal.outlooksupport.converter.OutlookEmailConverter.outlookMsgToEmailBuilder(OutlookEmailConverter.java:65)
        at org.simplejavamail.converter.EmailConverter.outlookMsgToEmailBuilder(EmailConverter.java:210)
        at org.simplejavamail.converter.EmailConverter.outlookMsgToEmailBuilder(EmailConverter.java:197)
        at org.simplejavamail.converter.EmailConverter.outlookMsgToEmail(EmailConverter.java:176)
        at org.example.Main.main(Main.java:19)
    Caused by: java.lang.IndexOutOfBoundsException: Unable to read 512 bytes from 7995904 in stream of length 7995904
        at org.apache.poi.poifs.nio.ByteArrayBackedDataSource.read(ByteArrayBackedDataSource.java:48)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.getBlockAt(POIFSFileSystem.java:472)
        ... 10 more
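
The two numbers in the trace appear to line up: block 15616 would start exactly where the stream ends. A minimal sketch of that arithmetic follows, assuming the 512-byte sector size implied by "Unable to read 512 bytes" and the usual OLE2 layout in which a header of one sector precedes block 0. The class and method names are hypothetical and are not part of Apache POI or simple-java-mail.

```java
public class BlockOffsetCheck {
    // Assumption: 512-byte sectors, as implied by the "Caused by" line of the trace.
    static final long SECTOR_SIZE = 512L;

    // Block n starts after the header, which takes up one sector: (n + 1) * sectorSize.
    static long blockOffset(long blockIndex) {
        return (blockIndex + 1) * SECTOR_SIZE;
    }

    // A block is readable only if it lies entirely within the stream.
    static boolean blockFits(long blockIndex, long streamLength) {
        return blockOffset(blockIndex) + SECTOR_SIZE <= streamLength;
    }

    public static void main(String[] args) {
        long streamLength = 7_995_904L; // stream length reported in the trace
        long block = 15_616L;           // block index reported in the trace
        System.out.println("offset = " + blockOffset(block));            // offset = 7995904
        System.out.println("fits   = " + blockFits(block, streamLength)); // fits   = false
    }
}
```

Under these assumptions the file holds blocks 0 through 15615 only, so something in it refers to a block just past the end. That would fit a truncated or otherwise damaged file, though it does not prove it.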

bbottema commented 2 weeks ago

That smells of a corrupt .msg. Would it be possible to share it with me (in private perhaps)?

StuntsKhumza commented 2 weeks ago

Hi @bbottema unfortunately I can't share the .msg file as it is production user data. Is there any other way to investigate this further?

bbottema commented 2 weeks ago

In that case, I have no way to investigate this further. All I can see is that the message in question causes an exception when being read by Apache POI (so not even in this library itself). I could try to see whether the library might be able to continue if it caught and ignored the exception, but for analysis on that (such as determining corrupted or missing data) I would need a reproducible example.
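
Until something like that exists in the library, a caller can at least keep a batch job running by catching the exception around the conversion. The following is a minimal caller-side sketch, not the library's behaviour: `SafeMsgParse` and `tryParse` are hypothetical names, and the helper skips the whole message rather than recovering any partial content.

```java
import java.io.File;
import java.util.Optional;
import java.util.function.Function;

public class SafeMsgParse {
    // Hypothetical helper: runs the given parser and maps an unreadable file to an empty result.
    static <T> Optional<T> tryParse(File file, Function<File, T> parser) {
        try {
            return Optional.ofNullable(parser.apply(file));
        } catch (IndexOutOfBoundsException e) {
            // Same exception type as in the stack trace reported in this issue.
            System.err.println("Skipping unreadable file " + file.getName() + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

With the sample code from this issue, the call would presumably look like `SafeMsgParse.tryParse(inputFile, EmailConverter::outlookMsgToEmail)`, assuming that method declares no checked exceptions. Logging the skipped file names gives a list of candidates to check in Outlook itself.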