bbottema / outlook-message-parser

A Java parser for Outlook messages (.msg files)

#76 the key is guiding the creation of the list of recipients #81

Open sanastasiadis opened 2 months ago

sanastasiadis commented 2 months ago

Instead of checking whether the recipient's name occurs in the key, split the key into its semicolon-separated entries and look each entry up in the general list of recipients.
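
A minimal sketch of the lookup described above, not the actual patch. The `Recipient` class is a hypothetical stand-in for the library's `OutlookRecipient`, and `resolve` is an illustrative name; exact name matching and first-match selection are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class RecipientKeyLookup {

    // Hypothetical stand-in for OutlookRecipient; only name and address are modeled.
    static final class Recipient {
        final String name;
        final String address;

        Recipient(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    // Splits a semicolon-separated key (e.g. "Alice; Bob") and, for each entry,
    // adds the first recipient in the general list whose name equals that entry.
    static List<Recipient> resolve(String key, List<Recipient> allRecipients) {
        List<Recipient> result = new ArrayList<>();
        if (key == null) {
            return result;
        }
        for (String entry : key.split(";")) {
            String wanted = entry.trim();
            if (wanted.isEmpty()) {
                continue; // skip empty entries such as a trailing ";"
            }
            for (Recipient candidate : allRecipients) {
                if (candidate.name.equals(wanted)) {
                    result.add(candidate);
                    break; // one recipient per key entry
                }
            }
        }
        return result;
    }
}
```

The key drives the result here: the returned list has at most one recipient per key entry, in key order, regardless of how many recipients in the general list share a name.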

bbottema commented 2 months ago

This breaks parsing of the "simple reply with CC.msg" test message, which is covered by HighoverEmailsTest.testToCcBcc(). With this change, that test no longer finds all CC addresses.

sanastasiadis commented 2 months ago

Sorry about that. I made a correction and ran the tests to confirm that they pass.

bbottema commented 2 months ago

Hi!

> Instead of checking whether the recipient's name occurs in the key, split the key into its semicolon-separated entries and look each entry up in the general list of recipients.

Can you please tell me which problem this solves? Do you have an Outlook message that doesn't parse properly without this change? Thanks!

sanastasiadis commented 2 months ago

This resolves issue #76. The file attached to that issue (CC duplicate recipients bug.msg) can be used as an example.

It contains a "to" recipient with the same name as a "cc" recipient, though with a different email address.

With the current code, after parsing, getToRecipients returns both OutlookRecipients, and getCcRecipients also returns both, which is wrong.
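
To illustrate why both lists end up with both recipients, here is a small sketch that assumes the existing check amounts to a substring test of each recipient's name against the key. `Recipient` and `byContainment` are hypothetical names, not the library's API.

```java
import java.util.ArrayList;
import java.util.List;

class NameContainmentDemo {

    // Hypothetical stand-in for OutlookRecipient.
    static final class Recipient {
        final String name;
        final String address;

        Recipient(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    // Keeps every recipient whose name occurs anywhere in the key.
    // Two recipients sharing a name therefore both match the same key.
    static List<Recipient> byContainment(String key, List<Recipient> allRecipients) {
        List<Recipient> result = new ArrayList<>();
        for (Recipient r : allRecipients) {
            if (key.contains(r.name)) {
                result.add(r);
            }
        }
        return result;
    }
}
```

With two recipients named "John Doe" in the general list and a "to" key of "John Doe", `byContainment` returns both of them; the same happens for the "cc" key.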

With the fix, getToRecipients and getCcRecipients each return only one OutlookRecipient. However, it is not guaranteed that each list gets the recipient with the correct address (one is returned with the correct address and the other with the wrong one), so the issue is only partially resolved.
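
The remaining ambiguity can be shown with a short sketch. It assumes a lookup by name alone that takes the first match; `Recipient` and `firstByName` are hypothetical, and the addresses are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class AmbiguousNameDemo {

    // Hypothetical stand-in for OutlookRecipient.
    static final class Recipient {
        final String name;
        final String address;

        Recipient(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    // Returns the first recipient whose name equals the given entry, or null if none matches.
    static Recipient firstByName(String name, List<Recipient> allRecipients) {
        for (Recipient r : allRecipients) {
            if (r.name.equals(name)) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Recipient> all = Arrays.asList(
            new Recipient("John Doe", "john.to@example.com"),  // intended "to" recipient
            new Recipient("John Doe", "john.cc@example.com")); // intended "cc" recipient

        Recipient to = firstByName("John Doe", all);
        Recipient cc = firstByName("John Doe", all);

        System.out.println(to.address); // prints john.to@example.com
        System.out.println(cc.address); // prints john.to@example.com, the wrong address for "cc"
    }
}
```

Each list gets exactly one recipient, but the name carries no information about which address belongs to which list, so a name-only lookup cannot choose correctly in general.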

To highlight the issue, I added one more msg file (test subject duplicated recipients.msg), which contains exactly the same recipient as both "to" and "cc". The current code returns the same OutlookRecipient twice for both getToRecipients and getCcRecipients. In this case, the fix corrects the number of recipients in each list as well as their names and addresses.

bbottema commented 2 months ago

Unfortunately, this doesn't completely solve the bug yet. The number of recipients is correct, but the actual recipient addresses are swapped (TO becomes CC and CC becomes TO).

See branch bug_duplicate_recipients for a proper test case, HighoverEmailsTest.testDuplicateRecipientsBug. As you can see, it still fails when I merge your changes into it. Could you have a look please?