aquasync / ruby-msg

A library for reading and converting Outlook msg and pst files (mapi message stores)
MIT License

Warning when parsing .msg file from Outlook - unknown encoding 72 #17

Open nocoli opened 4 years ago

nocoli commented 4 years ago

Getting an unknown encoding 72 warning when trying to parse any .msg file saved to my computer from Outlook.

Steps to reproduce

  1. Drag any email from your inbox to somewhere on your local disk (e.g. the Desktop) to save it.
  2. Try the following code and you get the warnings:
    test = Mapi::Msg.open File.expand_path("path_to_file.msg")
    [10:09:05 /home/nic/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/ruby-msg-1.5.2/lib/mapi/msg.rb:243:parse_substg]
    WARN   unknown encoding 72
    [10:09:05 /home/nic/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/ruby-msg-1.5.2/lib/mapi/msg.rb:243:parse_substg]
    WARN   unknown encoding 72
    [10:09:05 /home/nic/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/ruby-msg-1.5.2/lib/mapi/msg.rb:243:parse_substg]
    WARN   unknown encoding 72
    #<Msg message_class="IPM.Note" from=nil to=nil subject="RE: test" recipients=[#<Recipient:"\"Nocoli\" <nocoli@nocoli.com>">, #<Recipient:"\"Nocoli\" <test@test.com>">] attachments=[]>

    Note: I changed the recipient names and addresses in the output above.

As a side note, I'm not sure why the Msg object above shows nil for the to and from properties, when test.to and test.from return the correct values.

Any help or info on this would be much appreciated.

Environment

Windows 10 Outlook Version 2004 Monthly Channel

aquasync commented 4 years ago

I haven't seen that unknown encoding before; perhaps something has changed in newer versions of Outlook. From a quick search, it seems to be used for GUIDs. It's probably an easy fix, but for now the library will just drop the affected properties.
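
For illustration, here is a minimal sketch of how such a value could be turned into a readable string. It assumes that encoding 72 is the MAPI GUID property type (0x0048 in hex) and that the property holds the usual 16-byte GUID layout. The helper name `format_guid` is hypothetical and not part of ruby-msg.

```ruby
# Hypothetical helper, not part of ruby-msg: formats a 16-byte GUID such as
# one stored in a MAPI property of type 0x0048 (decimal 72).
def format_guid(bytes)
  unless bytes.bytesize == 16
    raise ArgumentError, "expected 16 bytes, got #{bytes.bytesize}"
  end

  # The first three fields are little-endian integers; the remaining
  # 8 bytes are printed in the order they are stored.
  d1, d2, d3, d4, d5 = bytes.unpack('VvvnH12')
  format('{%08x-%04x-%04x-%04x-%s}', d1, d2, d3, d4, d5)
end

# Example input: 16 arbitrary bytes standing in for a real property value.
raw = [0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
       0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff].pack('C*')
puts format_guid(raw)
```

A decoder along these lines would let the properties be kept instead of dropped, but where it would hook into the library is not shown here.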

As for the inspect string not working: I think it must have been broken for a while (could be 10 years!). The code seems to forward a few messages to the underlying MAPI property store (i.e. what is displayed as from is test.props.from), but that is not a valid MAPI property name. What you get when you call test.from actually comes from combining a few properties, sender_name and sender_email_address, possibly merged with the result of parsing transport_message_headers. Anyway, it should be fixed.
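
A rough sketch of the kind of combination described above, assuming a plain Hash stands in for the property store. The method name `display_from` and the output format (modelled on the recipient strings in the inspect output) are assumptions, not ruby-msg's actual implementation, and the header-parsing step is left out.

```ruby
# Hypothetical sketch: `props` is a plain Hash standing in for the MAPI
# property store. ruby-msg's real logic may differ.
def display_from(props)
  name  = props[:sender_name]
  email = props[:sender_email_address]

  return nil if name.nil? && email.nil?
  return name if email.nil?
  return "<#{email}>" if name.nil?

  # Quote the display name and wrap the address in angle brackets.
  %("#{name}" <#{email}>)
end

puts display_from(sender_name: 'Nocoli', sender_email_address: 'nocoli@nocoli.com')
```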

nocoli commented 4 years ago

Thanks for your response. Dropping the affected properties should be okay, as I only need this gem to produce a preview of the file. After a closer look at the properties and methods, most of the file seems to be intact from what I can tell.

I actually think this might be related to #9, as I get the same error when calling msg.to_mime.to_s, and on inspecting msg.to_mime.parts I can see 2 parts with different content_types, e.g. text/plain and text/html.

Seeing as I can interact with each MIME part, I think I'll be able to get around my issue. I've been testing with a local file on my computer, but my last (hopefully) hurdle is opening the file after downloading it into memory. I can only see a way to read the file from disk. Is opening it from memory possible? Thanks

aquasync commented 4 years ago

I believe you can use StringIO to pass a buffer to Msg.open. Having two separate parts is standard for how multipart/alternative works: it is likely the same message in both plain text and HTML. The #9 encoding error is actually caused by combining strings from the source code (which have no specified encoding, often UTF-8) with strings from the msg itself. The MIME format has its own way of specifying encodings, and the correct approach is to use that and treat the aggregated data as ASCII-8BIT (i.e. just bytes), but I've not got round to it.
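
A small sketch of both ideas, with some assumptions flagged. The helper names (`msg_buffer`, `open_msg_from_memory`, `join_as_bytes`) are hypothetical, and whether `Mapi::Msg.open` accepts an IO-like object is taken from the suggestion above rather than verified. The `require 'mapi/msg'` sits inside the method so the rest of the sketch runs without the gem installed.

```ruby
require 'stringio'

# Wrap raw .msg bytes (e.g. a downloaded HTTP body) in an IO-like object.
# String#b returns a copy tagged as ASCII-8BIT, so no transcoding happens.
def msg_buffer(bytes)
  StringIO.new(bytes.b)
end

# Assumption: Mapi::Msg.open accepts an IO-like object, as suggested above.
def open_msg_from_memory(bytes)
  require 'mapi/msg' # ruby-msg gem
  Mapi::Msg.open(msg_buffer(bytes))
end

# Join fragments as raw bytes. Mixing a UTF-8 string and a binary string
# that both contain non-ASCII bytes raises Encoding::CompatibilityError,
# so every fragment is tagged as binary first.
def join_as_bytes(*fragments)
  fragments.each_with_object(String.new(encoding: Encoding::BINARY)) do |frag, out|
    out << frag.b
  end
end

header = "Subject: caf\u00e9\r\n" # UTF-8 string from source code
body   = "\xE9t\xE9".b            # raw bytes as they might come from the msg
joined = join_as_bytes(header, body)
puts joined.encoding
puts joined.bytesize
```

With the gem installed, usage would look something like `msg = open_msg_from_memory(downloaded_bytes)`, where `downloaded_bytes` is whatever string your HTTP client returned.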