pst-format / libpst

library for reading Microsoft Outlook PST files
GNU General Public License v2.0

Microsoft documentation vs msg.cpp #8

Open inzanez opened 1 year ago

inzanez commented 1 year ago

Hi

I was wondering why the values used in "msg.cpp" for "msg" properties differ from the official Microsoft documentation found here.

For instance, PidTagBodyHtml should be 0x1013001F, but in msg.cpp it is 0x1013001E. In fact, pretty much all values ending in E should end in F according to the Microsoft documentation.
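To make the E/F difference concrete, here is a minimal sketch of how the two tags compare. It assumes the standard MAPI layout, where the low 16 bits of a tag are the property type, and the documented type codes 0x001E (PT_STRING8, an 8-bit string) and 0x001F (PT_UNICODE, a UTF-16 string). The helper names `tag_id`, `tag_type`, and `string_kind` are hypothetical and are not taken from msg.cpp.

```cpp
#include <cstdint>

// MAPI property type codes (the low 16 bits of a property tag).
constexpr std::uint16_t PT_STRING8 = 0x001E; // 8-bit string in some code page
constexpr std::uint16_t PT_UNICODE = 0x001F; // UTF-16 string

// Hypothetical helpers for illustration only; not part of libpst.
constexpr std::uint16_t tag_id(std::uint32_t tag) {
    return static_cast<std::uint16_t>(tag >> 16);      // high 16 bits
}

constexpr std::uint16_t tag_type(std::uint32_t tag) {
    return static_cast<std::uint16_t>(tag & 0xFFFFu);  // low 16 bits
}

// Name the string flavour a tag carries, if it is a string at all.
constexpr const char* string_kind(std::uint32_t tag) {
    return tag_type(tag) == PT_UNICODE ? "PT_UNICODE"
         : tag_type(tag) == PT_STRING8 ? "PT_STRING8"
         : "not a string type";
}
```

Under these assumptions, 0x1013001E and 0x1013001F share the same property id (0x1013) and differ only in the string type they declare.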

Any idea how that happened?

pabs3 commented 1 year ago

The author of that code isn't involved in libpst any more and the code doesn't give any explanation of what happened. You could try to ask them via email but I think they likely won't respond quickly or at all.

If you know of any sample PST files where PidTagBodyHtml is used, then we should be able to verify the issue and fix the code or spec.

I note that another open source PST library has the same issue, so I think that this is either a bug in the standard or the value changed.

https://github.com/libyal/libfmapi/blob/main/libfmapi/libfmapi_property_type.c#L298

For our future reference, here are the docs on PidTagBodyHtml:

https://docs.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-oxprops/8ba4bc78-ab04-49c3-8ebb-e7325fa74ace

-- bye, pabs

https://bonedaddy.net/pabs3/

inzanez commented 1 year ago

@pabs3 OK, it seems that there are two versions of msg containers:

pabs3 commented 1 year ago

I see. Have you seen the UTF-16/F one in the wild? If you want to implement support for it, any patches would be welcome.

inzanez commented 1 year ago

@pabs3 Well, for PST files themselves: pretty much all new PST files are Unicode and not ASCII. And I think it would be the same for all MSG files that are exported by modern software with sane defaults.

I am afraid that I currently don't have the time to build that support. Maybe in the future. :-)

pabs3 commented 1 year ago

OK. If you could attach a sample PST file that illustrates the issue, then that would help whoever ends up working on this issue. You may need to enclose the file within a ZIP file to bypass GitHub filters.

inzanez commented 1 year ago

@pabs3 I will do so once I find one I can post easily. Otherwise I will try to make one myself (using a Microsoft product). Actually, all recent "pst" files created by Outlook should be UTF.

pabs3 commented 1 year ago

@inzanez did you manage to find/create a PST file with UTF-16 PidTagBodyHtml that you can share?

inzanez commented 1 year ago

@pabs3 Oh yes, thanks for the reminder. I will get back to you soon.

pabs3 commented 1 year ago

@inzanez did you manage to find/create a PST file with UTF-16 PidTagBodyHtml that you can share?

pabs3 commented 1 year ago

@inzanez btw I refactored the code to split 0x1013001F up into two values, because it is clear from the docs that 0x001F is the type and 0x1013 is the id.

https://learn.microsoft.com/en-us/office/client-developer/outlook/mapi/pidtagbodyhtml-canonical-property

https://learn.microsoft.com/en-us/office/client-developer/outlook/mapi/property-types
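For future readers, the split described above can be sketched roughly as follows. This is only an illustration of treating the tag as an (id, type) pair. The `PropertyTag` struct and the `split_tag` / `join_tag` functions are hypothetical names, not the ones used in the actual libpst refactoring.

```cpp
#include <cstdint>

// A property tag held as two separate 16-bit values.
// Hypothetical type for illustration; not the libpst definition.
struct PropertyTag {
    std::uint16_t id;    // e.g. 0x1013 for PidTagBodyHtml
    std::uint16_t type;  // e.g. 0x001F (PT_UNICODE) or 0x001E (PT_STRING8)
};

// Split a combined 32-bit tag into id (high word) and type (low word).
constexpr PropertyTag split_tag(std::uint32_t tag) {
    return PropertyTag{ static_cast<std::uint16_t>(tag >> 16),
                        static_cast<std::uint16_t>(tag & 0xFFFFu) };
}

// Recombine the pair into the 32-bit form used in the documentation.
constexpr std::uint32_t join_tag(PropertyTag t) {
    return (static_cast<std::uint32_t>(t.id) << 16) | t.type;
}
```

Keeping the two halves separate means a lookup can match on the id alone and then decide how to decode the value based on the type, so the same property can be handled whether it is stored as 0x001E or 0x001F.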

pabs3 commented 1 year ago

@inzanez btw did you see some issue caused by the 0x1013001F vs 0x1013001E thing, or was it just a curiosity after reading the specs?