Sicos1977 / MSGReader

C# Outlook MSG file reader without the need for Outlook
http://sicos1977.github.io/MSGReader
MIT License

Need TimeZoneInfo on message #407

Closed Xenophage666 closed 3 months ago

Xenophage666 commented 3 months ago

Describe the bug Hi there! I have a request for this project. I have a few ideas for how to handle it, but I'll leave the implementation to you. We are using the MSGReader library in an AWS Lambda project. When we open an email with this library locally, the SentOn field is correct, but when the same code runs in the AWS cloud environment, SentOn is in UTC. Is there any way to make sure SentOn always reflects the original time the email was sent? Or, to avoid modifying existing functionality, could you add another field that contains a TimeZoneInfo, or a DateTime object that carries the time zone? I checked the email we are testing with, and it does contain time zone info right next to the time used to populate SentOn.

Our only alternative right now is to append "UTC" to the rendered date and time, which is not ideal. I can't convert using local time because the Lambda pushes the message through an MSG-to-PDF converter, so the final output is a read-only PDF.

Let me know what you think, or if you have any other questions. Appreciate the help on this!

Thanks! -Trevor

To Reproduce

  1. Use this MSGReader to open an email with header data
  2. Look at the SentOn field.

Expected behavior One of the following: the SentOn field is not converted to UTC but retains its time zone info; or a new field exposes the time zone data; or a new field exposes the original date and time together with its time zone.


Desktop (please complete the following information): Visual Studio 2022 AWS Lambda

Smartphone (please complete the following information): none

Additional context none

Sicos1977 commented 3 months ago

As far as I know, the sent time is always stored as UTC so that you don't have any problems reading the property in another timezone. So I guess that AWS just uses UTC and that you have to take into account which timezone you are reading it in.
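A minimal sketch of that idea: the same UTC value reads as a different wall-clock time depending on the zone it is viewed in. `ViewInFixedZone` is a hypothetical helper, not part of MSGReader, and it assumes a fixed-offset zone with no daylight saving rules so that the result does not depend on the time zone data installed on the machine.

```csharp
using System;

public static class UtcDisplayExample
{
    // Hypothetical helper (not MSGReader API): view a UTC timestamp in a
    // zone that has a fixed offset from UTC and no daylight saving rules.
    public static DateTime ViewInFixedZone(DateTime utc, int offsetHours)
    {
        var zone = TimeZoneInfo.CreateCustomTimeZone(
            "Fixed" + offsetHours,            // arbitrary id for the custom zone
            TimeSpan.FromHours(offsetHours),  // base offset from UTC
            "Fixed zone",
            "Fixed zone");

        // The input must have Kind Utc or Unspecified, otherwise this throws.
        return TimeZoneInfo.ConvertTimeFromUtc(utc, zone);
    }
}
```

With the timestamp from this thread, 17:09:34 UTC on 10 Apr 2023 reads as 10:09:34 in a UTC-7 zone and is unchanged on a machine running in UTC, which would match what is seen locally versus in Lambda.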

Sicos1977 commented 3 months ago

Can you send me your msg file so that I can have a look at it? Please ZIP the file before sending it to sicos2002@hotmail.com

Xenophage666 commented 3 months ago

I can't send the entire email in question because it is sensitive, but I am including the message source data with some fields removed; the relevant fields are still there. You can see that some of the Received fields contain "Mon, 10 Apr 2023 10:07:23 -0700", with the time zone included. Ideally, the time portion of the SentOn field should show "... 10:07 AM PST". Meanwhile, I'll try to generate a test email that has this same issue.

Thanks for your help on this!

Received: from SA1PR08MB7151.namprd08.prod.outlook.com (2603:10b6:806:18a::12) by BYAPR08MB4903.namprd08.prod.outlook.com with HTTPS; Mon, 10 Apr 2023 17:09:49 +0000
Received: from BN9PR03CA0787.namprd03.prod.outlook.com (2603:10b6:408:13f::12) by SA1PR08MB7151.namprd08.prod.outlook.com (2603:10b6:806:18a::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.38; Mon, 10 Apr 2023 17:09:35 +0000
Received: from BN8NAM12FT022.eop-nam12.prod.protection.outlook.com (2603:10b6:408:13f:cafe::cc) by BN9PR03CA0787.outlook.office365.com (2603:10b6:408:13f::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.38 via Frontend Transport; Mon, 10 Apr 2023 17:09:35 +0000
Authentication-Results: spf=pass (sender IP is 209.85.167.45) smtp.mailfrom=gmail.com; dkim=pass (signature was verified) header.d=gmail.com;dmarc=pass action=none header.from=gmail.com;compauth=pass reason=100
Received-SPF: Pass (protection.outlook.com: domain of gmail.com designates 209.85.167.45 as permitted sender) receiver=protection.outlook.com; client-ip=209.85.167.45; helo=mail-lf1-f45.google.com; pr=C
Received: from mx0b-00128103.pphosted.com (205.220.172.180) by BN8NAM12FT022.mail.protection.outlook.com (10.13.183.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6298.27 via Frontend Transport; Mon, 10 Apr 2023 17:09:34 +0000
Received: from pps.filterd (m0352049.ppops.net [127.0.0.1]) by mx0b-00128103.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33AEnBTv004960 for xxx@firstam.com; Mon, 10 Apr 2023 10:09:34 -0700
Authentication-Results-Original: ppops.net; spf=pass smtp.mailfrom=xxx@gmail.com; dkim=pass header.s=20210112 header.d=gmail.com; dmarc=pass header.from=gmail.com
Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-00128103.pphosted.com (PPS) with ESMTPS id 3pvm84gt73-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for xxx@firstam.com; Mon, 10 Apr 2023 10:09:34 -0700
Received: from pps.reinject (m0352049.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 33AH3Xb7021183 for xxx@firstam.com; Mon, 10 Apr 2023 10:09:33 -0700
Received: from mail-lf1-f45.google.com (mail-lf1-f45.google.com [209.85.167.45]) by mx0b-00128103.pphosted.com (PPS) with ESMTPS id 3pvm84gsvr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for xxx@firstam.com; Mon, 10 Apr 2023 10:07:23 -0700
Received: by mail-lf1-f45.google.com with SMTP id t20so7150543lfd.5 for xxx@firstam.com; Mon, 10 Apr 2023 10:07:22 -0700 (PDT)
X-Google-Smtp-Source: AKy350ah2NJ+OIQyZ3lavCiug0zzjo/iU1C2q9xLFjxi01eIwIhCZr7Vsww3NZBhNSaCDG/lhk6cwMU5rjdh7SL/Pao=
X-Received: by 2002:a19:ac06:0:b0:4eb:1606:48d5 with SMTP id g6-20020a19ac06000000b004eb160648d5mr2171535lfc.7.1681146440717; Mon, 10 Apr 2023 10:07:20 -0700 (PDT)
MIME-Version: 1.0
From: Bill Hurd xxx@gmail.com
Date: Mon, 10 Apr 2023 11:07:08 -0600
Message-ID: CAKmEThsXVO4sXOD-uBP8Jjp3vKUj-sz+NCqvo1G7KRrP2s5xZg@mail.gmail.com
Subject: [External] Buyers Cancellation - 380 Hunters Ridge Cir
To: Jose Ornelas xxx@firstam.com, Jen Bohn xxx@gmail.com, Jeff Bohn xxx@gmail.com
Content-Type: multipart/mixed; boundary="0000000000006a353705f8fe687b"

X-Proofpoint-GUID: CPYUuba9QeqCWZEJDhEbBuDdAtcZvJwn
X-Proofpoint-ORIG-GUID: jYIcEjO_9PBsBzdn5MVYP5YUGxq_DtcP
X-CLX-Shades: Deliver
X-External-Message: external
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-10_12,2023-04-06_03,2023-02-09_01
X-Proofpoint-Spam-Details: rule=inbound_noblock_bulk_notspam policy=inbound_noblock_bulk score=0 mlxlogscore=999 phishscore=0 malwarescore=0 clxscore=1005 bulkscore=0 suspectscore=0 mlxscore=0 priorityscore=338 spamscore=0 lowpriorityscore=0 adultscore=0 impostorscore=0 classifier=clx:Deliver adjust=0 reason=mlx scancount=2 engine=8.12.0-2303200000 definitions=main-2304100147 domainage_hfrom=10102
Return-Path: xxx@gmail.com
X-MS-Exchange-Organization-ExpirationStartTime: 10 Apr 2023 17:09:35.1208 (UTC)
X-MS-Exchange-Organization-ExpirationStartTimeReason: OriginalSubmit
X-MS-Exchange-Organization-ExpirationInterval: 1:00:00:00.0000000
X-MS-Exchange-Organization-ExpirationIntervalReason: OriginalSubmit
X-MS-Exchange-Organization-Network-Message-Id: fa7f34a5-c3df-470c-a0f5-08db39e65bf6
X-EOPAttributedMessage: 0
X-EOPTenantAttributedMessage: 4cc65fd6-9c76-4871-a542-eb12a5a7800c:0
X-MS-Exchange-Organization-MessageDirectionality: Incoming
X-MS-Exchange-SkipListedInternetSender: ip=[209.85.167.45];domain=mail-lf1-f45.google.com
X-MS-Exchange-ExternalOriginalInternetSender: ip=[209.85.167.45];domain=mail-lf1-f45.google.com
X-MS-PublicTrafficType: Email
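For illustration, the date values in the Received and Date headers above can be parsed into a `DateTimeOffset`, which keeps both the wall-clock time and the offset. This is a rough sketch, not MSGReader's own header parsing: `ParseHeaderDate` is a hypothetical helper, and it assumes the value has already been cut out of the header line and follows the `ddd, d MMM yyyy HH:mm:ss ±hhmm` shape seen in this message.

```csharp
using System;
using System.Globalization;

public static class HeaderDateExample
{
    // Hypothetical helper (not MSGReader API): parse a header date such as
    // "Mon, 10 Apr 2023 10:07:23 -0700", keeping the offset from the header.
    public static DateTimeOffset ParseHeaderDate(string value)
    {
        // Drop a trailing comment such as "(PDT)" if one is present.
        var open = value.IndexOf('(');
        if (open >= 0)
            value = value.Substring(0, open);

        // When parsing, "zzz" accepts the offset with or without a colon.
        return DateTimeOffset.ParseExact(
            value.Trim(),
            "ddd, d MMM yyyy HH:mm:ss zzz",
            CultureInfo.InvariantCulture);
    }
}
```

Parsing "Mon, 10 Apr 2023 10:07:23 -0700" this way gives an offset of -07:00 and a UTC time of 17:07:23, so nothing is lost when the value later moves between environments.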

Xenophage666 commented 3 months ago

Hey, I also emailed you a spam message I got containing the time zone data, it would have come from my work email, traridon [a] firstam.com.

Thanks!

Sicos1977 commented 3 months ago

I added this code to try to detect the timezone --> 517b495af53e60827027c237e5ec01e4c91ef56d

Get the code from GitHub, try it out, and let me know if it does what you need. One thing I'm not sure of is whether this code also takes daylight saving time into account.

Xenophage666 commented 3 months ago

Hey Sicos1977,

I reviewed the code, this looks like it will work! Thanks for the quick turnaround on this. I will need some time to test it out. Thanks!

Xenophage666 commented 3 months ago

Unfortunately, this builds against an older .NET version that I cannot install on my work machine, so I cannot compile it locally. From what I can tell from the code, though, this should do what we need. Can you go ahead and update the NuGet package? Then I can update on my end and test it that way. Thanks!

Sicos1977 commented 3 months ago

Get the latest NuGet package.

Xenophage666 commented 3 months ago

Hi Sicos1977, I believe there is an issue with this. The SentOn time is being pulled from one location, and the TimeZone is being pulled from another.

Going from my example above, the SentOn field is returning "Mon, 10 Apr 2023 10:09:34", which has the -0700 time zone. But the TimeZone field is returning -0600, which comes from this field: Date: Mon, 10 Apr 2023 11:07:08 -0600

So the times are different.

The code in the SentOn property has this: _sentOn = GetMapiPropertyDateTime(MapiTags.PR_CLIENT_SUBMIT_TIME) ?? GetMapiPropertyDateTime(MapiTags.PR_PROVIDER_SUBMIT_TIME);

But the TimeZone does not, so I believe this is where the discrepancy is. We need to make sure they are pulling the time from the same location. Maybe add the above code to the TimeZone property as well?

Thanks!
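To make the discrepancy concrete, here is a small sketch using the two values quoted above. Attaching the -0600 offset from the Date header to a wall-clock time that was produced at -0700 shifts the instant by an hour, whereas re-expressing the same instant at -0600 keeps it intact. Both method names are hypothetical and only illustrate the arithmetic.

```csharp
using System;

public static class OffsetMismatchExample
{
    // Inconsistent: keeps the wall-clock digits and attaches a different
    // offset, which silently changes the instant being described.
    public static DateTimeOffset Relabel(DateTime wallClock, TimeSpan offset)
    {
        var unspecified = DateTime.SpecifyKind(wallClock, DateTimeKind.Unspecified);
        return new DateTimeOffset(unspecified, offset);
    }

    // Consistent: keeps the instant and recomputes the wall-clock digits
    // for the requested offset.
    public static DateTimeOffset Reexpress(DateTimeOffset instant, TimeSpan offset)
    {
        return instant.ToOffset(offset);
    }
}
```

Starting from 10:09:34 at -07:00, `Relabel` with -06:00 yields 10:09:34 -06:00 (16:09:34 UTC, one hour early), while `Reexpress` yields 11:09:34 -06:00, which is still 17:09:34 UTC.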

Sicos1977 commented 3 months ago

There is no timezone info in PR_CLIENT_SUBMIT_TIME or PR_PROVIDER_SUBMIT_TIME; this is always local time, so I can't use that for timezone information. The only place where I can find timezone information is in the headers.

Xenophage666 commented 3 months ago

Hi Sicos, I was digging into this more, and I found in the Storage.cs file, the method: private object GetMapiPropertyFromPropertyStream(string propIdentifier)

This has a switch statement to return the various data types, but I found this for the DateTime type:

    case PropertyType.PT_SYSTIME:
        var fileTime = BitConverter.ToInt64(propBytes, i + 8);
        return DateTime.FromFileTime(fileTime);

I believe the DateTime.FromFileTime() is what's converting the time to a local time. Is there another method we can use that would allow us to retain the time zone?
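As a point of comparison, `DateTime.FromFileTime` returns local time, while `DateTime.FromFileTimeUtc` returns the same instant with `Kind` set to `Utc`, so the result no longer depends on the machine's time zone. The sketch below is not the code in Storage.cs: `ReadSysTimeAsUtc` is a hypothetical helper, and the byte offset is passed in by the caller rather than taken from the property stream layout.

```csharp
using System;

public static class FileTimeExample
{
    // Hypothetical helper (not the Storage.cs implementation): read an
    // 8-byte Windows FILETIME from a buffer and return it as UTC.
    public static DateTime ReadSysTimeAsUtc(byte[] propBytes, int offset)
    {
        // FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
        var fileTime = BitConverter.ToInt64(propBytes, offset);

        // FromFileTimeUtc performs no local-time conversion; Kind is Utc.
        return DateTime.FromFileTimeUtc(fileTime);
    }
}
```

A caller that wants local time can still call `ToLocalTime()` on the result, but the stored instant itself stays the same on every machine.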

It seems to me that since the msg file contains time zone info in its source data, we should be able to get at it. When you open the msg file in Outlook and view the details, you can see the times and time zones as flat text. It would be nice to have a TimeZone_SentOn and a TimeZone_ReceivedOn, or something like that.

Let me know if this is possible. Thanks!

Sicos1977 commented 2 months ago

I changed all DateTime properties to DateTimeOffset properties so you can now do whatever you want with a DateTime field from whatever TimeZone you are in.
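A short sketch of what that allows on the consuming side, assuming a `DateTimeOffset` value such as the one SentOn is described as returning after this change. `Render` is a hypothetical helper for something like the PDF output mentioned earlier in the thread, and the display offset is chosen by the caller.

```csharp
using System;
using System.Globalization;

public static class OffsetRenderExample
{
    // Hypothetical helper (not MSGReader API): render a DateTimeOffset at a
    // chosen display offset, independent of the machine's time zone.
    public static string Render(DateTimeOffset value, TimeSpan displayOffset)
    {
        return value
            .ToOffset(displayOffset) // same instant, new wall-clock digits
            .ToString("ddd, d MMM yyyy HH:mm:ss zzz", CultureInfo.InvariantCulture);
    }
}
```

For a value of 17:09:34 +00:00 on 10 Apr 2023 and a display offset of -7 hours, this produces "Mon, 10 Apr 2023 10:09:34 -07:00" whether the code runs locally or in a UTC environment such as Lambda.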