Closed Astol closed 7 years ago
Are you saying that the content of this "Untitled Attachment" is a message? I don't really understand what you are asking or trying to do.
Yes, the untitled attachment is a message. The original message I am trying to parse is an email that has another email as an attachment. Because they are both in RTF format, it gets a bit messy.
I can't find a suitable way to continue parsing the email attachment. Does that make it clearer?
Something like this, but I can't get MimeKit to handle the RTF-formatted mail attachment.
If mime_part.ContentType.MediaSubtype = "ms-tnef" And mime_part.ContentType.Name = "winmail.dat" Then
    Dim multi As Tnef.TnefPart = DirectCast(mime_part, MimeKit.Tnef.TnefPart)
    Dim parts As System.Collections.IEnumerable = multi.ExtractAttachments()
    For Each part As MimeEntity In parts
        parse_mail(part, filepath)
    Next
ElseIf mime_part.ContentType.MediaSubtype = "octet-stream" Then
    Try
        Dim part As MimeKit.MessagePart = DirectCast(mime_part, MimeKit.MessagePart)
        parse_mail(part.Message.Body, filepath)
    Catch ex As Exception
        Debug.Print(ex.Message)
    End Try
End If
End If
If mime_part.ContentType.MediaType = "message" Then
    Dim part As MimeKit.MessagePart = DirectCast(mime_part, MimeKit.MessagePart)
    parse_mail(part.Message.Body, filepath)
End If
So where you are going wrong is that you cannot cast a MimePart to a MessagePart. MessagePart does not inherit from MimePart, it inherits from MimeEntity.
What you need to do is decode the content of the MimePart and then use MimeKit.Tnef.TnefReader to parse the decoded content manually.
As a cheat, you could do this (and bear with me because I don't know VB.NET):
var tnef = new TnefPart ();
tnef.ContentObject = mime_part.ContentObject;
// now you can use tnef.ExtractAttachments()
As for the rest of your code, it is not written very safely.
It would be better to check if the mime_part is a TnefPart instead of checking the ContentType properties. Also, a tnef part might not have a name of "winmail.dat".
In C#, you would do it like this:
if (mime_part is TnefPart) {
    var multi = (TnefPart) mime_part;
    ...
}
if (mime_part is MessagePart) {
    var part = (MessagePart) mime_part;
}
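Putting these type checks together, a recursive walk over a message might look like the minimal sketch below. The MimeWalker class, the Walk method, and the onLeaf callback are hypothetical names introduced only for illustration; Multipart, MessagePart, TnefPart, MimePart, and ExtractAttachments() are the MimeKit types and method already discussed in this thread. TnefPart is tested before MimePart because it derives from MimePart.

```csharp
using System;
using MimeKit;
using MimeKit.Tnef;

static class MimeWalker
{
    // Hypothetical helper: visits every leaf MimePart reachable from 'entity',
    // descending into multiparts, attached messages, and TNEF containers.
    public static void Walk (MimeEntity entity, Action<MimePart> onLeaf)
    {
        if (entity is Multipart multipart) {
            // A Multipart enumerates its child entities.
            foreach (var child in multipart)
                Walk (child, onLeaf);
        } else if (entity is MessagePart rfc822) {
            // message/rfc822: recurse into the body of the attached message, if any.
            if (rfc822.Message != null && rfc822.Message.Body != null)
                Walk (rfc822.Message.Body, onLeaf);
        } else if (entity is TnefPart tnef) {
            // application/ms-tnef: unpack the attachments and recurse.
            foreach (var attachment in tnef.ExtractAttachments ())
                Walk (attachment, onLeaf);
        } else if (entity is MimePart part) {
            // Any other leaf part (text, PDF, octet-stream, ...).
            onLeaf (part);
        }
    }
}
```

It could be called as MimeWalker.Walk (message.Body, part => Console.WriteLine (part.ContentType.MimeType)). Note that this sketch does not detect TNEF data labelled as application/octet-stream; that case still needs the new TnefPart () workaround shown earlier.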
Thanks, works a lot better! Although I'm getting an exception from MimeKit when I run my code. Am I handling ExtractAttachments() wrong? It also reads the attachment twice. Exception: A first chance exception of type 'System.IO.EndOfStreamException' occurred in MimeKit.dll
If TypeOf mime_part Is Tnef.TnefPart Then
    Dim tnef As Tnef.TnefPart = mime_part
    Dim parts As IEnumerable(Of MimeEntity) = tnef.ExtractAttachments()
    For Each node As MimeEntity In parts
        parse_mime(node, filepath)
    Next
Else
    'Because sometimes message attachments are hard to tell apart
    Try
        Dim tnef As Tnef.TnefPart = New Tnef.TnefPart()
        tnef.ContentObject = del.ContentObject
        Dim parts As IEnumerable(Of MimeEntity) = tnef.ExtractAttachments()
        For Each node As MimeEntity In parts
            parse_mime(node, filepath)
        Next
    Catch ex As Exception
    End Try
End If
Forgot to say that the exception doesn't actually break the program; it just shows up in the Immediate Window output.
testmail.txt: this is the test case I am using.
The type of the object returned by the ExtractAttachments() operations becomes "MimeKit.Tnef.TnefPart+
What is del? Why shouldn't it be mime_part?
If TypeOf mime_part Is Tnef.TnefPart Then
    Dim tnef As Tnef.TnefPart = mime_part
    Dim parts As IEnumerable(Of MimeEntity) = tnef.ExtractAttachments()
    For Each node As MimeEntity In parts
        parse_mime(node, filepath)
    Next
Else
    'Because sometimes message attachments are hard to tell apart
    Try
        Dim tnef As Tnef.TnefPart = New Tnef.TnefPart()
        tnef.ContentObject = mime_part.ContentObject
        Dim parts As IEnumerable(Of MimeEntity) = tnef.ExtractAttachments()
        For Each node As MimeEntity In parts
            parse_mime(node, filepath)
        Next
    Catch ex As Exception
    End Try
End If
Yes, it is mime_part in the code. It was originally not written in English, so I translated it before pasting it here so it wouldn't look like gibberish. I just missed renaming that one, sorry!
The exception inside MimeKit seems to always happen on the first iteration of the loop over tnef.ExtractAttachments().
I wrote a simple test program to print out the inner-most text/* parts from your sample message:
using System;
using System.Linq;
using MimeKit;
using MimeKit.Tnef;
namespace TnefTest
{
    class Program
    {
        public static void Main (string[] args)
        {
            var message = MimeMessage.Load ("testmail.txt");
            var tnef = message.BodyParts.OfType<TnefPart> ().FirstOrDefault ();
            foreach (var attachment in tnef.ExtractAttachments ()) {
                if (attachment is MimePart) {
                    var tnef2 = new TnefPart ();
                    tnef2.ContentObject = ((MimePart) attachment).ContentObject;
                    foreach (var attachment2 in tnef2.ExtractAttachments ()) {
                        var mime_part = attachment2 as MimePart;
                        var text = attachment2 as TextPart;
                        if (text != null) {
                            Console.WriteLine ("Content-Type: {0}", text.ContentType.MimeType);
                            Console.WriteLine (text.Text);
                        }
                    }
                }
            }
        }
    }
}
Here are the results (I did not get any exceptions):
Content-Type: text/plain
Content-Type: text/plain
Testing rtf
Content-Type: text/rtf
{\rtf1\ansi\ansicpg1252\fromtext \fbidis \deff0{\fonttbl
{\f0\fswiss Arial;}
{\f1\fmodern Courier New;}
{\f2\fnil\fcharset2 Symbol;}
{\f3\fmodern\fcharset0 Courier New;}}
{\colortbl\red0\green0\blue0;\red0\green0\blue255;}
\uc1\pard\plain\deftab360 \f0\fs20 Testing rtf\objattph \par
}
Content-Type: text/plain
Testing rtf
Content-Type: text/rtf
{\rtf1\ansi\ansicpg1252\fromtext \fbidis \deff0{\fonttbl
{\f0\fswiss Arial;}
{\f1\fmodern Courier New;}
{\f2\fnil\fcharset2 Symbol;}
{\f3\fmodern\fcharset0 Courier New;}}
{\colortbl\red0\green0\blue0;\red0\green0\blue255;}
\uc1\pard\plain\deftab360 \f0\fs20 Testing rtf\objattph \par
}
Hello @jstedfast,
I was trying to parse tnef body parts and came across the example above (testmail.txt). I see there is an email attached inline to the tnef body part, and I am trying to recurse down and parse it as a message. I believe I can use tnefPart.ConvertToMessage() for this. However, this method doesn't fail for TnefParts that aren't emails. What is the best way to determine if this part is actually an email and only then do this conversion?
Thanks in advance
The ConvertToMessage() method is meant to work for all TNEF data and MimeMessage is the closest data structure that MimeKit has that can represent most of the TNEF data that is available.
I see there is an email attached inline to the tnef body part, and I am trying to recurse down and parse it as a message ... What is the best way to determine if this part is actually an email and only then do this conversion?
I guess that depends on what you would consider to be "actually an email". TNEF attachments are never actually an email.
I would recommend taking a look at the ConvertToMessage() implementation in TnefPart.cs and deciding what TNEF attributes you'd consider indicate an "email" and then check for those.
Thanks. As I understand it, these attributes will only be available once I do the conversion. Is there a way to check beforehand? (I'm trying to optimize my logic as much as possible.)
If you use the TnefReader directly, you could check as it parses as opposed to calling ConvertToMessage().
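As a rough sketch of what that could look like, the helper below reads only the message-level attribute tags from a part's decoded content, without building a MimeMessage. TnefInspector and ReadMessageLevelTags are hypothetical names introduced for illustration. Which tags (if any) should count as indicating an "email" is left to the caller, since that depends on your own definition. The sketch uses ContentObject as elsewhere in this thread; newer MimeKit releases name that property Content.

```csharp
using System.Collections.Generic;
using MimeKit;
using MimeKit.Tnef;

static class TnefInspector
{
    // Hypothetical helper: collects the message-level TNEF attribute tags
    // found in 'part' so the caller can decide whether to convert it.
    public static HashSet<TnefAttributeTag> ReadMessageLevelTags (MimePart part)
    {
        var tags = new HashSet<TnefAttributeTag> ();

        // Open() returns the decoded (e.g. base64-decoded) content stream.
        using (var stream = part.ContentObject.Open ())
        using (var reader = new TnefReader (stream)) {
            while (reader.ReadNextAttribute ()) {
                // Stop once the attachment-level attributes begin.
                if (reader.AttributeLevel == TnefAttributeLevel.Attachment)
                    break;

                tags.Add (reader.AttributeTag);
            }
        }

        return tags;
    }
}
```

A caller might then test, for example, tags.Contains (TnefAttributeTag.RecipientTable) before calling ConvertToMessage(); treating that particular tag as a sign of an email is an assumption, not something MimeKit defines.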
I am writing a program that gathers all PDF files attached to emails, parsing them with MimeKit. I have run into a problem with RTF-formatted mails. I can get the different parts contained in the ms-tnef part fine, but when there is an email attachment inside, I can't find a way to handle it. In my case the email attachment is also in RTF format. It's also hard to identify that the attachment is, in fact, an email. Here is how the object looks at runtime:
{
Content-Type: application/octet-stream; name*=iso-8859-1''H%E4r%20%E4r%20pdf%20nr%202%20att%20skicka%20vidare
Content-Disposition: attachment; filename="Untitled Attachment"; modification-date="Thu, 28 Sep 2017 10:35:47 +0200"; size=25289
Content-Transfer-Encoding: base64
eJ8+Ii8IAQaQCAAEAAAAAAABAAEAAQ ... +g8BAAAA EAAAABJ4DEdIZ7NAn6T9kixhoFEDAP4PBwAAAAw0AAA=
}
MimeKit.MimePart: {
Content-Type: application/octet-stream; name*=iso-8859-1''H%E4r%20%E4r%20pdf%20nr%202%20att%20skicka%20vidare
Content-Disposition: attachment; filename="Untitled Attachment"; modification-date="Thu, 28 Sep 2017 10:35:47 +0200"; size=25289
Content-Transfer-Encoding: base64
eJ8+Ii8IAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADo ... +g8BAAAA EAAAABJ4DEdIZ7NAn6T9kixhoFEDAP4PBwAAAAw0AAA=
}
Sorry if I'm missing something!