jstedfast / MimeKit

A .NET MIME creation and parser library with support for S/MIME, PGP, DKIM, TNEF and Unix mbox spools.
http://www.mimekit.net
MIT License

Content-Disposition: attachment; filename= "*.xml" #463

Closed 70076541 closed 5 years ago

70076541 commented 5 years ago

Hello everyone. I just signed up, and I apologize if this question has already been asked, but I have not found a solution. I'm trying to develop a C# application that downloads electronic invoices from PEC mail using MimeKit 2.1.2, and I need to "download" the parts whose Content-Disposition is attachment,

for example:

Content-Type: application/octet-stream; name=IT02826010163_3Xl.xml
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename=IT02826010163_3Xl.xml

but I can only "download" files of type Content-Disposition: inline;

For example:

Content-Type: application/xml; name="daticert.xml"
Content-Disposition: inline; filename="daticert.xml"
Content-Transfer-Encoding: base64

and

Content-Type: message/rfc822; name="postacert.eml"
Content-Disposition: inline; filename="postacert.eml"
Content-Transfer-Encoding: 7bit

I tried several different statements.

For example:

var attachments = message.BodyParts.Where (x => x.ContentDisposition != null && x.ContentDisposition.FileName != null).ToList ();

or

var attachments = message.BodyParts.OfType<MimePart> ().Where (x => x.ContentDisposition != null && x.ContentDisposition.FileName == "IT02826010163_3Xl.xml").ToList ();

or

foreach (MimeEntity attachment in message.BodyParts.OfType<MimeEntity> ())
{
}

but none of them finds anything. What am I doing wrong?

jstedfast commented 5 years ago

The MimeMessage.Attachments property already gets all of the Content-Disposition: attachment body parts for you. There's no need to do anything else.

But if you want to use the BodyParts property instead, just do this:

var attachments = message.BodyParts.Where (x => x.IsAttachment).ToList ();

70076541 commented 5 years ago

Thank you for the quick reply, but with the statement you posted, var attachments = message.BodyParts.Where (x => x.IsAttachment).ToList ();, the result is 0. Also, if by the MimeMessage.Attachments property you mean

MimeMessage message = pop3.GetMessage (i);
foreach (var attachment in message.Attachments) {
}

I had already tried that as well, and the result does not change.

jstedfast commented 5 years ago

You'll have to give me sample messages and what results you expect so I can see what is going on. Most likely there is some sort of miscommunication.

70076541 commented 5 years ago

Attached you will find the email from which I would like to extract the .xml documents contained in the body. For obvious reasons I renamed the email to .txt

f7eab5fa-2d10-4775-8b4b-29cd70a332d9.eml.txt

jstedfast commented 5 years ago

The problem you are facing is that most of the XML "attachments" are part of a message/rfc822 attachment, so you need to use recursion to iterate over the attachments of the message/rfc822 attachment(s).

You can do that like this:

static IEnumerable<MimeEntity> GetXmlAttachments (IEnumerable<MimeEntity> bodyParts)
{
    foreach (var bodyPart in bodyParts) {
        var rfc822 = bodyPart as MessagePart;

        if (rfc822 != null) {
            foreach (var attachment in GetXmlAttachments (rfc822.Message.BodyParts))
                yield return attachment;
        } else {
            var fileName = bodyPart.ContentDisposition?.FileName;

            if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
                yield return bodyPart;
        }
    }
}

To use that method, you'd do something like this:

var attachments = GetXmlAttachments (message.BodyParts);

70076541 commented 5 years ago

I had already found something similar on the net:

foreach (var attachment in message.Attachments) {
    if (attachment is MessagePart) {
        var fileName = attachment.ContentDisposition?.FileName ??
            (attachment.ContentType.Name ?? "attached.eml");
        var rfc822 = (MessagePart) attachment;

        using (var stream = File.Create (fileName))
            rfc822.Message.WriteTo (stream);
    } else {
        var part = (MimePart) attachment;
        var fileName = part.FileName;

        using (var stream = File.Create (fileName))
            part.Content.DecodeTo (stream);
    }
}

but I had abandoned it because, in the statement

var fileName = attachment.ContentDisposition?.FileName

the compiler does not find .FileName

jstedfast commented 5 years ago

You might be using an old version of the C# compiler.

Try this instead:

static IEnumerable<MimeEntity> GetXmlAttachments (IEnumerable<MimeEntity> bodyParts)
{
    foreach (var bodyPart in bodyParts) {
        var rfc822 = bodyPart as MessagePart;

        if (rfc822 != null) {
            foreach (var attachment in GetXmlAttachments (rfc822.Message.BodyParts))
                yield return attachment;
        } else {
            var fileName = bodyPart.ContentDisposition != null ? bodyPart.ContentDisposition.FileName : null;

            if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
                yield return bodyPart;
        }
    }
}

70076541 commented 5 years ago

I'm working with Visual Studio 2013 because I have a DevExpress library that cannot go beyond that version. In an hour or so I'll be at my workstation and I'll run the test right away.

For now, thank you

70076541 commented 5 years ago

I tried the code you sent me. I can see that the variable receives the name of the XML document; however, when I create the file, it actually contains the original email and not the body of the XML document.

jstedfast commented 5 years ago

I suspect that you must have incorrectly copied my code?

70076541 commented 5 years ago

No, the code was copied correctly, but I forgot to change the saving code from message.WriteTo (); to attachment.WriteTo () or bodyPart.WriteTo ();. Thanks, now I finally have my XML file. I just have to clean it up and then I'll be finished.

70076541 commented 5 years ago

By any chance, does your library already have a way to remove the "3D" characters and the "=" symbol?

jstedfast commented 5 years ago

if you do bodyPart.Content.DecodeTo (stream);, MimeKit will decode that for you.

70076541 commented 5 years ago

Excellent, thank you. I probably still have a compiler version problem, though, because I cannot find .Content among the properties of bodyPart. Is there a way to write it differently?

jstedfast commented 5 years ago

Oops, sorry, I should have mentioned that you will need to cast that to a MimePart first:

((MimePart) bodyPart).Content.DecodeTo (stream);

The other option is to modify the GetXmlAttachments method to return IEnumerable<MimePart> to make things easier and more robust.
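
A minimal sketch of that second option, assuming the same recursion as the method above. The class name XmlAttachments and the SaveAll helper (including its targetDirectory parameter) are hypothetical names chosen for illustration and are not part of MimeKit. The code avoids the null-conditional operator so that it should also compile with older C# compilers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using MimeKit;

static class XmlAttachments
{
    // Yields only MimePart instances, so callers can use .Content without a cast.
    public static IEnumerable<MimePart> GetXmlAttachments (IEnumerable<MimeEntity> bodyParts)
    {
        foreach (var bodyPart in bodyParts) {
            var rfc822 = bodyPart as MessagePart;

            if (rfc822 != null) {
                // Descend into the attached message and look for XML parts there.
                if (rfc822.Message != null) {
                    foreach (var attachment in GetXmlAttachments (rfc822.Message.BodyParts))
                        yield return attachment;
                }
                continue;
            }

            // Anything that is not a MimePart has no decodable content; skip it.
            var part = bodyPart as MimePart;
            if (part == null || part.ContentDisposition == null)
                continue;

            var fileName = part.ContentDisposition.FileName;
            if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
                yield return part;
        }
    }

    // Hypothetical helper: decodes each XML attachment into targetDirectory.
    public static void SaveAll (MimeMessage message, string targetDirectory)
    {
        foreach (var part in GetXmlAttachments (message.BodyParts)) {
            // Path.GetFileName drops any directory components from the supplied name.
            var path = Path.Combine (targetDirectory, Path.GetFileName (part.ContentDisposition.FileName));

            using (var stream = File.Create (path))
                part.Content.DecodeTo (stream); // undoes base64 / quoted-printable
        }
    }
}
```

Because the method returns MimePart, the cast shown above is no longer needed at the call site, and non-MimePart entities are filtered out instead of causing an InvalidCastException.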

70076541 commented 5 years ago

No, you do not have to apologize; I'm the one who is being slow. But I still have a problem. The decoding now works well, but the decoded data is written to the stream in addition to the content already present in it. Is there an easy way to write everything on one line and get the result in a new stream, without having to resort to a separate function?

Not a real example:

stream2 = ((MimePart) bodyPart).Content.DecodeTo (stream1);

jstedfast commented 5 years ago

You could write a wrapper function to do that, but there's nothing in MimeKit that will do that.

public Stream GetDecodedContent (MimePart part)
{
    var stream = new MemoryStream ();
    part.Content.DecodeTo (stream);
    stream.Position = 0;
    return stream;
}

70076541 commented 5 years ago

Great, a thousand thanks. Without you I would still be stuck. Can I somehow make a donation?

jstedfast commented 5 years ago

Sure, you can make a donation here: https://www.paypal.com/pools/c/857bnxBTXg

Thanks!

70076541 commented 5 years ago

done!

70076541 commented 5 years ago

If I had to decrypt .xml documents with a .p7m signature, could I use your library's decryption?

jstedfast commented 5 years ago

Yep, you sure can :-)

Typically a .p7m attachment will have a Content-Type header value of application/pkcs7-mime which means that the MimeParser will use the MimeKit.Cryptography.ApplicationPkcs7Mime class for that attachment (which is a subclass of MimePart).

What you'll need to do is:

var p7m = part as ApplicationPkcs7Mime;

if (p7m != null) {
    MimeEntity original;

    // extract the original signed MIME part
    var signatures = p7m.Verify (out original);
}

What to do with the original MimeEntity once you have it depends on how the messages you are processing are structured. I think these are the most likely scenarios:

  1. The p7m contains only the xml attachment part, in which case you can just cast the original MimeEntity to a MimePart and decode the content (just like any of the other MimeParts).
  2. The p7m is the top-level message body (i.e. the entire message content has been signed) and thus the original MimeEntity is probably (at least in your cases) a Multipart.

If you modify the GetXmlAttachments() method to look something like this, I think it will handle both cases:

static IEnumerable<MimePart> GetXmlAttachments (MimeEntity entity)
{
    var p7m = entity as ApplicationPkcs7Mime;

    if (p7m != null && (p7m.SecureMimeType == SecureMimeType.SignedData || p7m.SecureMimeType == SecureMimeType.Unknown)) {
        MimeEntity original;

        // extract the signed content into the `original` mime part
        p7m.Verify (out original);

        // use the `original` mime part as the current entity
        entity = original;
    }

    var rfc822 = entity as MessagePart;
    var multipart = entity as Multipart;

    if (multipart != null) {
        foreach (var bodyPart in multipart) {
            foreach (var attachment in GetXmlAttachments (bodyPart))
                yield return attachment;
        }
    } else if (rfc822 != null) {
        foreach (var attachment in GetXmlAttachments (rfc822.Message.Body))
            yield return attachment;
    } else {
        var fileName = entity.ContentDisposition != null ? entity.ContentDisposition.FileName : null;

        if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
            yield return (MimePart) entity;
    }
}

And then, instead of calling it like this:

var attachments = GetXmlAttachments (message.BodyParts);

you'd call it like this:

var attachments = GetXmlAttachments (message.Body);

70076541 commented 5 years ago

I'm sorry, I had to leave my workstation. Thank you once again; I will try to put into practice everything you wrote.

70076541 commented 5 years ago

I cannot get the desired result. Whether I pass the Body or a body part to the GetXmlAttachments function, the variable var p7m = entity as ApplicationPkcs7Mime; is always null.

jstedfast commented 5 years ago

It sounds like the Content-Type does not match application/pkcs7-mime, then. What is it? Is it a child of a multipart/signed?

70076541 commented 5 years ago

This is how I find it, without -mime:

Content-Type: application/pkcs7; name=IT08806580968_H7RNF.xml.p7m
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=IT08806580968_H7RNF.xml.p7m
Content-ID: JNC1VYM1GXNKH746

jstedfast commented 5 years ago

I've never heard of the application/pkcs7 content-type. I suspect that the client sending these messages is not following the standards.

If you can't fix the sending client to use application/pkcs7-mime, you can probably do something like this instead?

static IEnumerable<MimePart> GetXmlAttachments (MimeEntity entity)
{
    var p7m = entity as ApplicationPkcs7Mime;

    if (p7m != null && (p7m.SecureMimeType == SecureMimeType.SignedData || p7m.SecureMimeType == SecureMimeType.Unknown)) {
        MimeEntity original;

        // extract the signed content into the `original` mime part
        p7m.Verify (out original);

        // use the `original` mime part as the current entity
        entity = original;
    }

    var rfc822 = entity as MessagePart;
    var multipart = entity as Multipart;

    if (multipart != null) {
        foreach (var bodyPart in multipart) {
            foreach (var attachment in GetXmlAttachments (bodyPart))
                yield return attachment;
        }
    } else if (rfc822 != null) {
        foreach (var attachment in GetXmlAttachments (rfc822.Message.Body))
            yield return attachment;
    } else if (entity.ContentType.MimeType.Equals ("application/pkcs7", StringComparison.OrdinalIgnoreCase)) {
        var content = GetDecodedContent ((MimePart) entity);

        // re-wrap the content as application/pkcs7-mime; named `signed` because `p7m` is already declared above
        var signed = new ApplicationPkcs7Mime (SecureMimeType.SignedData, content);

        foreach (var attachment in GetXmlAttachments (signed))
            yield return attachment;
    } else {
        var fileName = entity.ContentDisposition != null ? entity.ContentDisposition.FileName : null;

        if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
            yield return (MimePart) entity;
    }
}

This treats application/pkcs7 attachments as though they were application/pkcs7-mime.

70076541 commented 5 years ago

Unfortunately I do not have the background to discuss this, but if I understand you correctly, I should contact one of the best-known providers in Italy (www.legalmail.it) and tell them that they should fix this by changing Content-Type: application/pkcs7; to Content-Type: application/pkcs7-mime;. Is that correct? I am trying to use the extra part you sent me, but in

else if (entity.ContentType.MimeType.Equals ("application/pkcs7", StringComparison.OrdinalIgnoreCase))

I had to add an entity != null check:

if (entity != null && entity.ContentType.MimeType.Equals ("application/pkcs7", StringComparison.OrdinalIgnoreCase))

otherwise the iteration failed. In any case, even with the .Equals check, execution never enters that branch.

jstedfast commented 5 years ago

I don't know for sure that it's supposed to be application/pkcs7-mime, I just suspect that's what it is supposed to be. I don't know what else it could be.

70076541 commented 5 years ago

In a forum I found this discussion; can it help you help me?

In practice, the p7m files are seen by tb as "application/pkcs7", without the final "-mime", and treated as octet-stream

jstedfast commented 5 years ago

Treating it as application/octet-stream is a pretty reasonable solution, but it won't help you extract the content if it contains S/MIME-encapsulated data.

If you base64 decode the content, is it XML? Or is it binary data?

70076541 commented 5 years ago

It is XML

jstedfast commented 5 years ago

Oh, if it is just XML, then ignore the ApplicationPkcs7Mime stuff I mentioned above and just stick with the original solution I gave you: check for the appropriate FileName values and treat it like any other XML attachment.

70076541 commented 5 years ago

I do not understand. How can I treat it like a normal xml but at the same time be able to read the .p7m format?

jstedfast commented 5 years ago

if it is XML when you base64 decode the content, then it's not in p7m format.

70076541 commented 5 years ago

I probably misunderstood the question at first. Here is the part of the email body where the attachment appears:

Content-Type: application/pkcs7; name=IT08806580968_H7RNF.xml.p7m
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=IT08806580968_H7RNF.xml.p7m
Content-ID: JNC1VYM1GXNKH746

If I download this attachment, I have to rely on the Dike tool to remove the signature before I can read the XML document in plain text.

jstedfast commented 5 years ago

Okay, it seems like there is some confusion somewhere (maybe on my end?).

If the content is actual S/MIME p7m data, then the solution I posted above should work.

70076541 commented 5 years ago

Maybe I understand now.

The Equals check should not be placed as an else branch after rfc822:

else if (entity.ContentType.MimeType.Equals ("application/pkcs7", StringComparison.OrdinalIgnoreCase))

Instead, it must be placed before or inside it, because the .xml.p7m file comes from rfc822.Message.BodyParts.

Let me explain. During the loop, execution never enters this statement:

if (p7m != null && (p7m.SecureMimeType == SecureMimeType.SignedData || p7m.SecureMimeType == SecureMimeType.Unknown))

so

entity = original;

is never executed and the value is always null for me. As a result, nothing else you have posted works for me at the moment.

jstedfast commented 5 years ago

If original is turning out to be null, then it probably means the p7m is broken and not correctly encoded or something.

This is getting very confusing because I'm having to guess what you mean because I don't have a sample message. Are you sure that it is signed and not encrypted?

70076541 commented 5 years ago

attached you will find the original email

4c4bb6b5-ecb0-4a29-b84b-32a636602df1.eml.txt

jstedfast commented 5 years ago

This should work for you:

using System;
using System.IO;
using System.Linq;
using System.Collections.Generic;

using MimeKit;
using MimeKit.Cryptography;

namespace ExtractXmlAttachments {
    public class Program
    {
        public static void Main ()
        {
            var message = MimeMessage.Load ("4c4bb6b5-ecb0-4a29-b84b-32a636602df1.eml.txt");

            var attachments = GetXmlAttachments (message.Body);

            foreach (var attachment in attachments)
                Console.WriteLine ("FileName = {0}", attachment.ContentDisposition.FileName);
        }

        static Stream GetDecodedContent (MimePart part)
        {
            var stream = new MemoryStream ();
            part.Content.DecodeTo (stream);
            stream.Position = 0;
            return stream;
        }

        static IEnumerable<MimePart> GetXmlAttachments (MimeEntity entity)
        {
            var rfc822 = entity as MessagePart;
            var multipart = entity as Multipart;

            if (multipart != null) {
                foreach (var bodyPart in multipart) {
                    foreach (var attachment in GetXmlAttachments (bodyPart))
                        yield return attachment;
                }
            } else if (rfc822 != null) {
                foreach (var attachment in GetXmlAttachments (rfc822.Message.Body))
                    yield return attachment;
            } else if (entity.ContentType.MimeType.Equals ("application/pkcs7", StringComparison.OrdinalIgnoreCase)) {
                var signedData = GetDecodedContent ((MimePart) entity);

                using (var ctx = new TemporarySecureMimeContext ()) {
                    DigitalSignatureCollection signatures;

                    var content = ctx.Verify (signedData, out signatures);
                    var fileName = ((MimePart) entity).FileName;

                    // strip off the .p7m filename extension
                    fileName = fileName.Substring (0, fileName.Length - 4);

                    entity = new MimePart ("application", "octet-stream") {
                        Content = new MimeContent (content),
                        FileName = fileName
                    };
                }

                foreach (var attachment in GetXmlAttachments (entity))
                    yield return attachment;
            } else {
                var fileName = entity.ContentDisposition != null ? entity.ContentDisposition.FileName : null;

                if (fileName != null && fileName.EndsWith (".xml", StringComparison.OrdinalIgnoreCase))
                    yield return (MimePart) entity;
            }
        }
    }
}

70076541 commented 5 years ago

I'm sorry for my long absence; I was busy with the start-up of a new client. The code you gave me works well, but I would like to understand a few things.

First: when we download the unsigned XML documents and decode them, they are not "correctly" formatted once opened. All the line breaks are missing, so everything is written on a single line. With the xml.p7m documents, by contrast, the decoding is perfect, including the formatting.

Second: just out of curiosity, this approach does not download the postacert.xml attachment. I enclose three emails, two of which contain postacert.xml.

Third: this approach no longer downloads "smime.p7s".

Fourth: is it possible, with your library or others, to convert the .xml document to PDF?

8f4bd7ac-63f6-4302-90f5-54a301ef5e30.eml.txt 84afe65a-55d8-4b29-a5f8-12e84fad4508.eml.txt a0e59010-78cc-4d31-b467-7cb15ea586a5.eml.txt

jstedfast commented 5 years ago

  1. For text-based attachments, you may need to convert the line endings from UNIX to DOS, depending on whether they were encoded with CRLF or LF line endings at the source. You can use MimeKit.IO.Filters.Unix2DosFilter to do this:

     using (var filtered = new FilteredStream (output)) {
         filtered.Add (new Unix2DosFilter ());
         part.Content.DecodeTo (filtered);
         filtered.Flush ();
     }

  2. None of the messages contain a postacert.xml, but they do contain a postacert.eml. The reason you aren't getting the postacert.eml attachments back from the method I gave you above is that the code I wrote instead descends into the eml attachment to extract the xml attachments from within it.

  3. The only reason to care about the smime.p7s attachments at all is if you plan to verify the signature of the entire message, but even in that case, you don't actually care about the smime.p7s directly; what you want is the top-level multipart/signed part (which has the type MultipartSigned).

     The README has an example of how you could verify the multipart/signed part if you'd like to do that.

  4. I'm sure there exists another library that can convert xml to PDF, but my libraries don't support that themselves. I would try searching https://www.nuget.org to see if there is something there that can help you do that.
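
For the line-ending point in the list above, the snippet leaves the output stream undefined. A minimal sketch of a complete version follows, assuming the decoded attachment should be written straight to a file; the class name TextAttachmentWriter, the method name SaveWithDosLineEndings and the path parameter are hypothetical and not part of MimeKit.

```csharp
using System.IO;
using MimeKit;
using MimeKit.IO;
using MimeKit.IO.Filters;

static class TextAttachmentWriter
{
    // Decodes a text-based part to a file, converting bare LF line endings to CRLF.
    public static void SaveWithDosLineEndings (MimePart part, string path)
    {
        using (var output = File.Create (path))
        using (var filtered = new FilteredStream (output)) {
            filtered.Add (new Unix2DosFilter ());

            // DecodeTo removes the transfer encoding; the filter then rewrites line endings.
            part.Content.DecodeTo (filtered);

            // Push any data still buffered in the filter into the file.
            filtered.Flush ();
        }
    }
}
```

Note that the filter only converts line endings that exist in the decoded data. If the XML was generated without any line breaks at the source, it will still appear on a single line.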

70076541 commented 5 years ago

Oops! Sorry, I meant to say postacert.eml, and now it is clear why I do not see them. Excuse me, I was too hasty.

70076541 commented 5 years ago

On Stack Overflow I found this practical, fully working example. The iTextSharp library can be downloaded from NuGet, but I cannot work out how to pass my .xml file in place of the createXml function. Could you help me hook your code up to this method? I would be really grateful.

private XDocument createXml() {
    //Create our sample XML document
    var xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"));

    //Add our root node
    var root = new XElement("catalog");
    //All child nodes
    var nodeNames = new[] { "SR.No", "test", "code", "unit", "sampleid", "boreholeid", "pieceno" };
    XElement cd;

    //Create a bunch of <cd> items
    for (var i = 0; i < 1000; i++) {
        cd = new XElement("cd");
        foreach (var nn in nodeNames) {
            cd.Add(new XElement(nn) { Value = String.Format("{0}:{1}", nn, i.ToString()) });
        }
        root.Add(cd);
    }

    xml.Add(root);

    return xml;
}

private void doWork() {
    //Sample XML
    var xml = createXml();

    //File to write to
    var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");

    //Standard PDF creation, nothing special here
    using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
        using (var doc = new Document()) {
            using (var writer = PdfWriter.GetInstance(doc, fs)) {
                doc.Open();

                //Count the columns
                var columnCount = xml.Root.Elements("cd").First().Nodes().Count();

                //Create a table with one column for every child node of <cd>
                var t = new PdfPTable(columnCount);

                //Flag that the first row should be repeated on each page break
                t.HeaderRows = 1;

                //Loop through the first item to output column headers
                foreach (var N in xml.Root.Elements("cd").First().Elements()) {
                    t.AddCell(N.Name.ToString());
                }

                //Loop through each CD row (this is so we can call complete later on)
                foreach (var CD in xml.Root.Elements()) {
                    //Loop through each child of the current CD. Limit the number of children to our initial count just in case there are extra nodes.
                    foreach (var N in CD.Elements().Take(columnCount)) {
                        t.AddCell(N.Value);
                    }
                    //Just in case any rows have too few cells fill in any blanks
                    t.CompleteRow();
                }

                //Add the table to the document
                doc.Add(t);

                doc.Close();
            }
        }
    }
}

jstedfast commented 5 years ago

I don't understand what you want to do.

70076541 commented 5 years ago

We had agreed that my last step was to convert the .xml file to PDF. Browsing the net I found this piece of code, which works but builds its XML on the fly. I wanted to know if you had any idea how to use it with the whole .xml file obtained from your code, so that I can convert it to PDF.

jstedfast commented 5 years ago

Sure:

XDocument xml;

using (var stream = mime_part.Content.Open ())
    xml = XDocument.Load (stream);
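
Wrapped up as a self-contained helper, that might look like the sketch below. The class name XmlAttachmentLoader and the method name LoadXml are hypothetical; the MimeKit call used is the same Content.Open () shown above.

```csharp
using System.IO;
using System.Xml.Linq;
using MimeKit;

static class XmlAttachmentLoader
{
    // Parses the decoded content of an XML attachment into an XDocument.
    public static XDocument LoadXml (MimePart part)
    {
        // Content.Open () returns a stream over the decoded content,
        // so base64 or quoted-printable encoding is already removed.
        using (var stream = part.Content.Open ())
            return XDocument.Load (stream);
    }
}
```

The returned document could then take the place of the createXml () result in the PDF routine, provided that routine is adapted to the invoice's actual element names, since the sample code assumes a catalog root with cd children.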

70076541 commented 5 years ago

Perfect! As soon as I'm at my workstation I'll try it.