Open vkryukov opened 3 years ago
Hm, this seems to be allowed by the RFC and indicate that the PDF attachment should be displayed with the text of the message.
The RFCs allow very surprising things, such as multiple inline parts mixed together with a different MIME type each.
Thanks for your quick response!
I think this particular set of MIME headers is a violation of the standard, as e.g. RFC 2045 section 5 (Content-Type Header field) prescribes that
For this reason, registered subtypes of text, image, audio, and video should not contain embedded information that is really of a different type.
and RFC 2046 Section 3 says
text -- textual information. The subtype "plain" in particular indicates plain text containing no formatting commands or directives of any sort. Plain text is intended to be displayed "as-is". No special software is required to get the full meaning of the text, aside from support for the indicated character set.
Of course, we will always see a badly formatted email in the wild, and no library can expect to handle all of these cases - "you can't enumerate badness". I'm wondering what is the best way for a library like go-message
to support users who have to deal with all of this weirdness.
Here is one example: we can extend PartHeader
to include ContentType()
and ContentDisposition()
methods, as they can give additional useful information about the nature of the part if Content-Type
in the header is lying like in the example above. This is not necessary, of course - in the current implementation, the following works perfectly fine:
type header interface {
ContentType() (t string, params map[string]string, err error)
ContentDisposition() (t string, params map[string]string, err error)
}
h, ok := p.Header.(header)
if !ok {
// Impossible with the current implementation
}
but it would make writing the user code easier. Another good candidate would be Header.Fields()
to enumerate all the fields in the part header.
If you feel that this is unnecessary, please feel free to close this issue.
Oh. text/plain
with a PDF file, and all in a inline
part. Completely missed that, I thought the MIME type was correct.
FWIW, it's also possible for have non-textual parts with an inline
disposition. For instance Apple Mail will do this to embed images in a plain-text message: the message will contain interleaved text/plain
and image/*
parts, all of which are inline and wrapped in a multipart/mixed
part.
Regarding your suggestion, maybe we can just add these to the existing PartHeader
interface? I wonder how far we should go, there are a few more methods implemented by both types of headers.
Ah, this is exactly what you suggest. Yes, this sounds reasonable.
Cool - why don't I work on a pull request then? I will add ContentType
and ContentDisposition
for now.
Hello,
with the current implementation, interface
PartHeader
is implemented as eitherInlineHeader
orAttachmentHeader
that both embedmessage.Header
.And although it's not documented by the package, the wiki example implies thatPartHeader
will always be one of the two.The logic to decide whether its
AttachmentHeader
orPartHeader
,unfortunately, doesn't always work for emails in the wild. E.g., I have a message with the following part header:
It's clearly a PDF attachment, but is currently classified as
InlineHeader
, so we cannot useAttachmentHeader.Filename
to extract the filename.To solve this problem, the user code can of course cast
PartHeader
asHeader
(which will succeed with the current implementation), and then effectively re-implementAttachmentHeader.Filename
logic by perusingHeader.ContentDisposition
andHeader.ContentType
methods.The alternative would be, at a minimum, to extend
PartHeader
interface to directly exposeContentDisposition
andContentType
, which will help the user code that works with malformed parts. (If this indeed a desired direction, I'm happy to create a pull request).Another alternative would be to improve the Inline vs Attachment logic, but it will probably be brittle as you cannot enumerate badness in the wild.
Thoughts?