Closed mnako closed 3 weeks ago
Thank you for the review, @ValleyCrisps 🙇
@rorycl raised good points in the discussion here https://github.com/mnako/letters/discussions/64, where they mentioned that
[skipping body parsing] dropped my processing time to 1/10th of the time in comparison to full email parsing for about 1700 messages
and in one more place that I cannot find now (not sure if it was a PR comment that was deleted or if I lost track of one thread), where @rorycl said that
sometimes someone might want to check the header of the email with
ParseHeaders
and then dive into it withParseEmail
, in which case it might make sense to store the header info in the
I think it is a very good point.
This might be a minor point, but I still have some objections against a headersOnly
option, regardles of whether we implement that as a string arg, a config struct, or—I really like that idea as referenced by @rorycl—a self-referential function options.
Generally, I think that options of the form xOnly
, skipY
, overrideZ
point out design flaws and highlight places that are not flexible enough. I think that we should offer the option to parse headers only, but in a more coherent way, rather than framing it as an exception to the default flow.
My current thinking is that we should refactor ParseHeaders
and ParseBody
as methods on the Email
struct.
The default behaviour stays the same: the simplest use case is to parse an email completely by creating an Email instance from a reader:
email, err := letters.ParseEmail(r)
if err != nil {
log.Fatal(err)
}
but if somebody has a more advanced use case, we can enable them to create an empty Email and parse headers and bodies separately:
var e Email
err := e.ParseHeaders(r)
if err != nil {
// handle headers parser error
}
err = e.ParseBody(r)
if err != nil {
// handle body parsing error
}
That way,
e.ParseBody()
can be skipped, if needed, ParseX(r io.Reader)
developer-facing friendly API (at the cost of some additional complexities on our side, like perhaps handling or at least documenting the case of trying to read from the same stream twice);mail.ReadMessage()
;This is a pretty nitty discussion, but we have the luxury of taking the time to build a nice library, so let's try to get this nice and elegant.
Thoughts?
Thank you for the discussion, I wouldn't have thought of this approach on my own, and I'm learning a lot just by reviewing and reading your comments.
I like the consistency in the method parameters and the logic encapsulation in the proposed solution.
Since we are exploring alternatives, I'd like to hear your opinion on the following points:
ParseBody
. For example, are attachments included? ContentTypeHeader
is another grey area as it must be parsed even for email bodies. To be clear, my point here is about usability more than duplication. Good documentation helps a lot, but self-explanatory function names are even better.ParseBody
into text (Text, EnrichedText, HTML) and attachments (InlineFiles, AttachedFiles)? In a use-case where performance is a concern, I think skipping attachments would lead to the biggest improvement.Thank you for the review, @ValleyCrisps . Let me merge this first and make a release and we can explore parsing different types of bodies and parsing attachments in a separate PR.
In https://github.com/mnako/letters/pull/72 @rorycl makes a good point for parsing email headers only: the email bodies might be malformed or the user might be interested in headers only.
I do not think, however, that a string arg
letters.ParseEmail(r, "HeadersOnly")
is the best way to implement this feature. I believe that this should be at least a config struct for future maintainability, but for the time being there is a simpler solution: export theParseHeaders()
identifier so that users can use it directly.This PR renames
letters.parseHeaders()
toletters.ParseHeaders()
, adds a test, and updates README.md with an example.One point that might merit discussion, is whether the now public
ParseHeaders()
should not accept the same parameters asParseEmail
so that the user is not responsible for reading the message usingmail.ReadMessage()
and passingmsg.Header
toParseHeaders()
.