purebred-mua / purebred-email

A fast email parsing library implemented in Haskell
https://hackage.haskell.org/package/purebred-email
GNU Affero General Public License v3.0
23 stars 4 forks source link

Feature/optics and serialisation #15

Closed romanofski closed 5 years ago

romanofski commented 6 years ago

This is based on @frasertweedale's work including the serialisation of MIMEMessages

romanofski commented 6 years ago

@frasertweedale two remaining questions:

Before we merge this, we'll need an additional backlog item to implement transfer encoding for 8bit (quoted printable I guess)

frasertweedale commented 6 years ago

@romanofski re transfer encoding, here are my ideas:

First cut: scan the whole input, if we hit anything that requires transfer encoding, short circuit and use base64. Rationale: large attachments are likely to be binary and short circuit. Text messages are likely to be small so even though we may have to scan the whole thing to learn that it does not require transfer encoding, the cost is likely small.

Alternatively: two functions, one to add ByteString to the message and unconditionally use base64 transfer encoding, the other to add Text (or a ByteString supposed to be a text message) to the message and unconditionally use QP.

Down the track, if we want to be "clever", we can:

Rationale:

frasertweedale commented 6 years ago

I'll look at the cabal stuff soon!

romanofski commented 6 years ago

Hm.. I think my misunderstanding is rooted in the transfer encoding of the message header vs. entities. Default encoding is 7bit. If field values (e.g. from address) use a non-ascii character, I can switch the message tranfer encoding to 8bit, and simply encode the field value with QP. The message body can remain unencoded. The createAttachment function already sets a tranfer-encoding (base64) for the entity. I guess that doesn't leave anything unclear. Let me see if I can already incorporate in this patch ...

frasertweedale commented 6 years ago

@romanofski re failing tests, the travis script current creates source dist, then extracts it and builds & runs tests there. The test-vectors/ were not in sdist hence missing file. I've pushed a fix, waiting for CI now.

edit: and we're green.

frasertweedale commented 6 years ago

I've created lenses for From, To, Date etc. Yet, I can't set the Date header for example, if the structure is missing the key. Does that mean I need a Traversal instead of a Lens? These lenses use the Mailbox and Address instances which would enforce consistency with parsing. I'd be kind of interested to use them.

Yes, you need a Traversal, or alternatively a Lens Headers (Maybe Date) because of the possible absence of the Date header. But a Traversal will not allow you to set the header if it is missing in the map. So Lens Headers (Maybe Date) (i.e. use the At instance) looks like the way to go.

For the other From, To and Date header the absense of the header can be "papered over" by exposing an empty list.

frasertweedale commented 5 years ago

I'll re-review this whole commit tomorrow. I think we can merge it pretty soon hey. Then I'll start work on the encoded-word stuff. Would be good if we can find some new contributors to fix some of the new issues we filed, some of which (e.g. time, process execution) are fairly small and isolated.

romanofski commented 5 years ago

Yes they are. Cheers for reviewing. Looking over the haddock examples, do you think we can enable doctest and have the examples run as tests? I fear otherwise they might go stale... (If so happy to file a backlog item)

frasertweedale commented 5 years ago

@romanofski merged it, push a couple of cleanup commits and a bugfix, and I'm now working on encoded-word support for address headers.

romanofski commented 5 years ago

Sweet thank you! I'll update the purebred side and push again.