Closed frasertweedale closed 5 years ago
Assigned this to myself. As a noob myself on this, I guess I'll use this to document some of my findings and how I'm about to proceed. I guess a) to document for myself the way I approach it and b) a log for myself to learn from possible mistakes which will most likely happen.
I've got a property test to check that the round trip between parsing/serialisation works. This obviously fails, but I'm now a bit stuck in which way I should go. Most of the types in Data.MIME
are folds and getters. Rendering a MIMEMessage
would be way too low level, so we need something higher. The tricky bit I'm currently unsure of is: do we change the data type:
data MIME
= Part B.ByteString
| Multipart [MIMEMessage]
deriving (Show, Eq)
or do we change the Getter
s to... ??. My current thought would be to add a few more fields to the Part
constructor (encoding, content type, headers, filename?), yet the parser would need adjustment. That seems to be counter-intuitive as well, since speed would most likely take a hit, for not much gain (we're not constructing new mails, when parsing).
Maybe I need to sleep over it and think ...
@romanofski the MIME
data type is not intended to be used on its own, rather as the payload type of Message
, i.e. Message MIME
. The Message
carries all the headers for the MIME
payload which may be a single part or multipart.
So yeah, we need to render the MIMEMessage
. Why do you suggest that is too low-level?
AAahh I think I understand. My mistake was to expect the part itself will have to carry headers, which is not the case. For example, if I have a multipart mail, with a Content-Type: multipart/alternative, I would effectively have (hope I get it right, just for illustration):
MIME Headers (Multipart [(MIME Headers (Part <>) -- text/plain,
MIME Headers (Part <>) -- text/html])
@romanofski that's pretty much the shape of it.
I've been thinking and reading up on this a bit. My initial idea was to render whatever the MIMEMessage is composed of. The main motivation is to render the mail similar to the way we parse it.
Then use helper functions for composition: adding parts, attachments, etc. One huge drawback of this however is, that I need to basically pre-render parts of the message during composition. That is slow and cumbersome.
One other idea I had was to overwrite the constructors with newtype
in order to add additional information (e.g. attachment or inline, description, filename, yadda yadda), but dismissed the idea.
@romanofski what do you mean by "pre-render parts of the message"? Can you give an example of when/why this would be necessary?
Lets assume you add an attachment (bear with me, the interface might not be how I would end up implementing it):
addAttachment :: Text -- ^ content type
-> FilePath
-> MIMEMessage
-> IO Mail -- ^ IO for using the random number generator in order to compute boundary
By "pre-rendering" the mail I'm referring to the fact that I will have to construct a Part with a bytestring, which probably comes from a "readFile" invokation. If I attach a bigger file, wouldn't that readFile
take some time in order to construct the Part
with it's bytestring? Maybe I'm mistaken and the file is only read upon rendering the mail.
@romanofski no, you are correct. But, is this really too high a cost, to read a file of the filesystem? I think probably not. Keep it simple.
But if it turns out to be too expensive, there are a couple of ways we could address it. IMO let's cross that bridge if/when it matters, and just do the simple thing first.
Alright good to know that I gotten a bit too worried. The benefit of just rendering a MIMEMessage is, that in it's current form it's really simple.
I'd like to talk about how to avoid the IO
for MIME boundary randomness. IMO we should seed a RNG at program start time and put it in the AppState
. Essentially it can be an infinite list of random, valid characters. When we create the MIME boundary we can just pull n chars of the front of that, set the rest of of the list back into the AppState
, and there you go.
But just leave as IO for now I guess, we can refactor later.
@frasertweedale Yeah I'd like the idea of keeping the IO out of the serialisation.
One other aspect I thought of leaving out for now is handling multipart/alternative
. I think for now focusing on multipart/mixed
should be sufficient to get us going. Unless I'm missing a very valid use case for alternative representations of the same content, I would leave it for later.
Hm on a closer look this might be a no-brainer. I'm still figuring out which way to go:
Multipart
is constructed, which is added by a Content-type
header. This however has the downside, that I'll somewhat need to "extract" the boundary out of the header value. Not hard, but well... stringly typedItem 2) seems to indicate that I should go with 1). I currently have a form of 2) so lets give 1) a go. Should probably also take into account that the rendering doesn't generate the boundary, but it's passed into the function so we can get rid of the IO interaction.
I just found the ContentType
constructor. Maybe idea 1) isn't as dire in terms of typing than I thought ...
Alright I have a renderer, which if existing mails are parsed, spits them out identically. I'll see to add the "high level" API and driven by the demand we need for purebred.
On Wed, Aug 15, 2018 at 10:09:16PM -0700, Roman Joost wrote:
Alright I have a renderer, which if existing mails are parsed, spits them out identically. I'll see to add the "high level" API and driven by the demand we need for purebred.
Nice @romanofski. Happy to do a review of what ya got so far, if you want.
Cheers, but I just realised that I've forgot to properly fold field bodies. I also wanted to add a property test, which at this point is still failing :)
Bit stuck, shouldn't:
view (headers . contentDisposition . filename) $ Message [("Content-Disposition", "attachment; filename=foo.pdf")] (Part "foo")
return "foo.pdf"? Unless I've stuffed something up, it returns me an empty string or a Nothing
if I use preview
. I can see that it tries to give me a T.Text which it does through charset decoding? Does that mean for Attachments I always need to specify the encoding?
@romanofski looking into it.
I figured it out (or rather I think I do). So it tries to charset decode the file name, but for an attachment with a Content-Type: application/whatever
there is no charset specified. Which then ends up being a Nothing.
Maybe better to just return a Bytestring
?
No, we should return Text
in case it is encoded in some alternative charset. I think it's a regression from when I implemented parameter continuations. I'll have a fix soonish, hopefully.
No worries. I've written myself a little workaround. Will use that in the mean time and then use the proper fix. Cheers!
@romanofski how does https://github.com/purebred-mua/purebred-email/commit/d12a48ccf8cb4823783f5d0a8651acf45e0c5b03 go for you? If you like it you can push to master.
Nice. Works like a charm :) Push, push!
@romanofski pushed. I should really write some tests for this basic stuff.
@romanofski FYI, from our discussions yesterday: I'm about 1/2 of the way to having "writeable" optics all the way from message down to decoded (i.e. non-ASCII) filenames. The optics also support deleting. See https://github.com/purebred-mua/purebred-email/blob/c8380f43ed3abf911291837b62cbc118b97afdf9/tests/Headers.hs#L145-L164 for an example of what's possible now.
The stuff that's missing is:
Once that's done, we should be able to read and write deep down into the MIME data via the same optics. For example, drill down into the Content-Disposition filename parameter and drop leading path components ;)
I'm a bit stuck putting this together. My understanding that we should have everything in order to compose this correctly. In order to "loop" a function over every attachment, I was presuming I could use a traverse:
-- would unset all filenames in the contentdisposition header of attachments
setNewFilename mail = traverseOf (attachments . headers . contentDisposition . filenameParameter) Nothing m
but that gives me:
No instance for (Contravariant Maybe)
arising from a use of ‘attachments’
Using a fold:
λ: toListOf (attachments . headers . contentDisposition . filenameParameter) multiPartMail
[Just (ParameterValue Nothing Nothing "/bar/foo.bin")]
seems to work fine. I'm not sure what's exactly wrong or where my misunderstanding comes from.
The documentation for Traversal mentions that a Fold if f is Applicative and Contravariant. I went back to the attachments
function which is a Fold
. Is perhaps my example function above a more specific type signature?
Any pointers I could read/check?
@romanofski new commit incoming. Had to change attachments
to a Traversal
(was a Fold
).
See tests for example of how to do what you want.
Hey @frasertweedale cheers for this. The way I read the map on http://hackage.haskell.org/package/lens-4.17 is that a Fold
is also a Traversal
. Do I take the inheritence here a bit to literally? Does the Fold
produce a single value where as I really do need a Traversal
here, since I do want to apply a function to all attachments?
@romanofski https://github.com/purebred-mua/purebred-email/commit/482f5f5780880bd708f439859323a3b2fd9f3dc1.
You just use set
:
set (attachments . headers . contentDisposition . filenameParameter) Nothing msg
traverseOf
is definitely not what you are looking for.
@romanofski a Traversal
is a Fold
, not the other way (Fold
is read-only).
Traversals and folds both work with multiple targets.
set
with a traversal sets all targets (per "unset all attachment filenames" example in tests).
You can also apply a function over
all targets of a traversal.
In general it is easier to use the view
, preview
, set
, over
helper functions than use the optic directly.
I see. Seems like I did read the map totally wrong. I guess part of the reason I cant use set
on a Fold
is that its simply not defined. I could use view
though which would give me a monadic value. If you use a traversal, you can use set
because a traversal is also a Setter
. Does that sound right?
Yep, changing it to a Traversal
is all it needs and I get a passing build. I guess one last missing bit is implementing the function which passes the ByteString
on to sendmail
. I'll add that and see if it makes Travis happy.
Oh and btw. just in case: I'll wait for your commit, rebase and then push.
@romanofski the commit is https://github.com/purebred-mua/purebred-email/commit/482f5f5780880bd708f439859323a3b2fd9f3dc1
I've been fiddling with the API a bit more. More specificially I'd like to use the Mailbox
data types when constructing a MIMEMessage
. What I currently can't decide, but rather try to figure out if having a function which accepts the addresses (e.g. To
, From
, etc) or use convenient lenses to set the headers. At the moment I tend towards providing lenses to alter the MIMEMessage
, which thanks to your At
and Ix
instances(?) isn't very much.
In case you're opinionated on how the API should look let me know, otherwise I keep playing.
I think we want the optics for sure, and after that you will be able to decide whether a "convenience function" is desirable.
Hrmpf. I'm implementing the optic for setting the Date
field and thought this should work:
headerDate :: Lens' Headers UTCTime
Now I realise that it won't work without involving IO, because the time library uses IO to parse a given string.
I'm now thinking of using:
headerDate :: Lens' Headers ByteString
for now and simply deal with the conversion later. Surely the caller can figure out the current time involving IO by himself?
Found a way, even though not pretty:
headerDate :: Lens' Headers UTCTime
headerDate = lens getter setter
where
getter = parseTimeOrError True defaultTimeLocale dateFormat . C8.unpack . view (header "date")
setter hdrs x = set (header "date") (C8.pack $ formatTime defaultTimeLocale dateFormat x) hdrs
The parseTimeOrError
function throws an exception if it can't parse the date. I'd keep improving this as a backlog item for later.
Roll with it for now, we can refactor later.
I'll close this because the PR #15 was merged.
We have a use case for building emails and serialising them for transmission. Right now purebred is using a different library for that.
Things we need to include or handle: