Open brendanheywood opened 11 years ago
I had not thought of IMAP and cron, which pretty much means we have all the pieces of the puzzle except for time right now. We should start tracking growth in messaging, which could be a trigger to prioritise this work. In the past I have piped email straight into a process.
Yeah, the piping was my natural instinct too. It would still work fine, but I prefer the IMAP way because it means the whole web server can be down and email won't get dropped. The IMAP way was critical for the work project because most of the users of the software don't have access to their email server, so it has to work out of the box.
More notes:
Some gmail internals: http://support.google.com/mail/bin/answer.py?hl=en&ctx=mail&answer=1311182
Gmail - bulk senders guidelines https://support.google.com/mail/bin/answer.py?hl=en&answer=81126
Perl lib for signing emails http://search.cpan.org/~jaslong/Mail-DKIM-0.39/lib/Mail/DKIM.pm
Deb package: http://packages.debian.org/sid/libmail-dkim-perl
More general reading and advice: http://www.codinghorror.com/blog/2010/04/so-youd-like-to-send-some-email-through-code.html
Reverse PTR check - pretty sure this fails for our server but not exactly sure what things need to match up.
DNS lookup: www.thecrag.com -> 212.124.123.214
dig -x 212.124.123.214 - this fails
Using Online PTR lookup: http://emailtalk.org/PTR.aspx
www.thecrag.com -> 203.10.1.146 -> mulgara.westnet.com.au
I suspect that because mulgara != thecrag we fail the PTR check, but I'm not sure.
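To pin down what needs to match up, here is a minimal Python sketch of a forward-confirmed reverse DNS check. As far as I understand it, receivers mostly care that the sending IP has a PTR record and that the PTR name resolves back to the same IP, not that the PTR name equals the website's hostname, but treat that as an assumption to verify against the Gmail guidelines linked above. The function name `ptr_check` and the injectable `forward`/`reverse` resolvers are made up for this example.

```python
import socket


def ptr_check(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> PTR name, then confirm the PTR name maps back to the IP."""
    ip = forward(hostname)
    try:
        # gethostbyaddr returns (primary_name, aliases, addresses)
        ptr_name = reverse(ip)[0]
    except OSError:
        # No PTR record at all (what "dig -x" failing looks like)
        return {"ip": ip, "ptr": None, "confirmed": False}
    try:
        confirmed = forward(ptr_name) == ip
    except OSError:
        # The PTR name exists but does not resolve forward
        confirmed = False
    return {"ip": ip, "ptr": ptr_name, "confirmed": confirmed}
```

Calling `ptr_check("www.thecrag.com")` on the mail server would show which of the two lookups is the one that fails. Note it only compares the first A record, so a host with several addresses needs a fuller check.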
This one is dependent on #879 and is less important but very nice to have. For the 'involved' group messages to feel like a proper email group discussion, reply-to has to work.
We've discussed this a little and it was in the 'too hard, long way down the track' bucket, but I've since been involved with two projects at work that implement this and it's not as crazy to implement as it seems. I would originally have tackled this with custom hooks in Postfix etc. (how I had done stuff like this before), but the architecture below is way simpler and more robust.
When a message is sent
1) Create a normal email account somewhere, like chat@thecrag.com
2) Make sure whatever email server is behind the scenes allows email subaddressing: http://en.wikipedia.org/wiki/Email_address#Address_tags
3) Whenever an email is sent, make a hash of the message id, the user's id and some salt; we'll call this the reply-to id
4) Send the email with a reply-to of chat+5848236326447@thecrag.com
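A rough Python sketch of the reply-to id, under one assumption that differs from a plain hash: a one-way hash can't be turned back into the message id and user id later, so this version puts both ids in the tag in the clear and appends a truncated HMAC as the salted part. The alternative is to store the hash in a lookup table. `SECRET`, the tag layout and the function names are all made up for illustration.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side salt/secret, never sent out


def _sign(message_id, user_id):
    """Truncated HMAC-SHA256 over the two ids."""
    payload = f"{message_id}-{user_id}".encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()[:16]


def make_reply_to(message_id, user_id, mailbox="chat", domain="thecrag.com"):
    """Build a subaddressed reply-to like chat+<msg>-<user>-<sig>@thecrag.com."""
    tag = f"{message_id}-{user_id}-{_sign(message_id, user_id)}"
    return f"{mailbox}+{tag}@{domain}"


def parse_tag(tag):
    """Return (message_id, user_id) if the signature checks out, else None."""
    try:
        message_id, user_id, sig = tag.split("-")
        message_id, user_id = int(message_id), int(user_id)
    except ValueError:
        return None
    expected = _sign(message_id, user_id)
    # Constant-time comparison on bytes, so odd input cannot raise
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return message_id, user_id
```

Because the signature covers both ids, someone who edits the user id in the address ends up with a tag that `parse_tag` rejects.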
When a message is read in an email client (optional)
1) have a 1x1 pixel beacon image that marks the comment as read on the server
When someone replies to chat@thecrag.com it just gets stuck into the inbox
1) Don't need to do anything, but we could use a Postfix hook to fire off an event that kickstarts the next step instead of waiting for cron
Regularly, say every minute, in a cron job
1) connect to the mailbox via IMAP, do some process locking as needed
2) grab the oldest unread email and move it to a 'processing' email folder
3) retrieve the subaddressing tag
4) decrypt it to get the message id and user id
5) check the from email address against the user's email
6) check the message id and check permissions
7) parse the email, extract the interesting bits
8) push that into the messaging system
9) move the email message into a folder where it will eventually get deleted after some time
10) rinse and repeat
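The cron loop could look roughly like this in Python with the standard `imaplib`. It is a simplified sketch: it skips the locking and the intermediate 'processing' folder, takes an already logged-in connection, and hands steps 4) to 8) to a `handle` callback. The folder names `Processed` and `Failed-<ErrorClass>` are made up and assumed to already exist on the server.

```python
import email
from email.utils import parseaddr


def subaddress_tag(address):
    """Return the part after '+' in the local part, or None if there is none."""
    local = parseaddr(address)[1].partition("@")[0]
    return local.partition("+")[2] or None


def process_inbox(conn, handle, done="Processed", failed_prefix="Failed"):
    """Run every unread message through handle(tag, sender, msg), then file it away.

    conn is a logged-in imaplib.IMAP4 / IMAP4_SSL connection.
    """
    conn.select("INBOX")
    typ, data = conn.search(None, "UNSEEN")
    nums = data[0].split() if typ == "OK" and data and data[0] else []
    for num in nums:
        typ, parts = conn.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(parts[0][1])
        tag = subaddress_tag(msg.get("To", ""))
        sender = parseaddr(msg.get("From", ""))[1].lower()
        try:
            if tag is None:
                raise ValueError("no subaddress tag")
            handle(tag, sender, msg)  # steps 4) to 8) live in here
            target = done
        except Exception as exc:
            # One folder per class of error, for easy diagnostics
            target = f"{failed_prefix}-{type(exc).__name__}"
        conn.copy(num, target)
        conn.store(num, "+FLAGS", "\\Deleted")
    conn.expunge()  # once at the end, so sequence numbers stay valid in the loop
    return len(nums)
```

In a real run the connection would come from something like `imaplib.IMAP4_SSL(host)` followed by `login(...)`. Reading the tag from `To` is a guess; depending on the mail server a `Delivered-To` header may be the more reliable place to look.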
If at any point it fails, move the message into an IMAP folder, one for each class of error, for easy diagnostics. Server rules can be in place to empty these folders after a month or so.
If the from address doesn't match the user's email, they may have forwarded it to another email they own. So bounce something back to the user with a link to where they can link their second email address to their account. Because both the message id and the user id are stored in the reply hash, email clashes are handled. Because of the hashing, a forwarded email cannot be replied to by a 3rd party.
The only slightly hairy step is 7), and there are a bunch of good libraries out there that do exactly this. Because the processing is broken down into clear layers, the only step that actually needs to interface with our existing code is step 8), and this could even be done via the API, so we can use the best library around without needing to find a Perl one. The best one I know is GitHub's, and they have open sourced it!
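To give a feel for what step 7) involves, here is a deliberately naive Python sketch that keeps only the text the person actually typed. It is not the algorithm any of the libraries use, just an illustration of the idea: drop `>` quoted lines and stop at the first "On ... wrote:" header or `-- ` signature marker. The function name is made up.

```python
import re

# Matches the usual English quote header, e.g. "On Mon, 1 Jan 2013, Bob wrote:"
_QUOTE_HEADER = re.compile(r"^On .+ wrote:$")


def visible_reply(body):
    """Return the new text of a plain-text reply, without quotes or signature."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if _QUOTE_HEADER.match(stripped) or stripped == "--":
            break  # everything below is quoted thread or signature
        if stripped.startswith(">"):
            continue  # inline quoted line
        kept.append(line.rstrip())
    return "\n".join(kept).strip()
```

Real mail is much messier than this (localised quote headers, wrapped header lines, HTML-only bodies, mobile signatures), which is the argument for using a maintained library.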
So above is the minimal viable product. Some other niceties:
Libs:
Ruby: (made by github!!) https://github.com/github/email_reply_parser
PHP port: https://github.com/willdurand/EmailReplyParser
Python port: https://github.com/zapier/email-reply-parser
No Perl port I can find :(