Flolagale / mailin

Artisanal inbound emails for every web app
mailin.io
MIT License
1.95k stars 152 forks source link

Mailin Failing SPF and DKIM detection #54

Open hellsan631 opened 8 years ago

hellsan631 commented 8 years ago

Just moved over to v3.0.1 from v1.0.1 and mails I am sending are failing dkim/spf validation. Looking through the logs, I am getting some errors from the python scripts. (these emails shouldn't be failing dkim/spf)

2015-09-08T22:02:41.809Z - verbose: f0460130-046f-43a7-a8f0-55298773006b Validating dkim.
2015-09-08T22:02:41.848Z - verbose: f0460130-046f-43a7-a8f0-55298773006b Validating spf.
2015-09-08T22:02:41.868Z - verbose: f0460130-046f-43a7-a8f0-55298773006b Parsing email.
2015-09-08T22:02:42.326Z - verbose:  responseCode=0, responseMessage=EX_OK, isSpam=false, spamScore=0, baseSpamScore=5, report=[score=0.0, name=FREEMAIL_FROM, description=Sender email is commonly abused enduser mail provider (matkle414[at]gmail.com)
-0.0 NO_RELAYS              Informational, type=message, score=0.0, name=HTML_MESSAGE, description=BODY, type=HTML]
2015-09-08T22:02:42.345Z - verbose: [verifydkim.py] isDkimValid = False

2015-09-08T22:02:42.369Z - verbose: closed with return code 11
2015-09-08T22:02:42.538Z - verbose: [verifyspf.py] (none, )

2015-09-08T22:02:42.539Z - verbose: closed with return code 11

There is also an error with the mail path, (referencing the earlier issue #51 i brought us).

2015-09-08T22:02:42.612Z - error: f0460130-046f-43a7-a8f0-55298773006b Unable to finish processing message!! TypeError: path must be a string
  at TypeError (native)
  at Object.fs.unlink (fs.js:878:11)
  at Object.unlink 

It looks like the connection.mailPath is sent in as undefined. I will see to debugging a bit more to try and narrow down both problems.

hellsan631 commented 8 years ago

So I've spent the last few days trying to figure out the my DKIM woes.

Version 1.0.1 on npm (perhaps this is version 2) works for me, about 90%. However, there are some problems with the server randomly restarting when receiving certain emails from specific addresses. There are no errors given, and the server would randomly reboot.

2015-09-10T14:03:05.166Z - verbose: Parsing email.
2015-09-10T14:03:05.211Z - verbose: Detecting language.
2015-09-10T14:03:05.487Z - verbose: Potential languages: [ [ 'english', 0.12698888888888893 ],
  [ 'pidgin', 0.09700000000000009 ] ]
2015-09-10T14:03:05.499Z - info: End processing message.
2015-09-10T14:03:14.469Z - info: Log level set to verbose.     <-Restart

And the 'message' event never triggers. Upgrading to 3.0.1 didn't solve the problem, and only caused more problems (as listed above, DKIM and SPF fails).

So I forked the repo from 1.0.1 (which worked) and re-wrote everything to be more testable (creating callable functions so I can do external tests on all of the bits). I added a couple of test cases, after re-working in some promises to make things cleaner, I got everything (dkim, spf) working.

The problem of the restarts still occurred. So I set about updating all of the packages, and moving the server over to the smtp-server module that you are using on v3.0.1. (I copy pasted a bunch, rewrote a bunch, etc).

I figured out that the way the connection was being out was causing the problems with SPF, and spamd broke along the way. I changed out spam modules from spamc to node-spamd because the node-spamd has more consistent variables. (you don't just send it the raw email and hope), and also reformated connection to make it closer to the old version. There are tons of fall-backs for everything so any email should pass SPF/spamd.

Here is where the weirdness begins. I ran some debug tests on emails that I know pass, and they passed DKIM with flying colors. I thought, great! Push changes to git, and put them on my live linux box for testing. It still fails.

Maybe its file stream problem, so instead of piping the stream, I am writing chunks. Tests pass on local, push to live, Fail.

The only difference for DKIM processing at this point is that the file names are created by node-uuid instead of crypto. Revert that change back. Pass on local, fails on live.

The only difference between the successful DKIM validations i've gotten before, and the failing ones I get now, are the actual server.

At this point, the only thing i can think of is that might be a problem with the promises executing too early, and the file itself stays locked for when its put into a raw format, so it misses a character or something.

Does anyone have any insight into what might be going on? Input is 100% the same between working and failing. They are both using the same NodeJS version, on the same exact system. I can swap them out, but the outcome is the same. simplesmtp dkim works, smtp-server dkim fails.

hellsan631 commented 8 years ago

So, I figured it out.

If you are piping the stream, as it turns out, the stream is still left open, even when the pipe's end event is triggered.

var fileStream = fs.createWriteStream(connection.mailPath);

fileStream.on('finish', function () {
  //this finish event doesn't trigger until after end is called.
});

stream.pipe(fileStream);

stream.on('end', function(){
  //this end event doesn't actually close the write stream.
});

Also, if you are just writing the stream in data chunks, its still left open, and you need to close it, before sending it off to be read/parsed.

var fileStream = fs.createWriteStream(connection.mailPath);

fileStream.on('finish', function () {
  logger.verbose('Data Ready');
  _this.onDataReady(connection);
  callback();
});

stream.on('data', function (chunk) {
  _this.emit('data', connection, chunk);
  fileStream.write(chunk);
});

stream.on('end', function () {
  fileStream.end(); //the key is also closing the file stream.
});

Added this little bugger right here and DKIM begins to work again. Ill see to filing out a PR for this main tree that isn't from my crazy fork of debugging and re-writing everything. Now to just test that I can receive emails from those weird email addresses without crashing.

Flolagale commented 8 years ago

Awesome work! Indeed we moved to smtp-server to be able to use recent node versions, but while it solves problems on the smtp side, it introduced a lot of instability. I'll have a look at your rewrite while waiting for your pr! Thanks!

Flolagale commented 8 years ago

Hey @hellsan631, I like how you extracted all these actions inside MailinReport. It is much more maintainable and testable like this. Mailin really needs a hardening phase since version 3.0.0 and I would love to take this opportunity now. If I create a hardening branch, would you mind opening a PR directly to this branch? Then we could iterate on this PR until it is fully mergeable, becoming mailin 3.1.0.

hellsan631 commented 8 years ago

Sure. do you want my rewrite then? One of the things I was going to start focusing on (and yea, the reason why I extracted those actions) was to test everything thoroughly.