Improve memory usage of validator

Two things make the validator run out of memory on very large files:

The duplicate check. Right now it is a goodtables check, but it should be reimplemented in SQL, by loading the file into the database during the validation phase.
Large error lists. These are used as the asynchronous job result, which may be too big for Redis. They should probably be stored on S3 by the webapp worker instead, with the S3 path passed as the job result.
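
A rough sketch of both ideas, to make the proposal concrete. Everything here is an assumption rather than existing code: `find_duplicate_keys` and `store_errors` are hypothetical helper names, SQLite stands in for whatever database the validator would actually load into, and `s3_client` is assumed to be a boto3-style S3 client exposing `put_object`.

```python
import csv
import io
import json
import sqlite3


def find_duplicate_keys(conn, csv_file, key_columns, batch_size=10_000):
    """Stream a CSV into a SQL table and let the database find duplicate keys.

    Only `batch_size` rows are held in Python at a time. `key_columns` is
    interpolated into the SQL text, so it must come from trusted config.
    """
    quoted = ", ".join(f'"{col}"' for col in key_columns)
    column_defs = ", ".join(f'"{col}" TEXT' for col in key_columns)
    placeholders = ", ".join("?" for _ in key_columns)
    # Hypothetical scratch table name, used only for this check.
    conn.execute(f"CREATE TABLE upload_keys ({column_defs})")
    insert = f"INSERT INTO upload_keys VALUES ({placeholders})"

    batch = []
    for row in csv.DictReader(csv_file):
        batch.append([row[col] for col in key_columns])
        if len(batch) >= batch_size:
            conn.executemany(insert, batch)
            batch.clear()
    if batch:
        conn.executemany(insert, batch)

    # Each result row is (key values..., number of occurrences).
    query = (
        f"SELECT {quoted}, COUNT(*) FROM upload_keys "
        f"GROUP BY {quoted} HAVING COUNT(*) > 1 ORDER BY {quoted}"
    )
    return conn.execute(query).fetchall()


def store_errors(s3_client, bucket, key, errors):
    """Write the error list to S3 and return only its path as the job result."""
    body = json.dumps(errors).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return f"s3://{bucket}/{key}"


if __name__ == "__main__":
    # In-memory SQLite keeps the demo self-contained; a real run would use
    # an on-disk or server database so the rows stay out of process memory.
    demo_conn = sqlite3.connect(":memory:")
    demo_file = io.StringIO("id,name\n1,a\n2,b\n1,c\n")
    print(find_duplicate_keys(demo_conn, demo_file, ["id"]))
```

With this shape the worker would return the short string from `store_errors` as the job result, and the webapp would fetch the full error list from S3 only when it needs to display it.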

dssg / matching-tool