In this issue, we will create a 'database' of emails for testing. 'Database' belongs in quotes, of course, since no actual database is involved. Rather, the idea is simply to keep a set of emails under version control in this repository that we (and any developers who want to try the project out) can use for testing. What we want, for each source MIME type, is an email that contains an attachment of that type.
To an extent, this follows the paradigm we use on the UChicago Library Wagtail site, where the development database looks similar to the production database, except that instead of information about actual library staff, it holds information about fictional Star Trek characters. It is also notably smaller: rather than having a development database comparable in size to the production database, the goal is to have one fictional staff member represent every 'staff member possibility' our application logic is meant to cover. Similarly here, our testing database need not be large, but we'd like it to have at least one example of every type of email we expect to encounter in the wild. Then we'll have something to write much of our unit test suite against.
Constructing the Test Database
My recommendation is to base each test email on an actual email from one of our collections, but scrub all personal information from it. That is:
all email addresses should be changed to made-up email addresses
all IP addresses should be changed to made-up IP addresses
all sender and recipient names should be changed to fake names
the body of every email should be replaced with different text (as to what, exactly, feel free to follow your bliss)
every attachment in an email should be replaced with a different attachment of the same MIME type (so, for example, if you come across a .doc attachment, remove it from the email and replace it with a .doc of your own creation, containing whatever dummy text you see fit)
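As a rough illustration of the scrubbing steps above, here is a minimal sketch using Python's standard `email` package. It is not a prescribed approach: the function name `scrub`, the placeholder addresses, and the `dummy_attachments` mapping are all hypothetical. Note that it takes a shortcut by rebuilding a fresh message instead of editing the original in place, so headers such as `Received` (where IP addresses usually live) are dropped and not rewritten.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def scrub(raw: bytes, dummy_attachments: dict[str, tuple[str, bytes]]) -> EmailMessage:
    """Build a scrubbed stand-in for a real email (hypothetical helper).

    dummy_attachments maps a MIME type such as "application/msword" to a
    (filename, content) pair holding hand-made replacement content.
    """
    original = message_from_bytes(raw, policy=policy.default)

    # Start from an empty message so no original header can leak through.
    clean = EmailMessage()
    clean["From"] = "Fake Sender <sender@example.com>"      # made-up name and address
    clean["To"] = "Fake Recipient <recipient@example.com>"  # made-up name and address
    clean["Subject"] = "Test message"
    clean.set_content("Placeholder body text.\n")

    # Swap each attachment for a dummy of the same MIME type.
    for part in original.iter_attachments():
        ctype = part.get_content_type()
        # A KeyError here means no dummy exists yet for this type, which is
        # preferable to silently dropping the attachment.
        filename, content = dummy_attachments[ctype]
        maintype, subtype = ctype.split("/", 1)
        clean.add_attachment(content, maintype=maintype, subtype=subtype, filename=filename)

    return clean
```

If realistic transport headers matter for the tests, an in-place edit of the original message would be needed instead; this sketch deliberately favors not leaking anything over fidelity.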
The emails should be in individual files, in a subdirectory of the tests/ directory in the project. This could be tests/test_emails, or whatever the assignee would like to call it. I'll leave the exact naming scheme up to the assignee as well: it could be email1, email2, etc., or something else.
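Once the files exist, something like the following could help confirm that every source MIME type is covered. This is only a sketch: the helper name `attachment_types_by_file` is hypothetical, and it assumes each file in the directory holds exactly one raw RFC 822 message.

```python
from collections import defaultdict
from email import message_from_bytes, policy
from pathlib import Path


def attachment_types_by_file(email_dir: Path) -> dict[str, set[str]]:
    """Map each attachment MIME type to the names of the test emails containing it."""
    coverage: dict[str, set[str]] = defaultdict(set)
    for path in sorted(email_dir.iterdir()):
        if not path.is_file():
            continue
        # Assumption: one raw email message per file.
        msg = message_from_bytes(path.read_bytes(), policy=policy.default)
        for part in msg.iter_attachments():
            coverage[part.get_content_type()].add(path.name)
    return dict(coverage)
```

A unit test could then compare the keys of `attachment_types_by_file(Path("tests/test_emails"))` against the list of source MIME types the project is meant to handle, using whichever directory name the assignee settles on.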
Create Test Database of Fake Emails