Closed by blipblitz 9 years ago
Due to email being shitty, there's no perfect way to pull out the quoted text. Mailgun does a really good job a very large percentage of the time. I think the best we can do is some sort of monitor on communications getting too large, pruning them down before they get out of control. We can also limit the size of these fields, but it's hard to come up with a hard constraint on the longest message.
Let's set a max incoming size for saved communications of 5 MB (obviously this shouldn't apply to attachments, just the message body), and cut the message off if it's longer than that. Once the tasks system is up and running, we can create a task each time something hits that limit so it gets a second look.
If it's simpler to do it by character count, let's save 5,000,000 characters, which should be enough to cover it. We'll see how that works, and adjust up or down as needed.
@morisy 5 million still seems very large and will cause pages to crash. Remember this is per communication; a single request may have dozens of communications all at the max size. I'm thinking we may want something more along the lines of 100k... is there any legitimate reason to have a single communication be more than 100,000 characters?
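For reference, here's a rough sketch of what the character-count cutoff could look like. This is not the actual MuckRock code: `truncate_communication` and `MAX_COMMUNICATION_CHARS` are made-up names, and the 100k default is just the figure floated above. It returns a flag alongside the text so the caller could create a review task whenever something gets cut.

```python
# Hypothetical limit; the thread has not settled on a final number.
MAX_COMMUNICATION_CHARS = 100_000


def truncate_communication(body, max_chars=MAX_COMMUNICATION_CHARS):
    """Cut a communication body down to max_chars characters.

    Returns a (text, was_truncated) tuple so the caller can flag
    truncated communications for a second look.
    """
    if len(body) <= max_chars:
        return body, False
    return body[:max_chars], True


# Example: a 150k-character body gets cut to the limit and flagged.
text, was_truncated = truncate_communication("a" * 150_000)
```

Only the message body would go through this; attachments would be left alone.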
https://www.muckrock.com/admin/foia/foiarequest/1646/ is a legit request with a communication of length 141k, but it's a somewhat hacky use case.
MR1646 was a one-off experiment done years ago, and I wouldn't have a problem with it getting cut off if it means generally stronger site-wide stability.
This is related to #198.
https://www.muckrock.com/foi/united-states-of-america-10/fy2013-foia-log-bureau-of-prisons-8376/ This request is one of those that sometimes pulls in all of the quoted text and other times misses some of the appropriate text. It throws app errors when I try to delete the extra text from past communications.