flashmob / go-guerrilla

Mini SMTP server written in golang
MIT License

Easy way to decode Quoted-Printable encoding (and detect if it is needed) in backend #173

Closed lord-alfred closed 5 years ago

lord-alfred commented 5 years ago

Does anybody know an easy way to detect emails in Quoted-Printable encoding and decode them to strings?

I deployed guerrilla to production and receive all emails in quoted-printable 😄 Previously I used postfix with parsing in PHP.

PS: Or is it better not to decode all emails in guerrilla and to leave this job to the API (decoding only the requested emails)?

flashmob commented 5 years ago

For parsing the messages, this library looks good. https://godoc.org/github.com/jhillyerd/enmime

or another one here: https://godoc.org/github.com/emersion/go-message#example-Read
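
For simple single-part messages, the Go standard library may already be enough. The following is a minimal sketch, not code from go-guerrilla: the helper name `decodeBody` and the sample message are made up for illustration. It checks the `Content-Transfer-Encoding` header and decodes the body only when it says `quoted-printable`. Multipart messages and non-UTF-8 charsets are not handled here; that is where the libraries above help.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"net/mail"
	"strings"
)

// decodeBody (hypothetical helper) parses a raw single-part message and
// returns its body. The body is decoded only if the
// Content-Transfer-Encoding header is "quoted-printable"; otherwise it is
// returned unchanged.
func decodeBody(raw string) (string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", err
	}

	var r io.Reader = msg.Body
	cte := strings.ToLower(strings.TrimSpace(msg.Header.Get("Content-Transfer-Encoding")))
	if cte == "quoted-printable" {
		// Detection step: wrap the body in a decoding reader.
		r = quotedprintable.NewReader(msg.Body)
	}

	b, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Made-up sample message with a soft line break ("=\r\n") in the body.
	raw := "Subject: test\r\n" +
		"Content-Type: text/plain; charset=utf-8\r\n" +
		"Content-Transfer-Encoding: quoted-printable\r\n" +
		"\r\n" +
		"Caf=C3=A9 is =\r\nopen"

	body, err := decodeBody(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", body)
}
```

Because the decoder is a streaming `io.Reader`, the same check could be run lazily when a message is requested rather than at delivery time.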

lord-alfred commented 5 years ago

Thanks, I think the best way is to parse mails in the API (and cache them if needed), not in the guerrilla backend, because that would create unnecessary load (I get more than 50k emails per night now).

flashmob commented 5 years ago

Yes, sounds like a good way to proceed.

50k? That's quite a lot!

Currently, in production on GuerrillaMail, this software is snapping up 150k per hour. Therefore, only the emails that are about to land in active inboxes are parsed, which is a small subset.

lord-alfred commented 5 years ago

50k? That's quite a lot!

I have a personal server for receiving mail, not a public service like yours. And this is the night load; during the day it is several times lower. At night, we receive newsletters from several services where I have registered more than 50 thousand accounts. 😄 As I wrote somewhere in the issues, postfix could not cope with the load and crashed, which is why I am using this package now. 👍

Currently, in production on GuerrillaMail, this software is snapping up 150k per hour

Wow! This is a very large volume! 👍 If it’s no secret, what server hardware is currently being used? I am currently using a cloud-based VPS with these parameters: https://i.imgur.com/ZB7KhBQ.png. This VPS costs about $10 per month.

flashmob commented 5 years ago

Currently using a bare metal server from OVH with 128GB of RAM. So the cost is much higher than a small VPS.

The initial emails are placed in RAM (using the Redis & MySQL backend), and it is decided later whether they are to be persisted on SSD or not. The majority are not. Keeping them in RAM makes it super fast. Actually, the new_mail table is using the MEMORY engine. Of course, the mail is lost if power is lost, but that rarely happens, if ever, and a little loss can be tolerated sometimes. If the server does need to be rebooted, there is a script that saves the data and restores it on boot.

lord-alfred commented 5 years ago

Awesome hardware! 🙈

I considered using Redis to store emails, but I abandoned this idea because all emails would then be stored in memory. So I only use MySQL (InnoDB engine) for now, but I plan to move to PostgreSQL in the future. Earlier, the mysql process was often killed by the OOM killer. This is because I constantly write messages to the database and delete them after half an hour, but indexes and other data are not cleaned up automatically. Because of this, I have to run a script from crontab every morning that optimizes the messages table. It helps, but it’s not at all an ideal solution and I don’t like it.

In general, MySQL (with the InnoDB engine) is not intended for such use (constant inserting and deleting). I looked for databases suited to such tasks, but all of them stored data in memory (like Redis), which makes the server expensive. So for now I'm leaning towards PostgreSQL (with PgBouncer) as the best solution (but, of course, there may be other problems).

flashmob commented 5 years ago

Understood. Unfortunately, once you start going to disk, you take a large performance hit, especially on mechanical disks.

Yes, the index rebuild is something to keep in mind. That's why the insert statements are batched and multiple rows are inserted in one query. You could try to experiment by having more rows per batch.
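
To illustrate the batching idea, here is a minimal sketch of building a single multi-row INSERT statement. It is not the actual go-guerrilla backend code: the helper name `buildBatchInsert` and the column names `recipient` and `body` are hypothetical, and the table and column names are assumed to come from trusted configuration since they are interpolated into the SQL text.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBatchInsert (hypothetical helper) returns one INSERT statement with
// "?" placeholders for `rows` rows, so that many messages can be written in
// a single query. It returns "" if there is nothing to insert.
func buildBatchInsert(table string, columns []string, rows int) string {
	if rows < 1 || len(columns) == 0 {
		return ""
	}

	// One "(?, ?, ...)" group per row, with one placeholder per column.
	group := "(" + strings.TrimSuffix(strings.Repeat("?, ", len(columns)), ", ") + ")"
	groups := make([]string, rows)
	for i := range groups {
		groups[i] = group
	}

	return fmt.Sprintf("INSERT INTO %s (%s) VALUES %s",
		table, strings.Join(columns, ", "), strings.Join(groups, ", "))
}

func main() {
	// Column names are made up for illustration.
	fmt.Println(buildBatchInsert("new_mail", []string{"recipient", "body"}, 3))
}
```

The resulting string would be passed to the database driver together with a flat slice of `rows * len(columns)` values. A larger `rows` means fewer round trips, at the cost of holding more messages in memory before each flush.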
