emencia / emencia-django-newsletter

An app for sending newsletter by email to a contact list.
189 stars 72 forks source link

Added send_newsletter_continuous command and its implementation #53

Closed alexgarel closed 12 years ago

alexgarel commented 12 years ago

permitting a better handling of mass mailing. See https://github.com/Fantomas42/emencia-django-newsletter/issues/36

danielsokolowski commented 12 years ago

Alex can you be so kind and run your implementation in plain English for me? I am not seeing how it works.

I am concern what happens when SMTPMailer code is called again --- does it pause the previous threads? Also where is expedition() defined in line 344 sending[expedition.id] = expedition().

alexgarel commented 12 years ago

Hello danois,

I'll try. I don't really how much of python you know, so I make not too much assumptions. Hope this will be clear enough.

Main command

What the command does is that it starts a thread for each smtp services. If you kill the app, all thread receive an event which tell them to stop, and so finish nicely (we join them before exiting).

It is safe to have separated thread as they do not share anything among them.

SendMailer

In each thread, we first collect all newsletters which use our services (candidates).

We maintain a list of NL ready to be sent (among candidates) in sending. in candidates and sending though we do not have the database NL  but a NewsLetterExpedition instance linked to this NL.

Then we use roundrobin to send one message from each NL in sending and start again.

Between two messages we wait a bit until we filled the minimum delay to be sure not to be too fast (but the delay taken to send the message is accounted so we are not beind schedule).

If the sending list is empty, we wait for a longer time, and reinit the vars taking care about scheduling (so they do not grow indefinitly).

Note that waiting, is not done with time.sleep as we eventually wait for a stopping signal that may come at any moment so we use stop_event.wait(sometime)

NewsLetterExpedition

In order to reuse the existing code which was NL centric I use an object responsible of sending a particular NL (and keeping a state about where we are in this expedition). This is a coroutine (see http://www.python.org/dev/peps/pep-0342/).

Coroutines are simply process cooperating together (not os process though) with a predictable path which is always the same.

    SMTPSender                 |  |      NewsLetterExpedition  
manage smtp connection         |  |
     (and wait) ----------------------> create the message
         ^                     |  |         |
         |  send the message <--------------+
         |              |      |  |
         |              +--------------> compute the status for contact
         |                     |  |         |
         +----------------------------------+
                               |  |

When I init a NewsLetterExpedition to put it in sending, I use expedition() which simply calls the __call__ method. This method will yield (instead of returning, that is send back a value, but ready to continue at same point of execution, hence all state is conserved). So in the diagram above, arrow from right to left are those yield (NewsLetterExpedition yields to SMTPSender), while arrow from left to right are next or throw invocations (to send an exception) (SMTPSender send to NewsLetterExpedition).

If you follow me so far you may have understood that while in candidates I have a NewsLetterExpedition instance, in sending list I have a running process of a NewsLetterExpedition instance (an iterator/coroutine).

Of course SMTPSender just not only talk with one NewsLetterExpedition at a time but with all in sending list.

When a NewsLetterExpedition finished (loop over expedition list finished) it will throw an IterationStop exception so we catch this in SMTPSender and just remove it from sending list.

Concurrency

By the way, I did not put any lock against concurrent run, that means you have to ensure that this script only run once at a time. Cron is not really the good tool there you may better run it with nohup, and use monit or upstart or something like that.

Fantomas42 commented 12 years ago

Amazing !

Thank you very much alexgarel.

danielsokolowski commented 12 years ago

Has anyone tested the above code? While it seems impressive (good job alexgarel) I fail to see the benefits over the current implementation; especially if the spacing out of sends and a check if a newsletter is to be sent on each iteration to avoid duplication is added to the original.

This implementation is not cron friendly as you can not run more than one instance of it, yes? Further more it continuously runs wheres the original implementation runs only when needed --- it checks and exits. Also what is the benefit of the threads ? We are still utilizing one server per newsletter, and it's not like multiple sites are services by one 'SMTP Sender' yes?

In a nut shell, I think merging this in was premature, further more now having 'send_newsletter' and 'send_newsletter_continuous' is a mistake and just plain confusing.

Let's have a proper discussion: I would like to see: cron safe based solution, with proper logging support, and only 'send_newsletter' manage option.

alexgarel commented 12 years ago

Hello Danois,

I experiment the problems of the cron based sender while sending a large internal newsletter to employee of a big company. They where picky about the quota / hours.

Please re-read how you will necessarily have a shift over time, putting you behind schedule while using a cron approach, at https://github.com/Fantomas42/emencia-django-newsletter/issues/36

More over continuous sending does a better job at not charging too much equipments as it better smooth load over time.

More over the continuous mode could easily have a parameter to specify which smtp to take in charge permitting to have one process / smtp server if you want. But I don't think it's important unless you have a lot of smtp servers as network latency may be the more limiting parameter (and not CPU power).

I've only done simple testing no current implementation. I did not automate testing for now, has it would be quite a long job.

I will have a production use of continuous send next month (same job as last year), I may report here if it really works well or not.

danielsokolowski commented 12 years ago

Hi Alex, ok please do post back here your results.

I fail to see how you will get a time shift though, if you are running the cron job 4 times per hour you would get duplicate emails being sent out. When the newsletter expedition list is still being sent and cron job gets called, then your new expedition list will contain those unsent emails in the first batch; so those would be sent twice. If you are running the cron job once an hour then you would get at most the max limit sent in that hour.

In the end I would like to just see one implementation of send_newsletter in emencia-newsletter app. Adding 'send_newsletter' manape.py command is easy and straight forward, if we were to use the continuous implementation can it be made to automatically start whenever the app gets loaded, specified in the init of the module?

alexgarel commented 12 years ago

Hello as I promised I give feed back on my production use case.

I can say that it really all went like a breeze. All my scheduled where precisely respected with email sent regularly. I also was able to stop in the middle and restart without having any problem.

So I really advocate for send_continuous.

I think the previous method (send) could be renamed as send_hourly but it:

danielsokolowski commented 12 years ago

I agree, @alexgarel , personally if backward compatibility is an issue let's add a 3 month depreciation warning. Thank you for getting back and reporting. How did you run the command. nohup? Do you think it's possible to keep simple and 'create' a thread and start send_newsletter_continuous automatically whenever the site get's started.

@Fantomas42 would you please consider phasing out the old send_newsletter, what are your thoughts.

alexgarel commented 12 years ago

@danielsokolowski

Do you think it's possible to keep simple and 'create' a thread and start send_newsletter_continuous automatically whenever the site get's started.

Hum, I'm not a Django expert so I can't tell you, but maybe @Fantomas42 will give a better answer. However I'd say having it as a separate process maybe safer for monitoring.

Fantomas42 commented 12 years ago

@danielsokolowski

Do you think it's possible to keep simple and 'create' a thread and start send_newsletter_continuous automatically whenever the site get's started.

In one word: Argh ! :bomb:

Like @alexgarel says separate processes are safer, and more in the Unix way.

Having the two commands is a good solution for me, specialy for backwards compatibilites.

Regards.

danielsokolowski commented 12 years ago

@Fache, very valid about the compatibility issues, but let’s consider adding a depreciation warning and say it will be depreciated in 3 months? Having two commands doing the same thing but one that is half working and one that is superior and works makes no sense. After all the reason this new sending version came along because the original was unreliable.

@alexgarel if I wanted to use your implementation how would I start it since I can’t use cron? - How do you keep it running on your systems.

On 03/05/2012 11:38, Fache Julien wrote:

@danielsokolowski

Do you think it's possible to keep simple and 'create' a thread and start send_newsletter_continuous automatically whenever the site get's started.

In one word: Argh ! :bomb:

Like @alexgarel says separate processes are safer, and more in the Unix way.

Having the two commands is a good solution for me, specialy for backwards compatibilites.

Regards.


Reply to this email directly or view it on GitHub: https://github.com/Fantomas42/emencia-django-newsletter/pull/53#issuecomment-5490540

Daniel Sokolowski Web Engineer Danols Web Engineering http://webdesign.danols.com/

alexgarel commented 12 years ago

@danielsokolowski

Indeed as I was running batches on small period, I simply run it in a screen. However to do it on a production server I would use something like :

start-stop-daemon --chroot /path/to/command/ --background --pid ednl.pid --user ednl --$CMD prod.py send_newsletter_continuous

where $CMD might be start or stop

And then I will monitor it using monit (but it could be upstart or systemd afaik).