Open owais opened 9 years ago
I agree. At the moment however the features are changing almost daily, so I'll wait a little before I commit to the docs. It's barely been a month since I started this. We can use this issue to gather bullet points and discuss/improve/describe them until then.
Django native
Aims to re-use as much of Django's own modules as possible, and the whole path can be debugged inside the Django environment, i.e. each cluster has a copy of your Django project, making development much easier. It also leverages the benefits of existing Django apps like django-redis.
(I could have probably made this a standalone python project, but I wanted it to be fully integrated with Django. This is key)
Easy setup
Needs no special steps to work with Django. Creates and manages its own cluster which negates the need for setting up worker clusters with Supervisor, Circus, Honcho or Foreman.
Although the worker itself is probably marginally faster, Django Q creates and manages a cluster of them. Since the cluster already contains a copy of your Django environment, there is no need to create a new environment for each task execution.
I'm a long-time Celery user and now seriously considering giving Django Q a try, just waiting for the right moment but some comparison documentation would be great indeed! :+1:
I'll see what I can come up with this weekend. Meanwhile, are there any comparison questions that spring to mind?
I really think your project has a lot of potential and that part of the documentation is essential, so it's great to see that you're willing to write about it!
The points you showed above are the most important IMHO, I'd just give more insight about them. It may be interesting to also get some real benchmarks. Broker support comparison could be useful as well (AFAIK Django Q only supports Redis). A security section would be also great (you already mention that it uses encryption so it could be worth mentioning it in this section).
Those are some quick ideas, but one can probably gather more just by looking at the Celery docs.
The problem I have is that Celery has amassed such an immense number of features over the years that I'm not sure which ones I should be comparing with. This project focuses on integration with Django, so it would be a subset anyway. What I really need from you, the potential users, is a list of the things you love and hate about using Celery with Django. We can then make a better case for its use and add some features. Just keep in mind that I never wrote this as a Celery competitor. I wrote this to make async tasks easier in Django projects.
Django Q's PyPI source is 21 KB; Celery's is 1 MB. I'm sure I'm missing some features.
Redis is used solely for two things:
Nothing more.
Memcached is not very good at the first one. RabbitMQ is pretty slow at this and would add a lot of overhead, plus it's not something most people have in their stack anyway.
This leads me to the security part. Tasks (and statistics) are first serialized with Pickle into a bytestring and then signed with Django's signing module. This basically creates a checksum signature of the pickled task, which is then hashed together with the task using your project's secret key. When a worker pulls a task from the queue, it first unpacks the package and compares the checksum with the serialized content. So not only is it quite hard for a potential attacker to read the data in a package, but any tampering with the string would also be detected.
The disadvantage (or advantage) of this is that the task data on the Redis server doesn't make any sense to any other software. It is a closed loop. So my question would be: what advantages would multiple broker support give you, other than the convenience of existing infrastructure?
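The sign-then-verify idea described above can be sketched with the standard library alone. This is a simplified stand-in for Django's signing module, not django-q's actual code; `SECRET_KEY`, `pack` and `unpack` are illustrative names:

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"not-a-real-secret"  # stands in for your project's SECRET_KEY

def pack(task):
    """Pickle the task, then prepend an HMAC of the bytes keyed on SECRET_KEY."""
    payload = pickle.dumps(task)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return signature + payload

def unpack(package):
    """Verify the signature before unpickling; reject any tampered package."""
    signature, payload = package[:32], package[32:]  # SHA-256 digest is 32 bytes
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("task package failed signature check")
    return pickle.loads(payload)
```

Because the payload is never unpickled until the signature checks out, a broker compromise can corrupt tasks but not inject ones that the workers will execute.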
I started a thread on the Django subreddit which will hopefully lead to a more comprehensive comparison.
Integration with Django
Some people seem to think it integrates so well you really don't need Django-Celery. Other people depend on this package's features, but worry it isn't developed anymore. Django Q is first and foremost a Django app. This is what I've got so far:
Celery beat
Having to run a separate process for schedules is experienced as a nuisance. Django Q integrates this into each cluster.
Amount of configuration options
Some people find this a positive, others are confused by them.
Amount of configuration needed
Celery doesn't run on default options; see the previous point. Django Q should work without configuration, except for the location of your Redis server if it's not local.
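As an illustration of how little that amounts to, a minimal `Q_CLUSTER` setting looks something like this (the host name is just an example; everything else falls back to defaults):

```python
# settings.py -- minimal django-q configuration
Q_CLUSTER = {
    "name": "myproject",
    "redis": {"host": "redis.example.com", "port": 6379, "db": 0},
}
```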
Multiple broker choices
People seem to love to be able to run the broker of their choice. In response to this I'm currently developing a flexible broker backend and I've added a Disque and a SafeRedisQueue broker to complement the existing Redis broker. Future brokers will include Django ORM, Postgresql, IronMQ and others.
Reliable, proven
Can't argue with this. Give me a year.
Dependent tasks / Workflows
General consensus seems to be that it's a nice feature, but poorly implemented. This can be simulated in Django Q with result hooks and groups, but might need some love.
Monitoring
The majority rates the monitoring as somewhere between barely sufficient and insufficient. I'm not sure Django Q does this any better at the moment. Warrants investigation.
Is there someone who wants to do a benchmark comparison? I feel I'm not knowledgeable enough about Celery to do it justice in a benchmark. Also I might not be perceived as impartial.
I've been doing my own performance tests with the Parzen Async example code, but I have no idea how to replicate this in Celery.
Another test I often run is:
```python
from django.contrib.auth.models import User
from django_q.tasks import async

def countdown(n):
    while n > 0:
        n -= 1

def get_username(user):
    return user.username

def qtest():
    u = User.objects.first()
    for i in range(500):
        async(countdown, 10000 * i, save=False)
        async(get_username, u, save=False)
```
This one is simple enough and puts a nice bit of strain on the workers, broker and Django backend.
So now we have 5 dedicated broker types, plus support for several database brokers via Django ORM. Not via AMQP simulation, but direct dedicated support.
Another difference I spotted is Django Q's ability to execute any Python or third-party library function directly, without decorators or pre-loading. This makes it very easy to execute shell commands, for example.
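The mechanism behind this is ordinary dotted-path resolution, which can be sketched stand-alone (`resolve_and_call` is a hypothetical helper for illustration, not django-q code):

```python
import importlib

def resolve_and_call(dotted_path, *args, **kwargs):
    """Import 'package.module.function' at runtime and call it.

    This is why a queue that accepts plain dotted paths needs no decorators
    or pre-registration: any importable callable will do.
    """
    module_path, func_name = dotted_path.rsplit(".", 1)
    func = getattr(importlib.import_module(module_path), func_name)
    return func(*args, **kwargs)
```

In django-q itself the same idea means you can queue something like `async('subprocess.call', ['du', '-h'])` without touching the target code at all.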
IMHO this issue can be closed now that we have a good comparison that can be included somewhere more visible, which will definitely help a lot to decide what to use for newcomers (myself included). Great work!
@Koed00 hi! Thanks for the great lib!
Is django-q production-ready?
@DataGreed currently I'm using django-q in production for 2 projects.
I haven't found any problems so far, except that I have had to add the job manually in the backend. This happens because django-q's add option currently only adds and doesn't check whether the task is already in the database.
@DataGreed
I've personally been using it in several commercial projects over the last 6 months or so. One of them has users in the tens of thousands and is used to send emails, live Haystack indexing, cache invalidation and handle cascading model signals. So far I've encountered very few problems. I recently added Rollbar support which directly reports any problems with tasks from any of our servers to my Rollbar account, this has helped a lot to track down and fix any problems we've had quickly. Is it production ready? I don't know. It's stable enough, but I'd love to add and expand the features before I take it out of beta status.
Another big difference with Celery that has become more obvious lately is AMQP's need for worker acks to be in-process. By that I mean that pulling a job and acking it have to happen in the same connection for AMQP; otherwise the job is considered not acked and will be made available to the next worker. This stems from AMQP's legacy as a banking protocol. Django Q's design takes a very different approach. The pulling, executing and acking are done asynchronously by individual processes, separated by multiprocessing memory queues. This gives you much more flexibility when dealing with long-running processes, or processes that rely on outside services to complete, without tying up your broker.
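The decoupled pull/execute/ack pipeline can be sketched roughly like this. Threads and `queue.Queue` stand in for django-q's separate processes and multiprocessing queues, and all names are illustrative:

```python
import queue
import threading

task_q = queue.Queue()  # puller -> worker
ack_q = queue.Queue()   # worker -> acker
acked = []

def puller(jobs):
    # Stands in for the process that pulls jobs off the broker.
    for job in jobs:
        task_q.put(job)
    task_q.put(None)  # sentinel: no more work

def worker():
    # Executes tasks; never holds a broker connection itself, so a
    # long-running task cannot tie up the broker.
    while (job := task_q.get()) is not None:
        result = job["func"](*job["args"])
        ack_q.put((job["id"], result))
    ack_q.put(None)

def acker():
    # Stands in for the monitor that acknowledges finished tasks.
    while (item := ack_q.get()) is not None:
        acked.append(item)

jobs = [{"id": i, "func": pow, "args": (2, i)} for i in range(5)]
threads = [
    threading.Thread(target=puller, args=(jobs,)),
    threading.Thread(target=worker),
    threading.Thread(target=acker),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each stage only talks to its neighbours through an in-memory queue, which is the flexibility the comment above describes: acking is not tied to the connection that pulled the job.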
Something I like very much about Django Q is that I can use the Django database as a broker. This is 'available' in Celery but is outdated and has many known bugs.
A reason why I like this is security. I have an application where I use certificates to encrypt data. Because this process is slow, I use a background task, but Celery either requires me to run an extra process (Redis) on the server to handle the queue, or I have to work with a buggy broker.
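For reference, switching django-q to the database broker is a one-line setting (a minimal sketch; `"default"` refers to whichever alias you use in Django's `DATABASES`):

```python
# settings.py -- use the Django ORM as the broker; no Redis process needed
Q_CLUSTER = {
    "name": "myproject",
    "orm": "default",  # any configured DATABASES alias works
}
```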
We are using django-q in production at the Indianapolis Museum of Art to process daily changes to our online collection imagery and metadata from several outside systems. We have been very satisfied with it.
We could have used Celery, but as a small dev team we see much appeal in small stacks, easy setup, and few dependencies. We process about 500k tasks per day, and it's nothing that Python and a common database (Postgres) can't handle.
So cool to hear people actually using it besides myself :)
Main selling point for me was django-admin integration. Plus the ORM broker, but this is secondary (for my current project, anyway). Celery has deprecated its admin integration in favor of standalone "flower" interface, which is all nice and good, but isn't what I need at the moment.
Hello, one more interesting question (at least for me; it's clear why Celery is bloated for small projects) is: why not django-rq?
From what I see in the docs, django-q by default uses the redis broker, so, if I don't want to use a different broker why should I choose django-q instead of the combination of rq and django-rq?
Thanks !
@spapas after almost 2 years (yeah, very slow response): from what I can see, django-rq is designed to use Redis as a queue and then processes everything in the order it was queued. If you only need that feature, I think django-rq can serve your needs.
@Eagllus better late than never :)
I am using those features, yes, but I'd really like to know what extra features django-q offers. They may be useful for some projects!
I'm a first time user of task-queues and am trying to consider whether I should be using Django Q or Celery or something else. Any links to discussion will be appreciated.
👍 bump.
As of today, the available discussions on the web seem to be 3-4 years old. It would be great to have a comparison between:
There is also a lightweight newcomer: django-simple-task.
As @shawnngtq pointed out, an updated discussion would be great for newcomers like me. I know what I want to do but can't decide what is best for me to go for it without a decent comparison, it will be helpful for sure.
Ditching celery for the following: using django-q for sending scheduled emails.
tbh both django-q and celery are great. I decided to go with django-q because of its simplicity and easy integration with Django. Celery requires spinning up RabbitMQ for message handling, and I do not want extra complex things to manage as part of my stack...
I am really interested in this package since I don't want to have to use Redis with "django-rq" but this repo hasn't received any updates in over 2 years and is still missing a basic list comparing the benefits of it vs. the alternatives.
Just found there's a maintained fork of this repo!
It would be nice if the docs contained a brief comparison with similar libs, or the motives behind django-q. Right now, Celery is the go-to system for most Django developers out there. It would be nice to know what django-q intends to do differently.
Cheers.