celery / billiard

Multiprocessing Pool Extensions

get_command_line doesn't set up Django environment? #10

Closed stevage closed 12 years ago

stevage commented 12 years ago

http://stackoverflow.com/questions/11646279/celery-error-no-module-named-billiard-forking-how-to-diagnose/11647902#11647902

I'm getting this error in a Django app that otherwise runs ok:

ImportError: No module named billiard.forking

It comes down to get_command_line running a command line without the full Django environment. If I either add the billiard path to PYTHONPATH, or "pip install billiard", that starts a game of whack-a-mole:

[2012-07-25 21:05:27,690: DEBUG/MainProcess] Consumer: Connection established. 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.6/dist-packages/billiard/forking.py", line 470, in main
    prepare(preparation_data)
  File "/usr/local/lib/python2.6/dist-packages/billiard/forking.py", line 610, in prepare
    file, path_name, etc = imp.find_module(main_name, dirs)
ImportError: No module named django

Here's the Celery report:

software -> celery:3.0.3 (Chiastic Slide) kombu:2.3.0 py:2.6.5
            billiard:2.7.3.10 django:1.4.0.final.0
platform -> system:Linux arch:32bit, ELF imp:CPython
loader   -> djcelery.loaders.DjangoLoader
settings -> transport:django results:database

_wrapped: <django.conf.Settings object at 0x901b46c>

The code is https://github.com/stevage/mytardis

ask commented 12 years ago

Django-Celery and billiard work fine for me here.

Would you be able to whip up a small example project that reproduces it?

You can start out with: http://github.com/celery/django-celery/tree/master/examples/demoproject

mrtrumbe commented 12 years ago

I'm getting the same thing here. This appears to be a problem with how billiard and buildout/virtualenv interact. Forking in billiard seems to use the system Python unaltered rather than our virtualenv Python or a version of the system Python with appropriately updated paths. As a result, none of the dependencies we pull down with buildout are available in the forked process, billiard included.

I'm not sure how best to give you an example of this, given this is proprietary code. Would a buildout.cfg and bootstrap.py with a dummy src directory be sufficient?

noirbizarre commented 12 years ago

I have the same issue in virtualenv.

ask commented 12 years ago

@mrtrumbe that would be sufficient!

Note that I use virtualenv with billiard and that is working fine.

jbl2024 commented 12 years ago

hi, i also have the same issue on my environment with virtualenv (which is based on https://github.com/jbl2024/django_sample_env).

ask commented 12 years ago

Could anyone give me a small project that reproduces the issue?

jbl2024 commented 12 years ago

I have created a dummy project which reproduces the issue on my machine:

https://github.com/jbl2024/kvtenv

$ git clone https://github.com/jbl2024/kvtenv.git
$ cd kvtenv
$ wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py
$ ./vtenv.sh
$ cd sample_project
$ ./manage.py syncdb
$ ./manage.py celery worker

ask commented 12 years ago

@jbl2024: Thanks! I can reproduce with this indeed.

I already have a feel for what the problem may be: the child processes are trying to load the settings module sample_project.settings, but this can't possibly work since the CWD is the sample_project directory itself.

For that to work the sys.path would have to be amended to include the parent directory. I will investigate further.
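
A minimal sketch of that import failure, not taken from the thread: it assumes Python 3 (the report above is on Python 2.6) and builds a throwaway old-style layout in a temporary directory. The helper names make_old_style_project and can_import are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def make_old_style_project(root):
    """Create root/sample_project/ containing __init__.py and settings.py."""
    project_dir = os.path.join(root, "sample_project")
    os.mkdir(project_dir)
    for filename in ("__init__.py", "settings.py"):
        with open(os.path.join(project_dir, filename), "w") as handle:
            handle.write("DEBUG = True\n")
    return project_dir


def can_import(module_name, extra_dirs):
    """Try the import with extra_dirs prepended to sys.path, then undo it."""
    saved_path = list(sys.path)
    saved_modules = set(sys.modules)
    sys.path[:0] = extra_dirs
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        sys.path[:] = saved_path
        # Forget anything the probe imported so the next probe starts clean.
        for name in set(sys.modules) - saved_modules:
            del sys.modules[name]


with tempfile.TemporaryDirectory() as root:
    project_dir = make_old_style_project(root)
    # Only the project directory is on the path, as when CWD is sample_project.
    inside_only = can_import("sample_project.settings", [project_dir])
    # The parent directory is on the path as well, as with PYTHONPATH=..
    with_parent = can_import("sample_project.settings", [root, project_dir])
    # The bare module name works from inside the project directory.
    bare_name = can_import("settings", [project_dir])

print(inside_only, with_parent, bare_name)
```

The two successful cases correspond to the two workarounds that follow in the thread: putting the parent directory on the path, or referring to the settings module by its bare name.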

ask commented 12 years ago

Try this and it will work:

$ PYTHONPATH=.. ./manage.py celery worker

ask commented 12 years ago

Alternatively, this will also work:

$ DJANGO_SETTINGS_MODULE=settings ../manage.py celery worker

I wonder if this is something we should spend a lot of time fixing, considering that the new project layout introduced in Django 1.4 does not have this problem.

jbl2024 commented 12 years ago

Thank you for your analysis. Indeed, a note somewhere in the documentation should be sufficient imho.

ask commented 12 years ago

I've added a note to the first-steps-with-django tutorial

mrtrumbe commented 12 years ago

As a Django 1.3 and buildout user, I have to object to the proposed solution here. I use buildout precisely so I can avoid modifying the local environment, whether permanently, through additional setup in my scripts, or by wrapping the scripts that Django and buildout generate. Django 1.3 is the last major version and buildout is quite popular, so I really don't think leaving this problem unsolved on such a recent version is a good idea. I know I'm personally avoiding an upgrade to the latest celery because of this, and migrating to 1.4 isn't feasible for us right now.

If there are no objections, I will look into a solution in code that accommodates both 1.3 and 1.4 settings layouts without the need to alter the environment for buildout users.

On Sep 11, 2012, at 11:55 PM, "Jérôme Blondon" notifications@github.com wrote:

Thank you for your analysis. Indeed, a note somewhere in the documentation should be sufficient imho.


ask commented 12 years ago

@mrtrumbe If you can find a solution then I won't object.

If you look at billiard/forking.py: https://github.com/celery/billiard/blob/master/billiard/forking.py#L513 it already sends the value of sys.path to the child process, but somehow the infamous hack Django does to add the project to sys.path isn't working there:

https://github.com/django/django/blob/stable/1.3.x/django/core/management/__init__.py#L381-422

ask commented 12 years ago

Oh, no wonder it doesn't transfer: https://github.com/django/django/blob/stable/1.3.x/django/core/management/__init__.py#L418-420

It adds the parent directory to sys.path, imports the project, and then for some crazy reason removes it again. Which I guess means that the project is just like a virtual package to the Django process itself (it would be the same as dynamically adding a module to sys.modules using ModuleType). This is of course a big problem since it sets `DJANGO_SETTINGS_MODULE` to be `project.settings`, which points to a package that a new process cannot find. Not only can the child not import the settings again, it also cannot import any custom module residing in the project package.

Often people add sys.path.append('..') to their settings.py which would probably make it work.
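
A rough sketch of why the path does not transfer, not taken from the thread: it assumes Python 3.7+ and imitates the add, import, remove sequence described above rather than calling Django's actual code. The package name demo_project and the helper import_in_child are hypothetical.

```python
import importlib
import json
import os
import subprocess
import sys
import tempfile

# Runs in a fresh interpreter: adopt the sys.path it is handed, then try the import.
CHILD_CODE = (
    "import importlib, json, sys\n"
    "sys.path[:] = json.loads(sys.argv[1])\n"
    "try:\n"
    "    importlib.import_module(sys.argv[2])\n"
    "    print('ok')\n"
    "except ImportError:\n"
    "    print('ImportError')\n"
)


def import_in_child(module_name, path_entries):
    """Start a new interpreter with the given sys.path and report the import result."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, json.dumps(path_entries), module_name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as root:
    project_dir = os.path.join(root, "demo_project")
    os.mkdir(project_dir)
    for filename in ("__init__.py", "settings.py"):
        with open(os.path.join(project_dir, filename), "w") as handle:
            handle.write("DEBUG = True\n")

    # Imitate the temporary sys.path entry: add the parent, import, remove it again.
    sys.path.append(root)
    importlib.invalidate_caches()
    importlib.import_module("demo_project")
    sys.path.remove(root)

    # The parent keeps the package in sys.modules, so its submodules still import.
    parent_has_package = "demo_project" in sys.modules
    importlib.import_module("demo_project.settings")

    # A child that inherits the parent's current sys.path has no such cache.
    child_without_parent_dir = import_in_child("demo_project.settings", list(sys.path))
    child_with_parent_dir = import_in_child("demo_project.settings", list(sys.path) + [root])

    # Clean up the modules imported for the demonstration.
    for name in ("demo_project.settings", "demo_project"):
        sys.modules.pop(name, None)

print(parent_has_package, child_without_parent_dir, child_with_parent_dir)
```

The parent process keeps working because the package object is cached in sys.modules, while a child started from scratch only sees sys.path as it stands after the entry was removed.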

mrtrumbe commented 12 years ago

I'm a bit preoccupied with other things today, but I did have time to compare forking in celery 2.5.5 (which I'm currently using) against the forking in billiard on master. I think the issue on my setup is coming from differences in get_command_line in the two implementations. Here is the implementation in celery 2.5.5:

https://github.com/celery/celery/blob/428a724a82a15ba98c7de44ad927cb0e2a5873bf/celery/concurrency/processes/forking.py#L109-118

Here is the implementation in billiard on master:

https://github.com/celery/billiard/blob/master/billiard/forking.py#L426-449

Billiard allows the caller to tell it what executable to use through the set_executable method, but that method doesn't seem to be used by celery. Consequently, in the path that my code takes, billiard falls back on the system version of Python.

Given that I'm using buildout to generate alternate python/django scripts that correctly set paths for my virtual env and also set up the settings for Django, falling back on the system Python would mean my environment isn't appropriately set up. Passing sys.path down probably avoids problems with the path, but it still means my settings aren't being set.

I'll play with set_executable tonight and see if having celery explicitly tell billiard to use the calling executable instead of the system python will fix the issue. If so, the question would be how/where to appropriately set the executable.
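
For illustration only, here is the same idea with the standard library's multiprocessing (Python 3), which billiard is forked from. This is an assumption about what such a call could look like, not billiard's or celery's actual code; sys.executable is a stand-in for whichever interpreter or generated script should launch the children, and multiprocessing.spawn.get_executable is an undocumented helper used here only to read the setting back.

```python
import multiprocessing
import multiprocessing.spawn
import os
import sys

# Stand-in value: a buildout user would pass the interpreter that buildout
# generated instead of sys.executable.
chosen_interpreter = sys.executable

# Tell multiprocessing which executable to launch for spawned child processes.
multiprocessing.set_executable(chosen_interpreter)

# Read the setting back; newer Python versions store it as bytes on POSIX,
# so normalise it to str before comparing or printing.
child_executable = os.fsdecode(multiprocessing.spawn.get_executable())
print(child_executable)
```

Where such a call would belong in celery is exactly the open question raised above.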

ask commented 12 years ago

There's no point looking back to 2.5 since this is a side effect of using execv after fork. In previous versions set_executable and friends were only used on Windows (where you've always had to do --settings=settings).

mrtrumbe commented 12 years ago

Our problem is on Windows. Exclusively, actually. My secondary dev environment is OS X and it works there and, just to be sure, I'll try it on Ubuntu when I get a chance. Looking back on the thread, I failed to mention that. My apologies.

This may point to two separate problems. Mine may be specific to how buildout interacts with billiard on Windows.

On Sep 13, 2012, at 6:44 AM, Ask Solem Hoel notifications@github.com wrote:

There's no point looking back to 2.5 since this is a side effect of using execv after fork. In previous versions set_executable and friends were only used on Windows (where you've always had to do --settings=settings).
