sqlalchemy / alembic

A database migrations tool for SQLAlchemy.
MIT License

How should I structure my init directory to allow importing SA base metadata #709

Closed giff-h closed 3 years ago

giff-h commented 4 years ago

Describe your question

While attempting to import my declarative Base into env.py for autogenerate support, I keep running into ModuleNotFoundError and ImportError.

Example (if applicable)

Versions: alembic==1.4.2, python==3.8.3

My package structure:

-- project_root/
  +-- myapp/
     +-- __init__.py
     +-- models.py

With project_root/ as the working directory, I run alembic init migrations, to get this:

-- project_root/
  +-- migrations/
  |  +-- versions/
  |  +-- env.py
  |  +-- README
  |  +-- script.py.mako
  +-- myapp/
  |  +-- __init__.py
  |  +-- models.py
  +-- alembic.ini

I modify the relevant settings in alembic.ini, and add this to migrations/env.py:

from myapp.models import Base
...
target_metadata = Base.metadata

When I run alembic revision --autogenerate -m "initial", the error is ModuleNotFoundError: No module named 'myapp'. I've tried putting an __init__.py in migrations/, and project_root/, with no effect. I've also tried a relative import from env.py to models.py, and alembic init myapp/migrations, and every combination. Relative imports fail with ImportError: attempted relative import with no known parent package.

In the docs page for autogeneration https://alembic.sqlalchemy.org/en/latest/autogenerate.html, it claims this is possible. What is the intended directory structure to allow a declarative base import from env.py?

Have a nice day!

zzzeek commented 4 years ago

hi there -

Can I confirm that at the moment, when you use the "alembic" command, you are using a systemwide installed version of it? The intended use is that your "alembic" command is installed in such a way that your actual application's modules are accessible on the same "python path", which is most fundamentally what you see when you look at the "sys.path" variable inside of the interpreter.

The issue of describing how this is done first came up in #625 and after some unfortunate reticence on my part, the basics of this were added to the documentation as written up at https://alembic.sqlalchemy.org/en/latest/front.html#installation . let me know if this addresses the issue you're having. thanks!
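One way to check whether the interpreter behind a command can see the application is to ask the import system directly. The following is a minimal sketch, assuming the package is called myapp as in the question; describe_importability is a hypothetical helper name and is not part of Alembic.

```python
import importlib.util
import sys


def describe_importability(package_name):
    """Return the file a top-level package would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    # For a regular package this is the path to its __init__.py.
    return spec.origin


if __name__ == "__main__":
    # Print every sys.path entry; the empty string stands for the current working directory.
    for entry in sys.path:
        print(repr(entry))
    print("myapp found at:", describe_importability("myapp"))
```

Placing the same calls at the top of migrations/env.py should show the search path in effect when Alembic loads that file, which may differ from what an interactive interpreter reports.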

giff-h commented 4 years ago

I am operating in a virtual environment.


$ ipython
In [1]: import sys
In [2]: sys.path
Out[2]: 
['/home/hamstap85/.virtualenvs/project-env/bin',
 '/home/hamstap85/.virtualenvs/project-env/lib/python38.zip',
 '/home/hamstap85/.virtualenvs/project-env/lib/python3.8',
 '/home/hamstap85/.virtualenvs/project-env/lib/python3.8/lib-dynload',
 '/home/hamstap85/.pyenv/versions/3.8.3/lib/python3.8',
 '',
 '/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages',
 '/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/IPython/extensions',
 '/home/hamstap85/.ipython']
$ pip freeze | grep alembic
alembic==1.4.2
$ which alembic
/home/hamstap85/.virtualenvs/project-env/bin/alembic

zzzeek commented 4 years ago

OK what happens if you do this:

/home/hamstap85/.virtualenvs/project-env/bin/python
[python interpreter runs]
>>> from myapp.models import Base

that's the same thing that your env.py is doing.

giff-h commented 4 years ago

It imports just fine that way. My working directory is always project_root/.

zzzeek commented 4 years ago

OK so if this were me the next thing I would do would be to pdb inside of env.py and look at sys.path. or just add a print statement:

env.py
-------

import sys
print(sys.path)
from myapp.models import Base

# ...

There's no known issue in this area. it would be one thing if alembic were just written two weeks ago but this is not an issue that ever comes up, so there may be something unusual occurring in your environment.

zzzeek commented 4 years ago

also perhaps the stack trace will tell us something? run alembic with the --raiseerr flag which ensures a complete stack trace is written out. perhaps your project is having circular import issues, or something like that, in some cases.

giff-h commented 4 years ago

With --raiseerr:

$ alembic --raiseerr revision --autogenerate -m "initial"
Traceback (most recent call last):
  File "/home/hamstap85/.virtualenvs/project-env/bin/alembic", line 8, in <module>
    sys.exit(main())
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/config.py", line 577, in main
    CommandLine(prog=prog).main(argv=argv)
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/config.py", line 571, in main
    self.run_cmd(cfg, options)
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/config.py", line 548, in run_cmd
    fn(
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/command.py", line 214, in revision
    script_directory.run_env()
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/script/base.py", line 489, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 98, in load_python_file
    module = load_module_py(module_id, path)
  File "/home/hamstap85/.virtualenvs/project-env/lib/python3.8/site-packages/alembic/util/compat.py", line 184, in load_module_py
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "migrations/env.py", line 7, in <module>
    from myapp.models import Base
ModuleNotFoundError: No module named 'myapp'

I'm not familiar with using pdb in the console, so I'll get back to you on that.

zzzeek commented 4 years ago

did you actually install your project in editable mode in the environment or are you relying on python path relative to the working directory? I'm not sure that works here. you would want to make sure you've installed your project in editable mode, or beyond that set PYTHONPATH to an absolute path
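The effect of pointing PYTHONPATH at the project root can be reproduced without Alembic. The sketch below builds a throwaway project_root/myapp package and a launcher script stored in a separate bin/ directory, then runs the launcher from project_root with and without PYTHONPATH. The launcher is only a stand-in for an installed console command (an assumption about the layout, not a description of the alembic executable), and run_launcher and build_demo are illustrative names.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_launcher(project_root, launcher, pythonpath=None):
    """Run `launcher` with project_root as the working directory; True if it exits cleanly."""
    # Start from the current environment but drop any inherited PYTHONPATH.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    if pythonpath is not None:
        env["PYTHONPATH"] = str(pythonpath)
    result = subprocess.run(
        [sys.executable, str(launcher)],
        cwd=str(project_root),
        env=env,
        capture_output=True,
    )
    return result.returncode == 0


def build_demo(tmp):
    """Create project_root/myapp plus a launcher script that lives outside the project."""
    project_root = Path(tmp) / "project_root"
    (project_root / "myapp").mkdir(parents=True)
    (project_root / "myapp" / "__init__.py").write_text("")
    bin_dir = Path(tmp) / "bin"
    bin_dir.mkdir()
    launcher = bin_dir / "launcher.py"
    launcher.write_text("import myapp\n")
    return project_root, launcher


with tempfile.TemporaryDirectory() as tmp:
    project_root, launcher = build_demo(tmp)
    print("without PYTHONPATH:", run_launcher(project_root, launcher))
    print("with absolute PYTHONPATH:", run_launcher(project_root, launcher, project_root.resolve()))
```

An editable install of the project into the virtual environment, the other option mentioned above, aims at the same result without needing the environment variable on every invocation.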

giff-h commented 4 years ago

That may be the issue. I completely missed the editable mode part. I added a script run config for PyCharm, the command it runs is this:

/home/hamstap85/.virtualenvs/project-env/bin/python /home/hamstap85/.virtualenvs/project-env/bin/alembic revision --autogenerate -m initial

And it's successful. Used that run config to debug, and at the env.py breakpoint it has the absolute path to project_root/ in sys.path, which I think may be the difference. Inserting a print(sys.path) in env.py reveals no current working directory, absolute or relative. Will now try the editable mode thing.

giff-h commented 4 years ago

Until I add a setup.py and figure out how that works in my deployment strategy, I found a workaround in env.py:

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

bfmcneill commented 4 years ago

@hamstap85 if you switch your package manager to poetry, the whole "figuring out how setup.py works" conversation gets abstracted away and you immediately have access to an editable package. It might seem a little foreign at first, but after playing with it I see how much simpler my workflows have become. Poetry really does abstract the whole setup.py mess and lets you manage your deps in one place instead of a requirements.txt + setup.py.

going the poetry route does some bonus things:

regarding your question on how to structure your project. Mike Kennedy created a course called 100 days of web where he covers many topics including an entire week on sqlalchemy / alembic.

His project structure shows you exactly how to simplify things the way you want. Check out his code and consider supporting his training; it's top notch.

sqlalchemy orm

sqlalchemy migrations with alembic

giff-h commented 4 years ago

@bfmcneill thx I'll give that a shot

zzzeek commented 4 years ago

can we close this?

giff-h commented 3 years ago

@zzzeek yes sorry I did get it resolved, I'll have to look up what exactly I did and post here. I know what I did is a bit hacky and not necessarily preferred, but TMTOWTDI especially when some don't work for everyone.

pmlk commented 3 years ago

I'd be interested in knowing what the best strategy is here. I remember running into this issue a couple of years ago and am just now using alembic again, running into it again. I think, back then I used a similar workaround/hack by manipulating sys.path in env.py. But it doesn't feel right.

As @hamstap85, I'm also using a virtual environment (via pipenv) for my app. I would argue that typically you'd have a setup.py for libraries, but not applications. So installing an application into the same virtual environment also seems to be more of a workaround to get alembic revision --autogenerate to work, nothing I would typically do.

I also found two StackOverflow questions related to this, where the suggested solutions are similar workarounds, manipulating sys.path, e.g. running with PYTHONPATH=. alembic revision --autogenerate ...:

https://stackoverflow.com/questions/15648284/alembic-alembic-revision-says-import-error
https://stackoverflow.com/questions/57468141/alembic-modulenotfounderror-in-env-py

What's your take on this @zzzeek? As I understand it, the current behavior is quite intentional. However, I - and it seems others, too - would find it more intuitive and easier to use if alembic revision --autogenerate ... would work out-of-the-box without manipulating sys.path. Would you agree?

zzzeek commented 3 years ago

I only would say I "disagree" because Python imports modules using sys.path. if your application is not present in sys.path, then I don't really know how you propose to be able to import your models into env.py. If OTOH the proposal is that env.py is not loaded as a python file, and is instead imported itself as a module that's part of your application, OK, but ...then that still needs to import your application.

I am in favor of any means people want to do for importing their application, basically. You tell me what you want. I tried to make this pretty clear in the doc that Alembic doesn't really care how this is done.

In every environment I've ever worked, every web framework I've ever used, the web application itself gets installed as a Python application. I would have assumed that the appeal of all these pipenv/poetry types of tools would be automating things like this. Getting files into sys.path is not something I see as an alembic-specific issue unless you have some idea I'm not thinking of.

pmlk commented 3 years ago

Thanks for the quick response! 😃

I'm not sure I understand everything 100%. I do understand that my application needs to be available to python (sys.path). By my understanding and as @hamstap85 demonstrated in this comment that should always be the case when your working directory is the project root.

If OTOH the proposal is that env.py is not loaded as a python file, and is instead imported itself as a module that's part of your application...

I don't think that's what I'm proposing. For me, everything I do with alembic is outside of my application, meaning I don't need/import alembic when running my application. I do need alembic for migrations in preparation for running my application.

In every environment I've ever worked, every web framework I've ever used, the web application itself gets installed as a Python application.

I'm actually quite surprised by this and I'm sure it's a good approach. Personally, I mostly distribute/deploy with Docker, where I copy the application code and dependencies (Pipfile.lock), then install the dependencies in the Dockerfile, and finally run the (web) application with gunicorn or uvicorn. That works fine without ever needing a setup.py or somehow installing the application itself.

As stated above, gunicorn and uvicorn, but also pytest all work without needing to install the application itself. I actually created a minimal example project to demonstrate this: https://github.com/pmlk/flask-alembic

You tell me what you want.

In the example project above (no setup.py, application not "installed"), I'd like to be able to do the following:

/path/to/flask-alembic $ pipenv install --dev
/path/to/flask-alembic $ pipenv shell
(flask-alembic) /path/to/flask-alembic $ alembic revision --autogenerate -m "description"
(flask-alembic) /path/to/flask-alembic $ alembic upgrade head

Instead, I either need to prepend any alembic command with PYTHONPATH=. or add the following lines to env.py as suggested by @hamstap85 :

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
from myapp.database import models

Getting files into sys.path is not something I see as an alembic-specific issue unless you have some idea I'm not thinking of.

I guess I'm just wondering why my application files are NOT inside sys.path for alembic, but ARE for gunicorn, uvicorn, pytest.

(flask-alembic) /path/to/flask-alembic ‹main› $ python
Python 3.9.1 (default, Jan  8 2021, 12:11:08)
[Clang 12.0.0 (clang-1200.0.32.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/opt/homebrew/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python39.zip', '/opt/homebrew/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9', '/opt/homebrew/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload', '/Users/<username>/.local/share/virtualenvs/flask-alembic-wzGji7Qy/lib/python3.9/site-packages']

In the end, it's all not a big deal! 😃 For me, the two workarounds above work well enough, but they remain workarounds. It could just help avoid some confusion, especially for people new to python and/or alembic.
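Part of the difference comes from how the interpreter fills in the first sys.path entry: started interactively or with -c it is the empty string, meaning the current working directory, while for a script file it is the directory that contains the script. The sketch below shows this with two child interpreters launched from the same working directory. The directory names and the first_path_entry helper are made up for illustration, and treating an installed command in the environment's bin/ directory as "a script file" is an assumption, though it matches the bin/ entry at the top of the sys.path listing earlier in this thread.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def first_path_entry(args, cwd):
    """Start a child interpreter with `args` in `cwd` and return its sys.path[0]."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=str(cwd),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


PRINT_FIRST = "import sys; print(sys.path[0])"

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    workdir = base / "project_root"   # plays the role of the project root
    script_dir = base / "bin"         # plays the role of the virtualenv's bin/
    workdir.mkdir()
    script_dir.mkdir()
    script = script_dir / "show_path.py"
    script.write_text(PRINT_FIRST + "\n")

    # Same working directory, two different ways of starting Python.
    via_dash_c = first_path_entry(["-c", PRINT_FIRST], workdir)
    via_script = first_path_entry([str(script)], workdir)
    print("python -c        ->", repr(via_dash_c))
    print("python script.py ->", repr(via_script))
```

In the second case the working directory never reaches sys.path unless something adds it, which is what the PYTHONPATH=. prefix and the sys.path.insert lines in env.py do.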

zzzeek commented 3 years ago

Thanks for the quick response! 😃

I'm not sure I understand everything 100%. I do understand that my application needs to be available to python (sys.path). By my understanding and as @hamstap85 demonstrated in this comment that should always be the case when your working directory is the project root.

that's correct!

looks like that doesn't work, which is total news to me, so please refer to #797. great idea.

In every environment I've ever worked, every web framework I've ever used, the web application itself gets installed as a Python application.

I'm actually quite surprised by this and I'm sure it's a good approach. Personally, I mostly distribute/deploy with Docker, where I copy the application code and dependencies (Pipfile.lock), then install the dependencies in the Dockerfile, and finally run the (web) application with gunicorn or uvicorn. That works fine without ever needing a setup.py or somehow installing the application itself.

I use docker and podman quite a lot, but I use them for deploying services. I have seen quite often people using it for development, which I find very strange. How do you use pdb for example, are you running the container in interactive mode?

for deployment, the env.py file does not need access to your models unless your migration files themselves do.

still, I am not understanding how going through the effort to deploy the application as a whole docker image is acceptable, but having a setup.py is not. Again, I don't care! just seems odd.

You tell me what you want.

In the example project above (no setup.py, application not "installed"), I'd like to be able to do the following:

I can't speak for pipenv but with a "plain" virtualenv the local path should certainly work so let's see what that's about.

pmlk commented 3 years ago

I use docker and podman quite a lot, but I use them for deploying services. I have seen quite often people using it for development, which I find very strange. How do you use pdb for example, are you running the container in interactive mode?

I don't actually use pdb. I don't think I said or was implying to be using Docker for development, did I? I'm not 😄

for deployment, the env.py file does not need access to your models unless your migration files themselves do.

I'm not exactly sure I understand this correctly. After creating the migration files, to run alembic upgrade head I also need to manually add the project root to sys.path, e.g. PYTHONPATH=. pipenv run alembic upgrade head.

still, I am not understanding how going through the effort to deploy the application as a whole docker image is acceptable, but having a setup.py is not. Again, I don't care! just seems odd.

For deploying to Kubernetes you need a container anyways. Maybe the docker example was unnecessary anyways. I was just trying to say that I've never felt the need to have a setup.py for an application or service that I'm deploying.

I can't speak for pipenv but with a "plain" virtualenv the local path should certainly work so let's see what that's about.

As I understand it, pipenv is just a wrapper around or manager of "plain" virtualenv. So if it works with virtualenv it should work with pipenv.

Thanks for opening and addressing #797 so quickly! 👍 That should fix things 😃

zzzeek commented 3 years ago

for deployment, the env.py file does not need access to your models unless your migration files themselves do.

I'm not exactly sure I understand this correctly. After creating the migration files, to run alembic upgrade head I also need to manually add the project root to sys.path, e.g. PYTHONPATH=. pipenv run alembic upgrade head.

what happens if you don't? if you remove the imports from your env.py file and are not trying to import your model in your version files, Alembic knows nothing about your model.

still, I am not understanding how going through the effort to deploy the application as a whole docker image is acceptable, but having a setup.py is not. Again, I don't care! just seems odd.

For deploying to Kubernetes you need a container anyways. Maybe the docker example was unnecessary anyways. I was just trying to say that I've never felt the need to have a setup.py for an application or service that I'm deploying.

kubes! you're launching starfleet command! usually when people are like, "i dont have a setup.py" it's because they want to start quickly with a toy project or they are total beginners.
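The point that env.py only needs the models when something actually uses them could be expressed by making the import optional. This is a sketch under assumptions, not something proposed in the thread: load_target_metadata is a hypothetical helper, myapp.models and Base are the names from the original question, and the fallback relies on None being an acceptable value for target_metadata when autogenerate is not in use.

```python
import importlib


def load_target_metadata(module_name="myapp.models", base_attr="Base"):
    """Return <module>.<Base>.metadata if the models module is importable, else None."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        missing = exc.name or ""
        # Only swallow the error when the application package itself is missing;
        # a missing third-party dependency inside the models should still surface.
        if module_name == missing or module_name.startswith(missing + "."):
            return None
        raise
    return getattr(module, base_attr).metadata


# In env.py this would take the place of the unconditional import.
target_metadata = load_target_metadata()
```

Whether a silent fallback is desirable is a judgment call; failing loudly on the import may be preferable in projects where autogenerate is the main use case.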

pmlk commented 3 years ago

what happens if you don't? if you remove the imports from your env.py file and are not trying to import your model in your version files, Alembic knows nothing about your model.

That actually works.

The main issue seems to be getting solved. Thanks!


So just out of curiosity, seems like I can learn something...

usually when people are like, "i dont have a setup.py" it's because they want to start quickly with a toy project or they are total beginners.

I'd like to think I'm not a "total beginner", but I usually don't have a setup.py with applications / services, just libraries. I'm genuinely curious how (and why) you use setup.py for deployments of applications / services.

From everything I have read on the subject, setup.py is (mostly) used for libraries. For applications you want something like requirements.txt or Pipfile.lock or poetry's equivalent - anything that lists specific versions of your dependencies.

What am I missing?

zzzeek commented 3 years ago

it's common in my experience, here's Flask tutorial 1.x:

https://flask.palletsprojects.com/en/1.1.x/tutorial/layout/

here's Pyramid:

https://docs.pylonsproject.org/projects/pyramid/en/latest/quick_tutorial/package.html

here's a SO answer for django with 300 points:

https://stackoverflow.com/a/23469321

now if pipenv / poetry are changing those practices, good for them, I haven't had a need for tooling like those.

pmlk commented 3 years ago

Thank you 😃