tortoise / tortoise-orm

Familiar asyncio ORM for python, built with relations in mind
https://tortoise.github.io
Apache License 2.0

Cannot have Unittest + Postgres + Fastapi combination. It just doesn't work. #1676

Closed (PawelRoman closed 1 month ago)

PawelRoman commented 1 month ago

My frustration described here https://github.com/tortoise/tortoise-orm/issues/1611 continues.

While I got unittest to work with sqlite (this is where I stopped last time), I can't get it to work with postgres. And I need postgres because I just introduced ArrayField.

The combination is this: tortoise-orm[psycopg]==0.21.4 and fastapi==0.111.0.

I have created a minified example which clearly proves it's not possible to successfully write even the simplest unit test. It is 100% reproducible on any system, I have tried this on Linux and now on Windows.

Create a new conda env/virtualenv and install the 2 packages above (tortoise-orm[psycopg]==0.21.4 and fastapi==0.111.0). Then create this simple project with db/models.py, main.py and tests.py.

db/models.py:

from tortoise import Model, fields
from tortoise.contrib.postgres.fields import ArrayField

class MyModel(Model):
    id = fields.IntField(primary_key=True)
    some_int = fields.IntField(null=True)
    some_array_field = ArrayField(element_type="int", null=True)

main.py:

from contextlib import asynccontextmanager

from fastapi import FastAPI, WebSocket
from tortoise.contrib.fastapi import RegisterTortoise

from db.models import MyModel

@asynccontextmanager
async def lifespan(app: FastAPI):
    async with RegisterTortoise(
        app=app,
        db_url="psycopg://postgres:postgres@127.0.0.1:5432/tortoise_test",
        modules={"models": ["db.models"]},
    ):
        yield

app = FastAPI(lifespan=lifespan)

@app.websocket("/ws")
async def websocket_endpoint(
    websocket: WebSocket,
):
    await MyModel.filter(id=123)  # make some query to DB
    await websocket.accept()

tests.py:

import unittest

from tortoise.contrib.test import TestCase, initializer, finalizer
from fastapi.testclient import TestClient
from main import app

class TortoiseUnittest(TestCase):

    @classmethod
    def setUpClass(cls):
        initializer(
            modules=["db.models"],
            app_label="models",
            db_url="postgres://postgres:postgres@127.0.0.1:5432/tortoise_test_{}",
        )

    @classmethod
    def tearDownClass(cls):
        finalizer()

    def test_do_nothing(self):
        pass

    def test_connect_to_websocket(self):
        client = TestClient(app)
        with client.websocket_connect(
            f"/ws",
        ) as websocket:
            pass

Try running test_do_nothing. You'll get this error:

ModuleNotFoundError: No module named 'asyncpg'

Wait, what? How come a fresh installation of tortoiseorm[psycopg] cannot run even an empty unit test?

But OK, let's pip install tortoise-orm[asyncpg] and try again.

This time the test_do_nothing passes. Yay!

So let's get to the final boss. Let's run test_connect_to_websocket which also does pretty much nothing other than connecting to the websocket, and the websocket function in main.py makes a single query to DB.

When you run this test you'll see this error:

  File "asyncpg\protocol\protocol.pyx", line 166, in prepare
RuntimeError: Task <Task pending name='starlette.testclient.WebSocketTestSession._run.<locals>.run_app' coro=<WebSocketTestSession._run.<locals>.run_app() running at C:\Users\PC\miniconda3\envs\tortoise-test\Lib\site-packages\starlette\testclient.py:147> cb=[TaskGroup._spawn.<locals>.task_done() at C:\Users\XXXXXXX\envs\tortoise-test\Lib\site-packages\anyio\_backends\_asyncio.py:701]> got Future <Future pending cb=[Protocol._on_waiter_completed()] created at C:\Users\XXXXXXX\envs\tortoise-test\Lib\asyncio\base_events.py:449> attached to a different loop

I hereby declare I will donate some money to this project if someone provides me with a working example of unittest code which uses postgres, makes a websocket connection, and has the app code make a successful query to the database. Because even the simplest example is not working. The unittest+fastapi+postgres combination seems like a pretty common one, so I'm not sure whether other people have seen this issue before. At any rate, googling it does not return any meaningful results.

PawelRoman commented 1 month ago

The funny part is that the code in the example above is working 100% correctly when we start the FastAPI app and make the websocket connection from Postman. The connection is established and the query runs without errors.

Something is messed up with either initializer/finalizer stuff and/or the Fastapi test client. I suspect both initializer and TestClient open separate eventloops, and hence the problem. When there's only one eventloop (e.g. from the FastAPI app), everything works fine.
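
The "different loop" hypothesis can be reproduced without FastAPI or Tortoise. The sketch below uses only the standard library, and the function name is made up for illustration. It creates a future on one event loop and awaits it from a task running on another loop, which produces the same class of RuntimeError as the traceback above. It does not show where the two loops come from in the real setup.

```python
import asyncio


def await_future_from_other_loop() -> str:
    """Await a future owned by loop_a from a task on loop_b; return the error text."""
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        # This future is bound to loop_a, much like a connection's internal
        # futures are bound to the loop the connection was created on.
        fut = loop_a.create_future()

        async def waiter() -> None:
            await fut

        try:
            # The task wrapping waiter() belongs to loop_b, so awaiting fut fails.
            loop_b.run_until_complete(waiter())
        except RuntimeError as exc:
            return str(exc)
        return "no error"
    finally:
        loop_a.close()
        loop_b.close()


if __name__ == "__main__":
    print(await_future_from_other_loop())
```

If the message printed here matches the shape of the one in the traceback, that is consistent with a connection being created on one loop and used on another.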

PawelRoman commented 1 month ago

I got it working with pytest, using the code snippets found in another thread. The code in tortoise.contrib.test seems useless with starlette's TestClient, unless someone provides a working example.

Working example (using pytest):

import asyncio
from typing import Iterator
from fastapi.testclient import TestClient
from tortoise.contrib.test import initializer, finalizer
from main import app
import pytest

@pytest.fixture(scope="module")
def event_loop() -> Iterator[asyncio.AbstractEventLoop]:
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()

@pytest.fixture(scope="module")
def client(event_loop: asyncio.BaseEventLoop) -> Iterator[TestClient]:
    initializer(
        modules=["db.models"],
        app_label="models",
        db_url="postgres://tortoise_test:tortoise_test@127.0.0.1:5432/tortoise_test_{}",
        loop=event_loop
    )
    with TestClient(app) as c:
        yield c
    finalizer()

@pytest.mark.asyncio
async def test_mytest(client):
    with client.websocket_connect("/ws") as websocket:
        pass

EDIT: the example above is WRONG and won't work as expected. The line which instantiates TestClient(app) will call RegisterTortoise defined in main.py and as a result the app will work with the database_url defined in RegisterTortoise, not with the dynamic db_url defined in initializer()

abondar commented 1 month ago

Hi!

Regarding tortoise requiring asyncpg and not psycopg - you can try declaring your uri not as postgres://postgres:postgres@127.0.0.1:5432/tortoise_test_{}, but as psycopg://postgres:postgres@127.0.0.1:5432/tortoise_test_{}.

That should force the use of psycopg; otherwise asyncpg is used as the default client for postgres.
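
As a rough illustration of the point above, the URL scheme is what selects the driver. The mapping and the helper name below are assumptions written for this sketch, based only on what this comment states (postgres defaults to asyncpg, psycopg forces psycopg). It is not Tortoise's own lookup code.

```python
from urllib.parse import urlparse

# Assumed mapping, taken from the statement in this comment only.
SCHEME_TO_DRIVER = {
    "postgres": "asyncpg",  # default client for postgres
    "asyncpg": "asyncpg",
    "psycopg": "psycopg",
}


def driver_for(db_url: str) -> str:
    """Return the driver name implied by the scheme of db_url (hypothetical helper)."""
    scheme = urlparse(db_url).scheme
    try:
        return SCHEME_TO_DRIVER[scheme]
    except KeyError:
        raise ValueError(f"unknown scheme: {scheme!r}") from None


if __name__ == "__main__":
    print(driver_for("postgres://postgres:postgres@127.0.0.1:5432/tortoise_test_{}"))
    print(driver_for("psycopg://postgres:postgres@127.0.0.1:5432/tortoise_test_{}"))
```

This would explain the ModuleNotFoundError above: tests.py used a postgres:// URL while only the psycopg extra was installed.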

Initializer is indeed quite wonky, as it does more than you probably need: it creates the db and initialises tortoise, which probably conflicts with your own initialization of tortoise through the RegisterTortoise FastAPI helper.

You can read a little more about the nature of initializer and finalizer in this issue. Chances are that you don't even need them and can run tests on an already existing db, initialising your connection with just RegisterTortoise.

PawelRoman commented 1 month ago

I don't want to run tests on my existing db, that would be completely wrong. I want tests to run on a test DB, i.e. dynamically created DB with a random name which would be dropped in the end. This is what initializer/finalizer is supposed to do, right? Also this is not some weird, special case. It's a very basic use case, kind of a "hello world" of writing tests for someone who's worked with django.

So, my tests must call initializer() once, to create this test DB with random name. That's for sure.
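
The "random name" idea behind the {} placeholder can be sketched in a few lines. This is only an illustration of the concept under the assumption that the placeholder is filled with a random suffix. The helper name is hypothetical and this is not how initializer() is implemented internally.

```python
import uuid


def make_test_db_url(template: str) -> str:
    """Fill the {} placeholder in a db_url template with a random hex suffix."""
    return template.format(uuid.uuid4().hex)


if __name__ == "__main__":
    template = "postgres://postgres:postgres@127.0.0.1:5432/tortoise_test_{}"
    print(make_test_db_url(template))
```

Each call yields a different database name, so parallel or repeated test runs would not collide on the same database.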

But then, I realized that when I instantiate a TestClient, the RegisterTortoise function (defined inside lifespan) gets called, which calls Tortoise.init again. The result is that tests are executing on the actual database (not on a test DB). It's completely wrong.

Can anybody copy-paste a hello world example of a unittest / pytest test which

a) works with a test postgres DB (generated with a random name),
b) uses fastapi.TestClient to call an endpoint/socket on the FastAPI app, and
c) has the called endpoint/websocket use tortoise ORM to make calls to the DB?

I've been trying to do this for the past few hours and I can't. How do you guys do that? Do you write your own test clients or what??

PawelRoman commented 1 month ago

I tried another approach.

What if we don't instantiate the test client with the context manager i.e.

    with TestClient(app) as c:
        yield c

What if, we instantiate it this way instead:

c = TestClient(app)

Let's add the following code to the pytest example above:

@pytest.fixture(scope="module")
def client_noncontextual(event_loop: asyncio.BaseEventLoop) -> Iterator[TestClient]:
    initializer(
        modules=["db.models"],
        app_label="models",
        db_url="postgres://tortoise_test:tortoise_test@127.0.0.1:5432/tortoise_test_{}",
        loop=event_loop
    )
    c = TestClient(app)
    yield c
    finalizer()

@pytest.mark.asyncio
async def test_mytest2(client_noncontextual):
    with client_noncontextual.websocket_connect("/ws") as websocket:
        pass

That way, we won't call the lifespan function (so we skip the RegisterTortoise call), we only call the initializer, right?

When we do this, the test client calls the endpoint, but on making the first query, the following error occurs:

self = <tortoise.connection.ConnectionHandler object at 0x7f0b3e78b620>
conn_alias = 'models'

    def _get_db_info(self, conn_alias: str) -> Union[str, Dict]:
        try:
            return self.db_config[conn_alias]
        except KeyError:
>           raise ConfigurationError(
                f"Unable to get db settings for alias '{conn_alias}'. Please "
                f"check if the config dict contains this alias and try again"
            )
E           tortoise.exceptions.ConfigurationError: Unable to get db settings for alias 'models'. Please check if the config dict contains this alias and try again

abondar commented 1 month ago

Following the example in the repository, I have managed to make it work with the following changed main.py:

# pylint: disable=E0611,E0401
from contextlib import asynccontextmanager
from typing import AsyncGenerator

from fastapi import FastAPI

from examples.fastapi.config import register_orm
from routers import router as users_router
from tortoise import Tortoise, connections
from tortoise.contrib.fastapi import RegisterTortoise, logger
from tortoise.contrib.test import getDBConfig

class CustomReg(RegisterTortoise):
    def __init__(self, *args, _create_db: bool, **kwargs):
        super().__init__(*args, **kwargs)
        self._create_db = _create_db

    async def init_orm(self) -> None:
        await Tortoise.init(
            config=self.config,
            config_file=self.config_file,
            db_url=self.db_url,
            modules=self.modules,
            use_tz=self.use_tz,
            timezone=self.timezone,
            _create_db=self._create_db,
        )
        logger.info(
            "Tortoise-ORM started, %s, %s", connections._get_storage(), Tortoise.apps
        )
        if self.generate_schemas:
            logger.info("Tortoise-ORM generating schema")
            await Tortoise.generate_schemas()

@asynccontextmanager
async def lifespan_test(app: FastAPI) -> AsyncGenerator[None, None]:
    config = getDBConfig("models", ["models"])
    async with CustomReg(
        app=app,
        config=config,
        generate_schemas=True,
        add_exception_handlers=True,
        _create_db=True,
    ):
        # db connected
        yield
        # app teardown
    # db connections closed
    await Tortoise._drop_databases()

@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
    if getattr(app.state, "testing", None):
        async with lifespan_test(app) as _:
            yield
    else:
        # app startup
        async with register_orm(app):
            # db connected
            yield
            # app teardown
        # db connections closed

app = FastAPI(title="Tortoise ORM FastAPI example", lifespan=lifespan)
app.include_router(users_router, prefix="")

I launched the tests with the following command: PYTHONPATH=. TORTOISE_TEST_DB=asyncpg://tortoise:tortoise@127.0.0.1:6543/test_{} pytest _tests.py

And they succeeded.

Note that it uses the initializer from the global conftest.

You can try the same approach with CustomReg. If it helps you, I'll release a _create_db param for RegisterTortoise in a new version.

PawelRoman commented 1 month ago

Hmm, so in this example you're creating a different instance of FastAPI app, with a different lifespan function using customized RegisterTortoise, just for tests. To be honest I was considering a similar option, but if that's the correct way of doing it, I still have questions:

1) Shouldn't this be a correct way of wiring up tests regardless of the test DB engine?
2) What's the purpose of initializer/finalizer functions, they don't seem to be needed anymore?
3) Will it work with unittest? What's the purpose of tortoise.contrib.test classes such as IsolatedTestCase, TruncationTestCase, TestCase if we're using this pattern? Which class should be a base class for my unittest code?
4) Correct me if I'm wrong but it makes fastapi decorator routing (app.get, app.post, app.websocket etc.) impossible? The app needs to use routers so that our fake test app can call include_router.

abondar commented 1 month ago

  1. I am not sure there is "the one way" to do tests. For example, I prefer to run a local test db and manage it outside of the tests' scope; that way I don't need custom initialisation, because the db is ready and I can connect right away. Calling truncate_all between tests usually suffices. That said, the _create_db parameter should be propagated to the RegisterTortoise class, so that it can emulate for FastAPI what the existing base test cases in contrib do.
  2. I don't really think they are much needed for users of tortoise; here is the same main file, rewritten so you don't need initializer at all.
  3. All contrib test classes have in common that they run the setup of the connection to tortoise. If you want to test the application as a whole and run tests against FastAPI, then you need to incorporate the test setup into the app's lifecycle, like in the snippets that I shared, or move the tortoise lifecycle out of the app lifecycle and init it somewhere independently, which sounds not so great.
  4. Well, you don't have to have a "fake" app, as you can use the same app and just set its app.state.testing = True just before launching the app lifecycle.
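
The flag-based switch from point 4 can be sketched without FastAPI or a database. In this minimal sketch the app is a stand-in object that only models the `state` attribute, and the strings in the log are placeholders for the real create/connect/drop steps. All names here are hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def lifespan(app):
    # Tests are assumed to set app.state.testing = True before startup.
    if getattr(app.state, "testing", False):
        app.state.log.append("create test db")  # placeholder for real setup
        try:
            yield
        finally:
            app.state.log.append("drop test db")  # placeholder for real teardown
    else:
        app.state.log.append("connect to main db")
        yield


def make_app(testing: bool) -> SimpleNamespace:
    """Build a stand-in for a FastAPI app; only `state` is modelled."""
    state = SimpleNamespace(log=[])
    if testing:
        state.testing = True
    return SimpleNamespace(state=state)


async def run(app) -> list:
    async with lifespan(app):
        app.state.log.append("serve")
    return app.state.log


if __name__ == "__main__":
    print(asyncio.run(run(make_app(testing=True))))
    print(asyncio.run(run(make_app(testing=False))))
```

The same application object is used in both modes; only the flag decides which branch of the lifespan runs.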

PawelRoman commented 1 month ago

OMG, I'm slowly realizing what a hell of a mess this whole FastAPI ecosystem is. Why are we doing this to ourselves? Each of the libs (FastAPI, Pydantic, Tortoise, starlette) is an excellent library on its own, but putting it all together and making it work in a civilized way is a continuous world of frustration for someone who's used to working with django. I won't even talk about how the websocket examples in the FastAPI docs are completely wrong, and one needs to write their own mini-framework to actually use them correctly. There are so many missing parts everyone needs to figure out on their own, so many things that so many people WILL do wrong, and so many bugs they will have in their code, because there is no framework, just an ocean of individual libs. Again, don't get me wrong, a lot of those libs, including Tortoise, are GREAT libs, but instead of being in the business of writing apps, I'm suddenly in the business of writing a framework gluing it all together in a useful, meaningful way and making sure it's bulletproof.

Anyway, I think I'm slowly figuring it out. I'm going to give up on having the auto-created test db (as this just doesn't seem physically possible, unless someone gives me even the simplest example how to do that exactly) and will use a persistent test db. This solution is far from perfect, for many reasons:

First, is managing schema evolution. After any schema change, the test DB is out of sync with the ORM, and we need to remember to migrate it every time. So we effectively need to maintain TWO local databases: the regular one for manual tests, and the test one for unit tests. Whereas in django the framework itself takes care of create->migrate->drop the test DB automatically out of the box.

Second, if the test db is not ephemeral, it may keep some data accidentally in case something goes wrong with the truncate. Drop database is the ultimate truncate, isn't it? :) And "create database from ORM" guarantees 100% consistency between the code and the db schema without even caring about migrations.

Third, think about a CI/CD setup with a one-click deployment pipeline which runs tests as one of the steps. You'd like this step to be as bulletproof as possible, and what's more bulletproof than drop DB if exists -> create DB? Otherwise you'll just have another persistent remote database in the dev environment that exists only for tests, and all sorts of things can go wrong with it.

Fourth, what even is truncate_all? I've been working in the REST API back-end business with postgres for almost 15 years now and have never needed to manually truncate all tables in a db. I don't even know how to do that. Googling "how to truncate all tables in postgres" points to articles where people write fairly complex custom functions to do that. There are constraints, non-null foreign keys, and so on. It's not trivial to clear the entire db in one call. But maybe I'm wrong, maybe there is some kind of one-line wrapper which I can put in tearDown()? I could not find anything on this topic in the tortoise docs.
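
On the plain SQL side, PostgreSQL accepts several tables in one TRUNCATE statement, and CASCADE takes care of tables linked by foreign keys. The sketch below only builds that statement from a list of table names. The helper names are hypothetical, and how the statement is executed (for example through the ORM's connection) is left out because it is not shown in this thread.

```python
from typing import Iterable


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_truncate_sql(tables: Iterable[str]) -> str:
    """Build one TRUNCATE statement covering all given tables (hypothetical helper)."""
    names = ", ".join(quote_ident(t) for t in tables)
    if not names:
        raise ValueError("no tables given")
    # RESTART IDENTITY resets sequences; CASCADE also truncates referencing tables.
    return f"TRUNCATE TABLE {names} RESTART IDENTITY CASCADE;"


if __name__ == "__main__":
    print(build_truncate_sql(["mymodel", "othermodel"]))
```

Note that CASCADE will also empty any table that references the listed ones, so this is only appropriate for a dedicated test database.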

The simplest working "hello world" example that I was looking for may look like this. Note that it is STILL incomplete, as it does not have the truncate part.

class PureUnittestTests(unittest.IsolatedAsyncioTestCase):
    """
    We're using pure unittest's IsolatedAsyncioTestCase (not inheriting from any of the tortoise.contrib.test
    base classes)
    """

    async def test_connect_to_websocket_NOT_WORKING(self):
        """
        This is how FastAPI tells you to use the test client.
        It won't work with Tortoise because it won't initialize tortoise!
        """
        client = TestClient(app)
        with client.websocket_connect(
            f"/ws",
        ) as websocket:
            pass

    async def test_connect_to_websocket(self):
        """
        The correct way of doing it. TestClient will call app's lifespan, which will call RegisterTortoise.
        We still need to take care of the following:
        1) We need to point RegisterTortoise to a test db on our own e.g. depending on an app.state.testing flag
           or some environment var.
        2) We need to remember to always maintain the test DB in sync with the code!
        3) We need to somehow truncate all tables after each test on our own. Not yet sure how?
        """
        with TestClient(app) as client:
            with client.websocket_connect(
                f"/ws",
            ) as websocket:
                pass

As you can see, even this simple example is not yet 100% complete. I still need to figure out how to truncate all tables after each test.

Finally, let me stress again that I could NOT work out even the simplest working example of a test which would use starlette's TestClient with tortoise.contrib.test classes such as TruncationTestCase, IsolatedTestCase or TestCase. None of this is working, all working examples are pydantic examples.

abondar commented 1 month ago

Anyway, I think I'm slowly figuring it out. I'm going to give up on having the auto-created test db (as this just doesn't seem physically possible, unless someone gives me even the simplest example how to do that exactly) and will use a persistent test db. This solution is far from perfect, for many reasons:

I believe the solution at this link, which I shared in my previous message, shows how to run every test in a newly created db.

Probably the only additional thing you would have to change there is to make the client fixtures in the tests per-test (the default scope) instead of module-level.

PawelRoman commented 1 month ago

OK, the CustomRegisterTortoise worked! I can now create->drop the database for each test separately. It's still not very efficient (as opposed to creating the DB just once and then rolling back the transaction / truncating all tables between tests to guarantee a clean state of the DB), but it does the job for me.

I believe the _create_db flag should be surfaced on RegisterTortoise itself, to allow for easier implementation of this pattern (i.e. without subclassing RegisterTortoise).

I still miss a framework though :(( Something that comes with a custom test runner and takes care of all those things and just gives me custom set of subclasses to use, such as django's TestCase (which wraps test in transaction) and TransactionTestCase (which truncates all tables after test).

abondar commented 1 month ago

Yeah, I released the _create_db param in 0.21.5.

It's harder to implement such helpers as custom TestCases when init of each application is unique. Best we can do here is provide more flexible init params, allowing easier incorporation into apps

PawelRoman commented 1 month ago

I just realized there is another issue, even more serious one. With this setup, instantiating TestClient spawns a new FastAPI app which spawns a new test DB. Which means I can't write a test which would simulate two or more concurrent users connected to the websocket.

In an ideal world an instance of TestClient should be totally decoupled from the app instance (and therefore the database), so I can spawn as many TestClients as I wish in concurrent tasks of my test case. But from what I've learned, TestClient always instantiates an app instance (calls the lifespan function). So N parallel TestClients means N parallel app instances. To make those N app instances talk to the same test DB, the DB needs to be created outside the app lifespan function (which is not the case in the example above).

abondar commented 1 month ago

In the case of FastAPI, every created client means a new application setup. So if what you want is several users concurrently using one application, creating a separate client for each of them is a bad idea, as it won't be the same app, which is probably not what you want to test.

I think you can use the same client to spawn several different websockets and work with them independently: https://www.starlette.io/testclient/#testing-websocket-sessions

PawelRoman commented 1 month ago

I got everything working, with 0.21.5 there's no need to write a custom wrapper on RegisterTortoise. Thanks @abondar !

If someone's interested, here's the full example on how we can have tortoise, fastapi and unittest working together using postgres DB.

requirements.txt:

tortoise-orm[psycopg]==0.21.5
fastapi==0.111.1

db/models.py:

from tortoise import Model, fields
from tortoise.contrib.postgres.fields import ArrayField

class MyModel(Model):
    id = fields.IntField(primary_key=True)
    some_int = fields.IntField(null=True)
    some_array_field = ArrayField(element_type="int", null=True)

main.py:

import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI, WebSocket
from tortoise import Tortoise
from tortoise.contrib.fastapi import RegisterTortoise
from db.models import MyModel

@asynccontextmanager
async def lifespan(app: FastAPI):
    if getattr(app.state, "testing", False):
        # If we're in unit tests, create a DB with a dynamic name (the {} placeholder), create schemas and drop
        # the database when the app's lifespan ends
        logging.info("Initializing test db")
        async with RegisterTortoise(
            app=app,
            db_url="psycopg://tortoise_test:tortoise_test@127.0.0.1:5432/tortoise_test_{}",
            modules={"models": ["db.models"]},
            _create_db=True,
            generate_schemas=True,
        ):
            yield
        await Tortoise._drop_databases()
    else:
        logging.info("Initializing main db")
        # Otherwise, we just use the regular DB for our regular work
        async with RegisterTortoise(
            app=app,
            db_url="psycopg://tortoise_test:tortoise_test@127.0.0.1:5432/tortoise_test",
            modules={"models": ["db.models"]},
        ):
            yield

app = FastAPI(lifespan=lifespan)

@app.websocket("/ws")
async def websocket_endpoint(
    websocket: WebSocket,
):
    await MyModel.create(some_int=321, some_array_field=[1, 2, 3])  # run some query on DB
    await websocket.accept()

tests.py:

import asyncio
import unittest
from asyncio import Event

from fastapi.testclient import TestClient

from db.models import MyModel
from main import app

app.state.testing = True

class PureUnittestTests(unittest.IsolatedAsyncioTestCase):

    async def test_connect_to_websocket(self):
        with TestClient(app) as client:
            with client.websocket_connect(
                f"/ws",
            ) as websocket:
                self.assertEqual(1, await MyModel.all().count())

    async def test_connect_to_websocket_concurrent_users(self):
        async def user_1_script(client: TestClient, user_2_connected: Event):
            with client.websocket_connect(
                f"/ws",
            ) as websocket:
                await user_2_connected.wait()

        async def user_2_script(client: TestClient, user_2_connected: Event):
            with client.websocket_connect(
                f"/ws",
            ) as websocket:
                user_2_connected.set()

        with TestClient(app) as client:
            async with asyncio.TaskGroup() as tg:
                user_2_connected = Event()
                tg.create_task(user_1_script(client, user_2_connected))
                tg.create_task(user_2_script(client, user_2_connected))

            self.assertEqual(2, await MyModel.all().count())