redwoodjs / redwood

The App Framework for Startups
https://redwoodjs.com
MIT License

[Bug?]: Flaky Testing with Prisma migrations failing #10861

Closed: dennemark closed this issue 4 months ago

dennemark commented 4 months ago

What's not working?

Hi, as posted on Discord (@pantheredeye @dthyresson): I currently have huge issues with my tests. I have to restart them maybe three times before they run properly. Sometimes there are errors saying that the migration failed, that the _prisma_migrations table is missing, or that the standard scenario could not be created. To me it seems like the communication with the database (Postgres in a Docker container) is slow, or some async work is still happening in the background while new tests are started. We combine all model scenarios into one big scenario in our tests, since the models are quite connected, and we run our migrations before each test run. I am wondering whether our models are too big and the Prisma migration setup is slower than the start of the tests. Is there some built-in delay?
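If the cause were a database that is not yet ready when the tests start, one way to rule it out would be to poll the database before the first test runs. The following is only a minimal sketch under that assumption: `waitUntilReady`, `Probe`, and `sleep` are hypothetical names and are not part of Redwood or Prisma.

```typescript
// Hypothetical helper: none of these names come from Redwood or Prisma.
type Probe = () => Promise<void>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `probe` until it resolves, waiting `delayMs` after each failed attempt.
// Resolves with the number of attempts used; once `maxAttempts` is exhausted,
// rethrows the error from the last attempt.
async function waitUntilReady(
  probe: Probe,
  maxAttempts = 5,
  delayMs = 200
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await probe();
      return attempt;
    } catch (error) {
      lastError = error;
      // Only wait if another attempt will follow.
      if (attempt < maxAttempts) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

In a real project the probe could be a cheap query against the test database, for example a `SELECT 1` issued through the Prisma client. Where such a check would be wired into the test setup is left open here.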

Error when migration fails:

Error: P1014

The underlying table for model `_prisma_migrations` does not exist.

  ● Test suite failed to run

    Jest: Got error running globalSetup - /SOME_PATH/node_modules/@redwoodjs/testing/config/jest/api/globalSetup.js, reason: Command failed with exit code 1: yarn rw prisma migrate reset --force --skip-seed

Similar error:

Error: Invariant violation: migration persistence is not initialized.
   0: schema_core::state::ApplyMigrations
             at schema-engine/core/src/state.rs:202

Error when migrations succeed: normally we get this error when our testing scenario is incorrect. But that does not seem to be the issue here, since the tests sometimes run and sometimes don't. It feels like the testing db is not properly cleaned up before the next run happens. At least the unique constraint error would suggest that.

    PrismaClientKnownRequestError: 
    Invalid `getProjectDb()[model].create()` invocation in
    /SOME_PATH/node_modules/@redwoodjs/testing/config/jest/api/jest.setup.js:200:64

      197     createArgs(scenarios)
      198   )
      199 } else {
    → 200   scenarios[model][name] = await getProjectDb()[model].create(
    Unique constraint failed on the fields: (`name`)

How do we reproduce the bug?

I am not sure if I can manage to get a good reproduction. We have our Postgres db running in a Docker container, with this datasource:

datasource db {
  provider          = "postgresql"
  url               = env("OUR_RLS_USER_DB_URL")
  directUrl         = env("OUR_NORMAL_USER_DB_URL")
  shadowDatabaseUrl = env("OUR_SHADOW_DB_URL")
  extensions        = [postgis]
}

We have around 40 models, and for our tenant structure we are also using Row Level Security (RLS): https://github.com/prisma/prisma/issues/12735#issuecomment-1497431945 RLS still creates N+1 issues with Prisma, but it gives us certainty for authorization in nested queries. However, it requires us to run migrations before each test run via TEST_DATABASE_STRATEGY=reset, and we make use of TEST_DIRECT_URL because of RLS.

Our testing scenario is also pretty big, since we create a single scenario from all models for each test. This is convenient for us because it includes the tenant structure and other dependencies, even though it makes testing slower. Testing takes long anyway with the synchronous Jest setup :/

I get this error if I uncomment TEST_DATABASE_STRATEGY:

PrismaClientUnknownRequestError: 
    Invalid `getProjectDb()[model].create()` invocation in
    /SOME_PATH/node_modules/@redwoodjs/testing/config/jest/api/jest.setup.js:200:64

      197     createArgs(scenarios)
      198   )
      199 } else {
    → 200   scenarios[model][name] = await getProjectDb()[model].create(
    Error in batch request 1: Error occurred during query execution:
    ConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(PostgresError { code: "42501", message: "permission denied for schema public", severity: "ERROR", detail: None, column: None, hint: None }), transient: false })

I believe the permission denied error is connected to RLS: permissions on the test db are created during migration, and they seem to be missing after the test db is reset for the next test scenario.

Most probably this quite specific setup would work easily if RLS were properly supported by Prisma. I haven't found an alternative ORM with proper support for it yet. But Prisma, with all its conveniences, still really has some issues :/

What's your environment? (If it applies)

System:
    OS: Linux 6.8 Ubuntu 24.04 LTS 24.04 LTS (Noble Numbat)
    Shell: 5.2.21 - /bin/bash
  Binaries:
    Node: 20.11.0 - /tmp/xfs-1e97a351/node
    Yarn: 3.2.3 - /tmp/xfs-1e97a351/yarn
  npmPackages:
    @redwoodjs/auth-custom-setup: 7.6.3 => 7.6.3 
    @redwoodjs/auth-dbauth-setup: 7.6.3 => 7.6.3 
    @redwoodjs/cli-data-migrate: 7.6.3 => 7.6.3 
    @redwoodjs/core: 7.6.3 => 7.6.3 
    @redwoodjs/realtime: 7.6.3 => 7.6.3 
    @redwoodjs/studio: 11.4.0 => 11.4.0 
  redwood.toml:
    [web]
      bundler = "vite"
      host = "0.0.0.0"
      port = 8910
      apiUrl = "/.redwood/functions"
    [api]
      host = "0.0.0.0"
      port = 8911
    [browser]
      open = true
    [notifications]
      versionUpdates = ['latest']

Are you interested in working on this?

dthyresson commented 4 months ago

Hi @dennemark, I'm talking to the team about next steps. It would be great to have some profiling or metrics about what is going on in the tests (maybe we turn on api/Prisma logging during test runs?), but we have to think of a way of capturing this info.

It may also make sense for me to schedule a call with you to see more of your test structure and discuss ways to optimize.

One question, did you try: https://docs.redwoodjs.com/docs/testing#describescenario---a-performance-optimization

However, there are some situations where you as the developer may want additional control regarding when the database is setup and torn down - maybe to run your test suite faster. The describeScenario function is utilized to run a sequence of multiple tests, with a single database setup and tear-down.

Also, we know of some large apps that use Prisma Mock (https://github.com/demonsters/prisma-mock) and don't use scenarios, relying instead on https://www.prisma.io/docs/orm/prisma-client/testing/unit-testing (or a custom implementation of it).
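To illustrate the general idea of testing without a database, here is a minimal hand-rolled sketch. It does not use the prisma-mock API or the setup from the Prisma docs. The `Tenant` model, the `TenantDb` interface, and both functions are hypothetical and exist only for this example.

```typescript
// Hypothetical model; not taken from the project discussed in this issue.
interface Tenant {
  id: number;
  name: string;
}

// Only the slice of the client that the function under test needs.
interface TenantDb {
  tenant: {
    findUnique(args: { where: { name: string } }): Promise<Tenant | null>;
    create(args: { data: { name: string } }): Promise<Tenant>;
  };
}

// Function under test: returns the existing tenant or creates a new one.
async function findOrCreateTenant(db: TenantDb, name: string): Promise<Tenant> {
  const existing = await db.tenant.findUnique({ where: { name } });
  return existing ?? db.tenant.create({ data: { name } });
}

// In-memory stand-in for the database, so no migration or reset is needed.
function createFakeDb(): TenantDb {
  const rows: Tenant[] = [];
  return {
    tenant: {
      async findUnique({ where }) {
        return rows.find((row) => row.name === where.name) ?? null;
      },
      async create({ data }) {
        const row: Tenant = { id: rows.length + 1, name: data.name };
        rows.push(row);
        return row;
      },
    },
  };
}
```

Because the code under test receives its client as a parameter, a test can pass `createFakeDb()` instead of the real client. A stand-in like this never touches Postgres, so database-level behavior such as unique constraints or Row Level Security policies is not exercised.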

Again, happy to discuss further.

FYI: when scenarios were first introduced, before RW v1, these Prisma mocking tools didn't exist.

dennemark commented 4 months ago

Hi, those are great recommendations! Not sure if mocking Prisma will help with RLS, but I could try.

I think I will also try to go back to an older working branch and see how different Prisma/RW versions affect it. Should have tried this before... Let's try to reduce the possible causes first, then I would be up for a call :)

Best!

dennemark commented 4 months ago

The issue might be caused by the Jest VSCode extension. I installed it for another repo and did not realize it affects my RW repo. Now, after disabling it, tests run smoothly so far. Please do not invest more time right now. I'll let you know whether it worked properly.

dthyresson commented 4 months ago

after disabling it, tests run smoothly so far

Glad to hear. Do reopen if you see any other issues.

dennemark commented 4 months ago

Unfortunately it is still happening :/ I cannot reopen the issue. Argh... somehow the extension enabled itself again... I had to disable it and restart VSCode. Let's see...