volatiletech / sqlboiler

Generate a Go ORM tailored to your database schema.

go test parallel issues #403

Closed: sachnk closed this issue 5 years ago

sachnk commented 5 years ago

Hi,

I have multiple schemas and I use sqlboiler to generate models for each schema independently, so each of my packages has the models for its own schema:

src
  pkg1
    models
    sqlboiler.toml
    ...
  pkg2
    models
    sqlboiler.toml

Everything works great, except that when I run "go test ./..." from src, certain tests from my generated models fail non-deterministically. I suspect this is due to race conditions, i.e. the tests in pkg1/models collide with the tests in pkg2/models.

Here's an example error:

FATAL:  database "jhiymmnbichdkjlrajluhpfwwudydmoizkovodru" does not exist
DETAIL:  It seems to have just been dropped or renamed.
FATAL:  database "jhiymmnbichdkjlrajluhpfwwudydmoizkovodru" does not exist
DETAIL:  It seems to have just been dropped or renamed.

I know that it's related to running tests concurrently because if I run each package's model tests by themselves, they work fine. I've tried using some go test flags to limit parallelism, but that didn't work:

go test -parallel 1 -cpu 1 ./...

Is there something I can do to fix this, or a workaround? I really want to be able to run all my tests with a single command. For now, I'm omitting all tests that are generated by sqlboiler, but that's not really ideal.

aarondl commented 5 years ago

Hey there. So unfortunately it is the way it is because of database limitations. There are two levels of parallelism (not concurrency) at play here:

  1. Individual tests

    We had problems early on with parallelism here, depending on the database engine and the level of isolation its transactions provide, so we now run only similar tests concurrently. We try to touch any given table only once in any given slice of time, which prevents a lot of the problems we were seeing with parallel queries affecting each other. As you can see in the templates (https://github.com/volatiletech/sqlboiler/blob/master/templates_test/singleton/boil_suites_test.go.tpl), we've grouped them for concurrency as best we could so that things run quickly and also safely.

  2. Test runs

    When a test run is started, it basically computes a deterministic test database name to use. The reason it does this is that when tests fail, or you ctrl+c them, or cleanup doesn't run, you end up with a test database lying around in your postgresql instance. We found this was bad. So instead we always use the same database name for the tests, which lets us ensure cleanup by dropping the database at the beginning of the tests and not at the end. Worst case you have 1 spare test database floating around, not N. However, it does mean that two test runs that compute the same database name cannot run at the same time, since one will drop the database the other is using.

I chalk this up to database limitations. Unfortunately there are real side effects happening as a result of these tests, and the database management systems simply can't isolate them on their side.

However, I think it would be feasible to include the schema (in addition to the dbname, and only if it isn't blank of course) in the input to the deterministic name computation, which would at least eliminate number 2 above. Think that'd help?
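A minimal sketch of that idea, under stated assumptions: testDBName is a hypothetical helper, and the SHA-512 hash and the 40-letter length (chosen to match the name in the error output above) are illustrative choices, not sqlboiler's actual computation. The point is only that folding a non-blank schema into the hashed input gives each schema its own repeatable test database name.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// testDBName is a hypothetical helper. It derives a repeatable name of 40
// lowercase letters from the configured database name and, when it is not
// blank, the schema. The same inputs always give the same name.
func testDBName(dbName, schema string) string {
	key := dbName
	if schema != "" {
		// The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
		key += "\x00" + schema
	}
	sum := sha512.Sum512([]byte(key)) // 64 bytes, of which 40 are used

	name := make([]byte, 40)
	for i := range name {
		name[i] = 'a' + sum[i]%26
	}
	return string(name)
}

func main() {
	// "app", "pkg1" and "pkg2" are made-up values for illustration.
	fmt.Println(testDBName("app", "pkg1"))
	fmt.Println(testDBName("app", "pkg2"))
	fmt.Println(testDBName("app", "")) // blank schema: name depends on dbname only
}
```

With names like these, pkg1/models and pkg2/models would no longer drop each other's test database, provided their configured schemas differ. Independently of any such change, "go test -p 1 ./..." makes the go tool run one package's test binary at a time. The -parallel flag only limits tests that call t.Parallel within a single test binary, which would explain why "-parallel 1" had no effect across packages.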

aarondl commented 5 years ago

Stale.