woodpecker-ci / woodpecker

Woodpecker is a simple, yet powerful CI/CD engine with great extensibility.
https://woodpecker-ci.org
Apache License 2.0
4.07k stars 353 forks source link

pq: relation "UQE_migrations_name" already exists #2038

Closed schewara closed 8 months ago

schewara commented 1 year ago

Component

server

Describe the bug

After upgrading the server from v0.15 to v1.0 woodpecker only returns 404 and the following entries are continuously written to the log output

{
  "level":"fatal",
  "error":"error to create migrations table: pq: relation \"UQE_migrations_name\" already exists",
  "time":"2023-07-26T22:49:03Z",
  "message":"could not migrate datastore"
}

No other changes were made besides the update of the docker image.

System Info

- docker-compose version - 1.29.2
- Docker Engine version - 23.0.5

Containers:
- docker.io/woodpeckerci/woodpecker-server:v1.0
- registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.6-2 (PG-14)

ENV Settings:

WOODPECKER_OPEN=True
WOODPECKER_SERVER_ADDR=0.0.0.0:8000
WOODPECKER_GRPC_ADDR=0.0.0.0:9000
WOODPECKER_ADMIN=admin
WOODPECKER_DOCKER_CONFIG=
WOODPECKER_LOG_LEVEL=info
WOODPECKER_HOST=https://woodpecker.my.tld
WOODPECKER_SESSION_EXPIRES=24h
WOODPECKER_AGENT_SECRET=xxxxxx
WOODPECKER_DATABASE_DRIVER=postgres
WOODPECKER_DATABASE_DATASOURCE=postgres://xxxxxx:xxxxxx@db/woodpecker?sslmode=disable&application_name=woodpecker
WOODPECKER_GITEA=True
WOODPECKER_GITEA_URL=https://forgejo.my.tld
WOODPECKER_GITEA_CLIENT=xxxxxx
WOODPECKER_GITEA_SECRET=xxxxxx

Additional context

trace log:

{
  "time":"2023-07-26T23:22:17Z",
  "caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/server.go:82",
  "message":"LogLevel = trace"
}
{
  "level":"trace",
  "time":"2023-07-26T23:22:17Z",
  "caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/setup.go:232",
  "message":"Forge (gitea) opts: gitea.Opts{URL:\"https://forgejo.my.tld\", Client:\"xxxxxx\", Secret:\"xxxxxx\", SkipVerify:false}"
}
{
  "level":"trace",
  "time":"2023-07-26T23:22:17Z",
  "caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/setup.go:87",
  "message":"setup datastore: store.Opts{Driver:\"postgres\", Config:\"postgres://xxxxxx:xxxxxx@db/woodpecker?sslmode=disable&application_name=woodpecker\", XORM:store.XORM{Log:false, ShowSQL:false}}"
}
{
  "level":"fatal",
  "error":"error to create migrations table: pq: relation \"UQE_migrations_name\" already exists",
  "time":"2023-07-26T23:22:17Z",
  "caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/setup.go:94",
  "message":"could not migrate datastore"
}

Validations


upstream issue: https://gitea.com/xorm/xorm/issues/2317

6543 commented 1 year ago

thanks for testing and the trace logs :)

6543 commented 1 year ago

@schewara can you exec the sql: select * from migrations; command :)

it tells us what migrations did already run

PS: a goodie would be if you could tell us the current schema of the migrations table too

schewara commented 1 year ago

@6543 Sorry, should have added that already in the initial report.

From further digging, the problem seems to be schema related. In our setup every database has the $user schema set as default.

woodpecker=> \dn
     List of schemas
    Name    |   Owner    
------------+------------
 public     | postgres
 woodpecker | woodpecker
(2 rows)

woodpecker=> show search_path;
   search_path   
-----------------
 "$user", public
(1 row)

woodpecker=> \di
                               List of relations
   Schema   |             Name              | Type  |   Owner    |    Table     
------------+-------------------------------+-------+------------+--------------
 woodpecker | UQE_migrations_name           | index | woodpecker | migrations

But as one can see in the extended logs below the SQL statements are explicitely targeting the public schema and the Index can not be created, as there is already an existing one for the woodpecker schema.

So if someone uses the PostgreSQL secure schema usage pattern this currently seems to be broken.

updated logs with activated

{
  "level": "trace",
  "time": "2023-07-27T07:32:08Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/setup.go:87",
  "message": "setup datastore: store.Opts{Driver:\"postgres\", Config:\"postgres://xxxxxx:xxxxxx@db/woodpecker?sslmode=disable&application_name=woodpecker\", XORM:store.XORM{Log:true, ShowSQL:true}}"
}
{
  "level": "info",
  "component": "xorm",
  "time": "2023-07-27T07:32:09Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/store/datastore/xorm.go:77",
  "message": "[SQL] SELECT tablename FROM pg_tables WHERE schemaname = $1 [public] - 13.290478ms"
}
{
  "level": "info",
  "component": "xorm",
  "time": "2023-07-27T07:32:09Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/store/datastore/xorm.go:77",
  "message": "[SQL] SELECT column_name, column_default, is_nullable, data_type, character_maximum_length, description,\n    CASE WHEN p.contype = 'p' THEN true ELSE false END AS primarykey,\n    CASE WHEN p.contype = 'u' THEN true ELSE false END AS uniquekey\nFROM pg_attribute f\n    JOIN pg_class c ON c.oid = f.attrelid JOIN pg_type t ON t.oid = f.atttypid\n    LEFT JOIN pg_attrdef d ON d.adrelid = c.oid AND d.adnum = f.attnum\n    LEFT JOIN pg_description de ON f.attrelid=de.objoid AND f.attnum=de.objsubid\n    LEFT JOIN pg_namespace n ON n.oid = c.relnamespace\n    LEFT JOIN pg_constraint p ON p.conrelid = c.oid AND f.attnum = ANY (p.conkey)\n    LEFT JOIN pg_class AS g ON p.confrelid = g.oid\n    LEFT JOIN INFORMATION_SCHEMA.COLUMNS s ON s.column_name=f.attname AND c.relname=s.table_name\nWHERE n.nspname= s.table_schema AND c.relkind = 'r'::char AND c.relname = $1 AND s.table_schema = $2 AND f.attnum > 0 ORDER BY f.attnum; [migrations public] - 12.311277ms"
}
{
  "level": "info",
  "component": "xorm",
  "time": "2023-07-27T07:32:09Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/store/datastore/xorm.go:77",
  "message": "[SQL] SELECT indexname, indexdef FROM pg_indexes WHERE tablename=$1 AND schemaname=$2 [migrations public] - 1.703189ms"
}
{
  "level": "info",
  "component": "xorm",
  "time": "2023-07-27T07:32:09Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/store/datastore/xorm.go:77",
  "message": "[SQL] CREATE UNIQUE INDEX \"UQE_migrations_name\" ON \"migrations\" (\"name\") [] - 610.84µs"
}
{
  "level": "fatal",
  "error": "error to create migrations table: pq: relation \"UQE_migrations_name\" already exists",
  "time": "2023-07-27T07:32:09Z",
  "caller": "/woodpecker/src/github.com/woodpecker-ci/woodpecker/cmd/server/setup.go:94",
  "message": "could not migrate datastore"
}

Database table infos:

woodpecker=> \d woodpecker.migrations
                  Table "woodpecker.migrations"
 Column |          Type          | Collation | Nullable | Default 
--------+------------------------+-----------+----------+---------
 name   | character varying(255) |           |          | 
Indexes:
    "UQE_migrations_name" UNIQUE, btree (name)

woodpecker=> select * from woodpecker.migrations;
                 name                 
--------------------------------------
 xorm
 alter-table-drop-repo-fallback
 drop-allow-push-tags-deploys-columns
 fix-pr-secret-event-name
 alter-table-drop-counter
 alter-table-logs-update-type-of-data
(6 rows)
woodpecker=> \d public.migrations
                    Table "public.migrations"
 Column |          Type          | Collation | Nullable | Default 
--------+------------------------+-----------+----------+---------
 name   | character varying(255) |           |          | 

woodpecker=> select * from public.migrations;
 name 
------
(0 rows)
6543 commented 1 year ago

looking at gitea who also use xorm they do have a workaround:

https://github.com/go-gitea/gitea/blob/55532061c83d38d33ef48bdc5eeac0f652844e8a/models/db/engine.go#L104-L120 https://github.com/go-gitea/gitea/blob/55532061c83d38d33ef48bdc5eeac0f652844e8a/models/db/sql_postgres_with_schema.go#L19-L24

...

but it should be solved upstream once and for all :/

6543 commented 1 year ago

upstream issue: https://gitea.com/xorm/xorm/issues/2317

this is not something we can just hotfix for v1.0.0 ... without also introduce much new code that can have bugs ... so I move that onto the next milestone :/

6543 commented 1 year ago

@schewara where in the connection string is the schema introduced ?

https://gitea.com/xorm/xorm/issues/2317#issuecomment-746346

schewara commented 1 year ago

@6543 based on https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS there is no parameter available for setting any schema as part of the uri. We mainly use it to differentiate in the pg_stat_activity, as well as in the logs, which app specific events originated from and should have, to my understanding, no relation to any schema, as you can set it to whatever you like.

I will give it another spin with some random name, as well as not setting it at all and see if the behavior changes.

But based on other setups, where 2 apps are accessing the same database with different application names I do not expect to see any differences.

schewara commented 1 year ago

Quick update from my end:

I tested now the following scenarios without success

  1. set application_name to something random
  2. remove application_name from the connection string
  3. set the search_path explicitly to "$user" in postgresql.conf

All 3 variations and combinations are still targeting the public schema

"caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/store/datastore/xorm.go:77",
"message":"[SQL] SELECT tablename FROM pg_tables WHERE schemaname = $1 [public] - 8.790704ms"

As we are using the postgres containers from Crunchy Data, the database initialization is slightly different as the one in the official postgres image

The initialization of the crunchy containers happens as part of

But in the end the same behavior can be exactly reproduced by creating an additional user and database in any existing postgres setup with the following statements, replacing

with custom values and connecting with that user to the database.

CREATE USER "PG_USER" LOGIN;
ALTER USER "PG_USER" PASSWORD 'PG_PASSWORD';

CREATE DATABASE "PG_DATABASE";
GRANT ALL PRIVILEGES ON DATABASE "PG_DATABASE" TO "PG_USER";

CREATE SCHEMA IF NOT EXISTS "PG_USER";

By default the search_path should normally be "$user", public and

@6543 I also had a look at the upstream Issue and the information

where application_name=woodpecker sets the schema

is incorrect, as the application_name has no influence on the used schema. Which schema is used, is based on the search_path setting and it's order unless a qualified name (e.g. schema.table) is used.

schewara commented 1 year ago

Here a full example to reproduce the issue with the standard postgres container

  1. setup DB container

    docker network create woodpecker-test
    docker run --rm --network woodpecker-test -e POSTGRES_USER=dummyuser -e POSTGRES_PASSWORD=dummypass -e POSTGRES_DB=dummydb --name wp-test-db postgres
  2. (in a new terminal) open a psql session on the newly created database

    docker exec -it wp-test-db psql dummydb dummyuser
  3. execute the following SQL statements

    --Create a new user, database and schema
    CREATE USER "testuser" LOGIN;
    ALTER USER "testuser" PASSWORD 'testpass';
    CREATE DATABASE "testdb";
    GRANT ALL PRIVILEGES ON DATABASE "testdb" TO "testuser";
    CREATE SCHEMA IF NOT EXISTS "testuser";
    --create the schema for the existing user:
    CREATE SCHEMA IF NOT EXISTS "dummyuser";
  4. create an env file woodpecker.env with some settings

    WOODPECKER_OPEN=True
    WOODPECKER_SERVER_ADDR=0.0.0.0:8000
    WOODPECKER_GRPC_ADDR=0.0.0.0:9000
    WOODPECKER_ADMIN=admin
    WOODPECKER_DOCKER_CONFIG=
    WOODPECKER_LOG_LEVEL=trace
    WOODPECKER_HOST=https://woodpecker.my.tld
    WOODPECKER_SESSION_EXPIRES=24h
    WOODPECKER_AGENT_SECRET=xxxxxx
    WOODPECKER_DATABASE_DRIVER=postgres
    WOODPECKER_GITEA=True
    WOODPECKER_GITEA_URL=https://forgejo.my.tld
    WOODPECKER_GITEA_CLIENT=xxxxxx
    WOODPECKER_GITEA_SECRET=xxxxxx
    
    WOODPECKER_LOG_XORM=true
    WOODPECKER_LOG_XORM_SQL=true
  5. run the woodpecker server with the initial database user

    docker run --rm --network woodpecker-test --env-file woodpecker.env -e WOODPECKER_DATABASE_DATASOURCE=postgres://dummyuser:dummypass@wp-test-db/dummydb?sslmode=disable woodpeckerci/woodpecker-server:latest

    This will execute without errors and inspecting the database will show that everything is created in the public schema, even though they should have been created in the users schema.

    Performing a CREATE TABLE 'test-table' (key int); correctly creates a dummyuser.test-table table.

  6. run woodpecker-server against the other database with the other user

    docker run --rm --network woodpecker-test --env-file woodpecker.env -e WOODPECKER_DATABASE_DATASOURCE=postgres://testuser:testpass@wp-test-db/testdb?sslmode=disable woodpeckerci/woodpecker-server:latest

    This fails due to the more restricitve settings of the user previously created.

  7. cleanup the no longer network network
    docker network rm woodpecker-test

Performing the same steps with the image tag v0.1.5 behaves differently, where most of the tables are created in the correct schema, but it still fails with a "permision denied" error, when trying to create the migrations table in the public schema.

6543 commented 8 months ago

I close this issue as code changed a loot now - if similar issue ocure, just create a new issue and link to this one