trustification / trustify

CSAF upload: duplicate key value violates unique constraint "package_type_namespace_name_key" #1036

Open helio-frota opened 3 days ago

helio-frota commented 3 days ago

This is happening on the current main branch.

Steps:

(I got those files from an older version, using the command cargo run --bin xtask generate-dump -w ....)

➜  trustify git:(main) ✗ cat .trustify/data/start.log
2024-11-21 13:22:55.767 -03 [177974] LOG:  starting PostgreSQL 16.3 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-11-21 13:22:55.767 -03 [177974] LOG:  listening on IPv6 address "::1", port 40575
2024-11-21 13:22:55.767 -03 [177974] LOG:  listening on IPv4 address "127.0.0.1", port 40575
2024-11-21 13:22:55.767 -03 [177974] LOG:  listening on Unix socket "/tmp/.s.PGSQL.40575"
2024-11-21 13:22:55.769 -03 [177977] LOG:  database system was shut down at 2024-11-21 13:22:52 -03
2024-11-21 13:22:55.771 -03 [177974] LOG:  database system is ready to accept connections
2024-11-21 13:24:36.553 -03 [178033] ERROR:  duplicate key value violates unique constraint "package_type_namespace_name_key"
2024-11-21 13:24:36.553 -03 [178033] DETAIL:  Key (type, namespace, name)=(rpm, redhat, mediawiki) already exists.
2024-11-21 13:24:36.553 -03 [178033] STATEMENT:  INSERT INTO "base_purl" ("id", "type", "namespace", "name") VALUES ($1, $2, $3, $4) ON CONFLICT ("id") DO NOTHING RETURNING "id"
2024-11-21 13:24:36.553 -03 [178031] ERROR:  duplicate key value violates unique constraint "package_type_namespace_name_key"
2024-11-21 13:24:36.553 -03 [178031] DETAIL:  Key (type, namespace, name)=(rpm, redhat, mediawiki) already exists.
2024-11-21 13:24:36.553 -03 [178031] STATEMENT:  INSERT INTO "base_purl" ("id", "type", "namespace", "name") VALUES ($1, $2, $3, $4) ON CONFLICT ("id") DO NOTHING RETURNING "id"
2024-11-21 13:24:36.553 -03 [178034] ERROR:  duplicate key value violates unique constraint "package_type_namespace_name_key"
2024-11-21 13:24:36.553 -03 [178034] DETAIL:  Key (type, namespace, name)=(rpm, redhat, mediawiki) already exists.
2024-11-21 13:24:36.553 -03 [178034] STATEMENT:  INSERT INTO "base_purl" ("id", "type", "namespace", "name") VALUES ($1, $2, $3, $4) ON CONFLICT ("id") DO NOTHING RETURNING "id"

ctron commented 3 days ago

Does this cause any errors in the upload process?

helio-frota commented 3 days ago

yes

[Screenshot: 2024-11-22_07-01]

And it is random, influenced by concurrency (I still don't know which part of the code is causing it). When I use this test to 'simulate' the browser upload with multiple files, selecting only the files I shared in the previous comment, I see roughly 80% pass and 20% error locally; via the browser it is the opposite, roughly 20% pass and 80% error. (Not exactly 80-20, of course; I'm just describing the behavior.)

ctron commented 3 days ago

Ah, I see. OK, then I'd suggest checking whether we can turn this into an "upsert": "insert … on conflict … do nothing".

helio-frota commented 3 days ago

Yeah, thanks, but I already tried this with no success: the table has two unique constraints, and the insert currently uses package_pkey (id) as the conflict target.

rsql> .describe base_purl
  Column   |    Type     | Not null | Default
-----------+-------------+----------+---------
 id        | uuid        | No       |
 timestamp | timestamptz | Yes      | now()
 type      | varchar     | No       |
 namespace | varchar     | Yes      |
 name      | varchar     | No       |

Indexes
              Index              |        Columns        | Unique
---------------------------------+-----------------------+--------
 base_purl_id_idx                | id                    | No
 basepurlnameginidx              | name                  | No
 basepurlnamespaceginidx         | namespace             | No
 basepurltypeginidx              | type                  | No
 package_pkey                    | id                    | Yes
 package_type_namespace_name_key | namespace, name, type | Yes

Unless I'm doing something wrong with sea-orm, whenever I change anything related to on_conflict, the insert still fails on the other constraint, package_type_namespace_name_key.
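
To make the two-constraint behavior concrete, here is a minimal sketch that reproduces it outside of trustify. Everything in it is an assumption for illustration: it uses an in-memory SQLite database from Python's standard library as a stand-in for PostgreSQL (requires SQLite 3.24 or newer), a simplified copy of the base_purl schema, made-up id values, and hypothetical helper names (try_insert, demo). It is not the sea-orm code path. It only shows that a conflict target on id does not cover a violation of the (type, namespace, name) constraint, while targeting that constraint, or omitting the target entirely, does.

```python
import sqlite3

# Simplified stand-in for base_purl: a primary key on id plus a second
# unique constraint on (type, namespace, name), as in the table above.
SCHEMA = """
CREATE TABLE base_purl (
    id        TEXT PRIMARY KEY,
    type      TEXT NOT NULL,
    namespace TEXT,
    name      TEXT NOT NULL,
    UNIQUE (type, namespace, name)
)
"""

INSERT = "INSERT INTO base_purl (id, type, namespace, name) VALUES (?, ?, ?, ?) "


def try_insert(conn, on_conflict, row):
    """Run the insert with the given ON CONFLICT clause and report the outcome."""
    try:
        conn.execute(INSERT + on_conflict, row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"error: {exc}"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.execute(INSERT, ("id-1", "rpm", "redhat", "mediawiki"))

    # Same (type, namespace, name) as the existing row, different id
    # (hypothetical values chosen to trigger only the second constraint).
    dup = ("id-2", "rpm", "redhat", "mediawiki")
    results = {
        # Target covers only the primary key: the other constraint still fails.
        "target_id": try_insert(conn, "ON CONFLICT (id) DO NOTHING", dup),
        # Target names the violated constraint: the insert is skipped.
        "target_triple": try_insert(
            conn, "ON CONFLICT (type, namespace, name) DO NOTHING", dup
        ),
        # No target: any uniqueness conflict causes the insert to be skipped.
        "no_target": try_insert(conn, "ON CONFLICT DO NOTHING", dup),
    }
    results["rows"] = conn.execute("SELECT COUNT(*) FROM base_purl").fetchone()[0]
    conn.close()
    return results


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, "->", value)
```

PostgreSQL likewise allows ON CONFLICT DO NOTHING without a conflict target, in which case conflicts with all usable unique constraints are handled. Note that when the insert is skipped, a RETURNING clause yields no row, so the caller would need a follow-up SELECT to obtain the existing id.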