afaik your version has this trouble when you upgrade from previous versions. Look for it - it has been discussed already somewhere in https://github.com/goharbor/
also you can tune maxIdleConns & maxOpenConns for harbor-harbor-core; the defaults are 50 & 1000, my working set is 15 & 175 respectively
Hi, @phoinixgrr how many concurrencies do you use to pull/push the images? Are they pulling/pushing the images in the same project?
> afaik your version has this trouble when you upgrade from previous versions. Look for it - it has been discussed already somewhere in https://github.com/goharbor/
> also you can tune maxIdleConns & maxOpenConns for harbor-harbor-core; the defaults are 50 & 1000, my working set is 15 & 175 respectively
Hi @paulliss, thanks for replying.
So, there is an issue with migrating the database when updating from older versions? And this can cause this performance regression / 5xx on the Harbor side? Could you point me to the issue? I seem to have trouble finding it :(
We've opted not to re-use open connections (dropping connection pooling) with:
- name: POSTGRESQL_MAX_IDLE_CONNS
  value: "-1"
- name: POSTGRESQL_MAX_OPEN_CONNS
  value: "-1"
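For reference, on a Kubernetes install these can be applied directly to the core deployment. A minimal sketch, assuming the deployment is named harbor-harbor-core and lives in a harbor namespace (adjust to your release):
# Assumed names: namespace "harbor", deployment "harbor-harbor-core"
kubectl -n harbor set env deployment/harbor-harbor-core \
  POSTGRESQL_MAX_IDLE_CONNS="-1" \
  POSTGRESQL_MAX_OPEN_CONNS="-1"
# Confirm the values landed
kubectl -n harbor set env deployment/harbor-harbor-core --list | grep POSTGRESQL_MAX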
> Hi, @phoinixgrr how many concurrencies do you use to pull/push the images? Are they pulling/pushing the images in the same project?
Hey @heww, thanks for stepping into this. We are using Docker to perform pushes and pulls. The defaults are in use (3 concurrent connections for pulling, and 5 concurrent connections for pushing):
--max-concurrent-downloads int Set the max concurrent downloads for each pull (default 3)
--max-concurrent-uploads int Set the max concurrent uploads for each push (default 5)
Yes, most of our images are under the same project (95%+).
Hey, @phoinixgrr. I'll try to find the link. One more question: what kind of storage do you use for your images/charts?
> Hey, @phoinixgrr. I'll try to find the link. One more question: what kind of storage do you use for your images/charts?
Object storage. Amazon S3 @paulliss
@phoinixgrr here is the link: https://github.com/goharbor/harbor/issues/14358 But I made a mistake, it is all about the upgrade to 2.2.0 :(
Does the service harbor-harbor-core have any restarts?
> @phoinixgrr here is the link: #14358 But I made a mistake, it is all about the upgrade to 2.2.0 :(
> Does the service harbor-harbor-core have any restarts?
No restarts under harbor-core
@paulliss @heww Since we are not using Project Quotas, is it possible to disable those queries in order to avoid this extreme locking from taking place?
SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE
SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE
SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE
@phoinixgrr any news?
Just remembered that I was in such trouble when there were not enough IOPS, because of my SATA disks
Hello, if you do not use the quota feature, you can disable it from the API to mitigate your errors; look at the Swagger API.
> @phoinixgrr any news?
> Just remembered that I was in such trouble when there were not enough IOPS, because of my SATA disks
@paulliss There is no observed IOWAIT on any of the subsystems, either on Harbor itself or on the RDS database. Storage is performant.
Other than that, things are not looking good. When this extreme locking on the database occurs, Harbor starts throwing 5xx errors on image pushes. The following query has been identified as producing the locking:
SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE
> Hello, if you do not use the quota feature, you can disable it from the API to mitigate your errors; look at the Swagger API.
@aladjadj Quotas are reported as disabled (-1) from the API:
[
{
"id": 100,
"ref": {
"id": 464,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-03-19T10:14:01.046001Z",
"update_time": "2021-03-19T10:21:04.23482Z",
"hard": {
"storage": -1
},
"used": {
"storage": 5607868043
}
},
{
"id": 67,
"ref": {
"id": 167,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-03-03T11:03:59.948701Z",
"update_time": "2021-03-24T09:33:17.246179Z",
"hard": {
"storage": -1
},
"used": {
"storage": 80959601
}
},
{
"id": 45,
"ref": {
"id": 56,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-01-27T14:10:31.390346Z",
"update_time": "2021-03-17T09:22:06.392402Z",
"hard": {
"storage": -1
},
"used": {
"storage": 998161523
}
},
{
"id": 44,
"ref": {
"id": 55,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-01-27T14:09:10.222079Z",
"update_time": "2021-03-17T09:22:06.365389Z",
"hard": {
"storage": -1
},
"used": {
"storage": 0
}
},
{
"id": 41,
"ref": {
"id": 43,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-01-20T11:05:34.104598Z",
"update_time": "2021-03-17T09:22:06.349697Z",
"hard": {
"storage": -1
},
"used": {
"storage": 53979915
}
},
{
"id": 37,
"ref": {
"id": 37,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-01-19T12:27:09.413684Z",
"update_time": "2021-06-07T07:57:46.678234Z",
"hard": {
"storage": -1
},
"used": {
"storage": 4569325322201
}
},
{
"id": 36,
"ref": {
"id": 36,
"name": "------",
"owner_name": "admin"
},
"creation_time": "2021-01-19T12:01:19.862108Z",
"update_time": "2021-04-20T06:46:28.082002Z",
"hard": {
"storage": -1
},
"used": {
"storage": 119763371
}
}
]
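(For reference, the listing above should be reproducible with the quotas API; the endpoint path and paging parameters below are assumed from the Harbor v2.0 API, and the host/credentials mirror the example further down:)
curl -sk -u admin:Harbor12345 "https://host/api/v2.0/quotas?page=1&page_size=100"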
Hi @phoinixgrr, -1 is not used to disable the quota.
Please call the /api/internal/switchquota API as admin to disable the quota feature in Harbor:
curl -vkL -X PUT -u admin:Harbor12345 -H "Content-Type: application/json" -d '{"Enabled": false}' https://host/api/internal/switchquota
@heww Calling the API endpoint /api/internal/switchquota actually fixed our extreme locking! Awesome. It seems to be working as a workaround in our case.
Question: should this indicate an issue with the way Harbor manages the database?
Since then, we've performed a cleanup on our largest project by deleting all untagged artifacts (using a TAG RETENTION rule), and we now see a new kind of HTTP.500 issue thrown when pushes happen.
We've managed to drill down the logs to identify this new issue.
The Docker client gets an HTTP 500 Internal Server Error from Harbor when PUSHING:
Jun 8 07:28:25 ip-10-30-2-156 dockerd[541]: level=info msg="Attempting next endpoint for push after error: received unexpected HTTP status: 500 Internal Server Error"
Nginx-ingress confirms the 500 thrown by Harbor:
2021-06-08 07:28:25 192.168.5.196 - - [08/Jun/2021:07:28:25 +0000] "PUT /v2/beat/[....] HTTP/1.1" 500 66 "-" "docker/[...] git-commit/[...]-aws os/linux arch/amd64 UpstreamClient(Docker-Client[...])" 6844 0.460 [harbor-harbor-core-http] [] 100.97.81.117:8080 66 0.460 500 95ef3bdd3ff37311192eb8eea8bf9927
Harbor logs suggest database-related issues:
2021-06-08T07:28:25Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
PostgreSQL logs show duplicate key / unique constraint violations:
2021-06-08 07:28:25 UTC:10.1.20.13(16391):harbor@core_db:[2307]:ERROR: duplicate key value violates unique constraint "unique_artifact"
2021-06-08 07:28:25 UTC:10.1.20.13(16391):harbor@core_db:[2307]:DETAIL: Key (repository_id, digest)=(311, sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76) already exists.
2021-06-08 07:28:25 UTC:10.1.20.13(16391):harbor@core_db:[2307]:STATEMENT: INSERT INTO "artifact" ("type", "media_type", "manifest_media_type", "project_id", "repository_id", "repository_name", "digest", "size", "icon", "push_time", "pull_time", "extra_attrs", "annotations") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13) RETURNING "id"
2021-06-08 07:28:25 UTC:10.1.20.13(16391):harbor@core_db:[2307]:ERROR: current transaction is aborted, commands ignored until end of transaction block
In conclusion, we've managed to work around our main issue concerning the extreme database locking.
Unfortunately, it seems the cleanup performed on the untagged artifacts has introduced new issues with the database, related to duplicate keys (unique_artifact).
According to your logs, can you confirm whether there were pushing actions while deleting all untagged artifacts?
Could you provide the count of the records in the quota table?
Taking about 13985.656ms to execute a SELECT FOR UPDATE SQL statement is not normal. Can you confirm that there isn't too much pushing for your Harbor instance?
How much CPU and memory was Postgres using when you got this problem?
Hey @heww, thanks for your input.
> According to your logs, can you confirm whether there were pushing actions while deleting all untagged artifacts?
During the cleanup performed yesterday, which lasted 337 min, there were pushing actions. But in today's specific logs that we've shared (2021-06-08 07:28:25 UTC) there was no cleanup action ongoing.
> Could you provide the count of the records in the quota table?
core_db=> select COUNT(*) from quota;
count
-------
7
(1 row)
> Taking about 13985.656ms to execute a SELECT FOR UPDATE SQL statement is not normal. Can you confirm that there isn't too much pushing for your Harbor instance?
Nginx-Ingress HTTP/PUT requests forwarded to harbor, last 24h:
Nginx-Ingress HTTP/PUT requests forwarded to harbor, last 7d:
> How much CPU and memory was Postgres using when you got this problem?
The problem is observed when we get > 250-300 locked tables, when Harbor runs SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE on the database:
DB CPU/MEM usage, at the same time window as the above graph, seems nominal:
This is a 16 vCPU / 122 GB RAM RDS DB instance, dedicated to Harbor.
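To reproduce the lock count we graph above without the RDS console, a rough sketch against pg_locks (again assuming direct psql access; DATABASE_URL is a placeholder):
# Count currently held/requested locks per relation and lock mode
psql "$DATABASE_URL" -c "
SELECT relation::regclass AS locked_relation, mode, count(*)
FROM pg_locks
WHERE relation IS NOT NULL
GROUP BY 1, 2
ORDER BY count(*) DESC
LIMIT 10;"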
Just to make sure we are on the same page, the locking issue is now mitigated with https://github.com/goharbor/harbor/issues/15048#issuecomment-855767153 as proposed.
We now have a new issue concerning the database, with duplicate keys (unique_artifact), that seems to have come up after the cleanup performed on untagged artifacts. Would you advise opening a new issue about this?
@phoinixgrr For the duplicate key issue, is it happening for all image pushes or just for some specific images?
Could you run the following SQL and provide us the output:
select * from repository where repository_id= 311;
select * from repository where name = (select name from repository where repository_id=311);
select * from artifact where repository_name=(select name from repository where repository_id=311) and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
select * from artifact where repository_id=311 and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
BTW, is your Harbor instance a fresh installation or upgrade from an old version?
Hey @ywk253100, thanks for stepping in.
> @phoinixgrr For the duplicate key issue, is it happening for all image pushes or just for some specific images?
This seems to be completely random. Moreover, re-running the same push succeeds.
It seems to have started after cleaning up all untagged artifacts from our biggest project (2021-06-07):
Occurrences in the latest 30 days:
2021-06-09T07:24:31Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-09T02:27:24Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T12:24:23Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T12:21:29Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T07:28:25Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T07:17:06Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T06:18:49Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
2021-06-08T06:18:49Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
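(For a rough daily count of these failures we can grep the core logs; the namespace and deployment name here are assumptions for our setup, and with multiple replicas this only samples one pod:)
kubectl -n harbor logs deploy/harbor-harbor-core --since=24h \
  | grep -c 'current transaction is aborted'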
> Could you run the following SQL and provide us the output:
> select * from repository where repository_id=311;
> select * from repository where name = (select name from repository where repository_id=311);
> select * from artifact where repository_name=(select name from repository where repository_id=311) and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
> select * from artifact where repository_id=311 and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
core_db=> select * from repository where repository_id=311;
 repository_id |    name    | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+------------+------------+-------------+------------+------------+----------------------------+---------------------------
           311 | beat/[...] |         37 |             |     142784 |          0 | 2021-01-26 10:38:04.354809 | 2021-06-09 08:37:33.40615
(1 row)

core_db=> select * from repository where name = (select name from repository where repository_id=311);
 repository_id |    name    | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+------------+------------+-------------+------------+------------+----------------------------+---------------------------
           311 | beat/[...] |         37 |             |     142784 |          0 | 2021-01-26 10:38:04.354809 | 2021-06-09 08:37:33.40615
(1 row)

core_db=> select * from artifact where repository_name=(select name from repository where repository_id=311) and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
 id | project_id | repository_name | digest | type | pull_time | push_time | repository_id | media_type | manifest_media_type | size | extra_attrs | annotations | icon
 541303 | 37 | beat/[...] | sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76 | IMAGE | 2021-06-08 07:29:17.367729 | 2021-06-08 07:28:25.469699 | 311 | application/vnd.docker.container.image.v1+json | application/vnd.docker.distribution.manifest.v2+json | 233497922 | {[....]"StopSignal":"SIGQUIT"},"created":"2021-06-08T07:28:15.044442783Z","os":"linux"} | |
(1 row)

core_db=> select * from artifact where repository_id=311 and digest='sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76';
 id | project_id | repository_name | digest | type | pull_time | push_time | repository_id | media_type | manifest_media_type | size | extra_attrs | annotations | icon
 541303 | 37 | beat/[...] | sha256:1c45e91bc8eefd723d37e3ea7def56de111d617349f47abc89181243ba060c76 | IMAGE | 2021-06-08 07:29:17.367729 | 2021-06-08 07:28:25.469699 | 311 | application/vnd.docker.container.image.v1+json | application/vnd.docker.distribution.manifest.v2+json | 233497922 | {[...]"StopSignal":"SIGQUIT"},"created":"2021-06-08T07:28:15.044442783Z","os":"linux"} | |
(1 row)
> BTW, is your Harbor instance a fresh installation or upgrade from an old version?
This is an upgrade from an old version. Specifically, this is the upgrade path `2.1.3 > 2.2.0 > 2.2.1 > 2.2.2`
@phoinixgrr Is there concurrent pushing in your env, such as triggered by CI/CD?
> @phoinixgrr Is there concurrent pushing in your env, such as triggered by CI/CD?
@ywk253100 Yes, there is. 95% of the pushes are made by CI/CD. We are continuously testing/building/deploying, so there is a high likelihood of concurrent image pushes. Moreover, the defaults for the Docker client concurrency settings are in effect:
--max-concurrent-downloads int Set the max concurrent downloads for each pull (default 3)
--max-concurrent-uploads int Set the max concurrent uploads for each push (default 5)
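For completeness, the same limits can be pinned in the daemon configuration instead of per-invocation flags. A sketch using the standard dockerd daemon.json keys (this overwrites any existing file, so merge by hand if you already have one):
# Persist the client concurrency defaults and restart the daemon
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "max-concurrent-downloads": 3,
  "max-concurrent-uploads": 5
}
EOF
sudo systemctl restart docker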
Today's HTTP.500 occurrence:
DockerClientLog:
+ docker push registry.priv[....]/beat/[....]
The push refers to repository [registry.priv[....]/beat/[...]]
449af51d3268: Preparing
44ee68603e8e: Preparing
44ee68603e8e: Layer already exists
449af51d3268: Pushed
received unexpected HTTP status: 500 Internal Server Error
NginxIngress Log (client received HTTP 500 from harbor during image push):
2021-06-09 07:24:31 192.168.5.23 - - [09/Jun/2021:07:24:31 +0000] "PUT /v2/beat/[...] HTTP/1.1" 500 66 "-" "docker[...] go/go[...] git-commit/[...] \x5C(linux\x5C))" 2074 0.488 [harbor-harbor-core-http] [] 100.97.80.156:8080 66 0.488 500 8f0a161a4dfc24e4b28e5e426ad8842e
HarborLog:
2021-06-09T07:24:31Z [ERROR] [/lib/http/error.go:54]: {"errors":[{"code":"UNKNOWN","message":"unknown: pq: current transaction is aborted, commands ignored until end of transaction block"}]}
PostgresLog:
1623223471000,"2021-06-09 07:24:31 UTC:10.1.15.167(45471):harbor@core_db:[21482]:ERROR: duplicate key value violates unique constraint ""unique_artifact""",mgmt-harbor-serverless-prod.1
1623223471000,"2021-06-09 07:24:31 UTC:10.1.15.167(45471):harbor@core_db:[21482]:DETAIL: Key (repository_id, digest)=(1955, sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1) already exists.",mgmt-harbor-serverless-prod.1
1623223471000,"2021-06-09 07:24:31 UTC:10.1.15.167(45471):harbor@core_db:[21482]:STATEMENT: INSERT INTO ""artifact"" (""type"", ""media_type"", ""manifest_media_type"", ""project_id"", ""repository_id"", ""repository_name"", ""digest"", ""size"", ""icon"", ""push_time"", ""pull_time"", ""extra_attrs"", ""annotations"") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13) RETURNING ""id""",mgmt-harbor-serverless-prod.1
1623223471000,"2021-06-09 07:24:31 UTC:10.1.15.167(45471):harbor@core_db:[21482]:ERROR: current transaction is aborted, commands ignored until end of transaction block",mgmt-harbor-serverless-prod.1
1623223471000,"2021-06-09 07:24:31 UTC:10.1.15.167(45471):harbor@core_db:[21482]:STATEMENT: SELECT T0.""id"", T0.""type"", T0.""media_type"", T0.""manifest_media_type"", T0.""project_id"", T0.""repository_id"", T0.""repository_name"", T0.""digest"", T0.""size"", T0.""icon"", T0.""push_time"", T0.""pull_time"", T0.""extra_attrs"", T0.""annotations"" FROM ""artifact"" T0 WHERE T0.""digest"" = $1 AND T0.""repository_name"" = $2 ",mgmt-harbor-serverless-prod.1
1623223471000,2021-06-09 07:24:31 UTC:10.1.22.177(54505):harbor@core_db:[21473]:LOG: disconnection: session time: 0:00:00.557 user=harbor database=core_db host=10.1.22.177 port=54505,mgmt-harbor-serverless-prod.1
@phoinixgrr Could you check the Nginx logs to find out whether there was another identical API request PUT /v2/beat/[...] that got a 2xx response in the same period as the one that got the 500 error?
@ywk253100 So at the time of the HTTP.500 failure, there were a total of 5 image pushes under the same repo, with different tags, triggered by CI/CD. These pushes were 0 to 30 seconds apart.
1) 192.168.5.23 - - [09/Jun/2021:07:24:31 +0000] "PUT /v2/beat/repo_name/manifests/phobos HTTP/1.1" 500 66 "-" "docker/[...] go/go[...] git-commit/[...]" 2074 0.488 [harbor-harbor-core-http] [] 100.97.80.156:8080 66 0.488 500 8f0a161a4dfc24e4b28e5e426ad8842e
2) 192.168.5.23 - - [09/Jun/2021:07:24:31 +0000] "PUT /v2/beat/repo_name/manifests/deimos HTTP/1.1" 201 0 "-" "docker/[...] go/go[...] git-commit/[...]" 2074 0.561 [harbor-harbor-core-http] [] 100.97.81.251:8080 0 0.560 201 85070e11d9baf70eef7f563537bcd0b0
3) 192.168.3.168 - - [09/Jun/2021:07:24:19 +0000] "PUT /v2/beat/repo_name/manifests/kratos HTTP/1.1" 201 0 "-" "docker/[...] go/go[...] git-commit/[...]" 2075 0.451 [harbor-harbor-core-http] [] 100.97.80.245:8080 0 0.448 201 5aa3781f5aacfd784eb016a6ea9721f7
4) 192.168.5.41 - - [09/Jun/2021:07:24:16 +0000] "PUT /v2/beat/repo_name/manifests/thanos HTTP/1.1" 201 0 "-" "docker/[...] go/go[...] git-commit/[...]" 2074 0.454 [harbor-harbor-core-http] [] 100.97.80.245:8080 0 0.452 201 cf204480e80baac7faf10c7d74f3a45c
5) 192.168.5.37 - - [09/Jun/2021:07:24:09 +0000] "PUT /v2/beat/repo_name/manifests/zeus HTTP/1.1" 201 0 "-" "docker/[...] go/go[...] git-commit/[...]" 2074 0.529 [harbor-harbor-core-http] [] 100.97.81.251:8080 0 0.528 201 15af31bc1e4c196969ac0d373f4f5880
Pushes /v2/beat/repo_name/manifests/phobos and /v2/beat/repo_name/manifests/deimos seem to have happened in the same second, triggered by the same host.
Push /v2/beat/repo_name/manifests/deimos succeeded with an HTTP/201 reply from Harbor.
Push /v2/beat/repo_name/manifests/phobos failed with an HTTP/500 thrown by Harbor.
And here is the view from the Harbor GUI, showing all images updated with the same "Push Time", except "phobos", since it failed with HTTP/500.
The digests of phobos and deimos are not the same.
Does digest e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 have other tags?
Can you grep e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 in the logs?
> The digests of phobos and deimos are not the same. Does digest e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 have other tags? Can you grep e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 in the logs?
Hi @bitsf, thanks for pitching in.
The requested logging:
2021-06-09T07:25:08Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'ArtifactWebhook' on topic 'PULL_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator- OccurAt-2021-06-09 07:25:08
2021-06-09T07:25:08Z [DEBUG] [/controller/event/handler/webhook/artifact/artifact.go:76]: cannot find policy for PULL_ARTIFACT event: ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator- OccurAt-2021-06-09 07:25:08
2021-06-09T07:25:08Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'AuditLog' on topic 'PULL_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator- OccurAt-2021-06-09 07:25:08
2021-06-09T07:25:08Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'InternalArtifact' on topic 'PULL_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator- OccurAt-2021-06-09 07:25:08
2021-06-09T07:24:32Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'P2PPreheat' on topic 'PUSH_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09T07:24:31Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'Replication' on topic 'PUSH_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09T07:24:31Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'ArtifactWebhook' on topic 'PUSH_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09T07:24:31Z [DEBUG] [/controller/event/handler/webhook/artifact/artifact.go:76]: cannot find policy for PUSH_ARTIFACT event: ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09T07:24:31Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'AuditLog' on topic 'PUSH_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09T07:24:31Z [DEBUG] [/controller/event/handler/p2p/preheat.go:72]: preheat: artifact pushed beat/repo_name:[deimos]@sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1
2021-06-09T07:24:31Z [INFO] [/pkg/notifier/notifier.go:204]: Handle notification with Handler 'InternalArtifact' on topic 'PUSH_ARTIFACT': ID-543705, Repository-beat/repo_name Tags-[deimos] Digest-sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1 Operator-robot$registry OccurAt-2021-06-09 07:24:31
2021-06-09 07:24:31 time="2021-06-09T07:24:31.607906571Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/repositories/beat/repo_name/_manifests/tags/phobos/index/sha256/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/link")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=7ae27fd7-8747-4c6a-9e78-7b25b8c14cf2 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/phobos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=16.71554ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=1f4d0de6-1d05-47c2-bd6f-840482ac92a2 trace.line=110 vars.name="beat/repo_name" vars.reference=phobos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.59106436Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/repositories/beat/repo_name/_manifests/revisions/sha256/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/link")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=7ae27fd7-8747-4c6a-9e78-7b25b8c14cf2 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/phobos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=23.835078ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=0698f6d9-7332-437e-89ee-b7976974d765 trace.line=110 vars.name="beat/repo_name" vars.reference=phobos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.590042543Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/repositories/beat/repo_name/_manifests/tags/deimos/index/sha256/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/link")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=04ec8916-d743-4cd2-9f86-834468256266 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/deimos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=18.819055ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=f61897d7-0987-49b1-a5ac-433dff345d73 trace.line=110 vars.name="beat/repo_name" vars.reference=deimos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.570963693Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/repositories/beat/repo_name/_manifests/revisions/sha256/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/link")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=04ec8916-d743-4cd2-9f86-834468256266 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/deimos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=17.101312ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=31a124b2-51b0-4281-b695-c728335ff764 trace.line=110 vars.name="beat/repo_name" vars.reference=deimos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.567185219Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/blobs/sha256/e1/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/data")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=7ae27fd7-8747-4c6a-9e78-7b25b8c14cf2 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/phobos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=27.177579ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=f98e95c3-f9cd-41b8-98b3-2a77131ba099 trace.line=110 vars.name="beat/repo_name" vars.reference=phobos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.553802989Z" level=debug msg="s3aws.PutContent("/docker/registry/v2/blobs/sha256/e1/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/data")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=04ec8916-d743-4cd2-9f86-834468256266 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/deimos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=21.441213ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).PutContent" trace.id=5a079dfd-314a-41a9-8b2e-e60ea4f905f7 trace.line=110 vars.name="beat/repo_name" vars.reference=deimos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.539964795Z" level=debug msg="s3aws.Stat("/docker/registry/v2/blobs/sha256/e1/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/data")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=7ae27fd7-8747-4c6a-9e78-7b25b8c14cf2 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/phobos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=9.76884ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).Stat" trace.id=2f00905e-891b-4fbf-911c-f4a1adae1373 trace.line=155 vars.name="beat/repo_name" vars.reference=phobos
2021-06-09 07:24:31 time="2021-06-09T07:24:31.532295856Z" level=debug msg="s3aws.Stat("/docker/registry/v2/blobs/sha256/e1/e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1/data")" auth.user.name="harbor_registry_user" go.version=go[...] http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=registry.private[...] http.request.id=04ec8916-d743-4cd2-9f86-834468256266 http.request.method=PUT http.request.remoteaddr=192.168.5.23 http.request.uri="/v2/beat/repo_name/manifests/deimos" http.request.useragent="docker/[...] go/[...] git-commit/[...]-0ubuntu1~[...] kernel/[...] os/linux arch/amd64 UpstreamClient(Docker-Client/[...] \(linux\))" trace.duration=33.502592ms trace.file="/bitnami/blacksmith-sandox/docker-distribution-2.7.1/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).Stat" trace.id=6de73caf-15bb-4c9c-abf8-382bfcd46ebe trace.line=155 vars.name="beat/repo_name" vars.reference=deimos
According to the logs, the images beat/repo_name:phobos and beat/repo_name:deimos have the same digest sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1.
When pushing images with the same digest into the same repository, it may hit a bug.
@phoinixgrr Thanks for your input. I'll create a PR to fix it.
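(Not the fix itself, but a quick sanity check that unique_artifact is intact and the 500s are purely the losing side of a concurrent-insert race on the same (repository_id, digest) pair; assumes direct psql access, with DATABASE_URL as a placeholder:)
# An empty result means no duplicate rows ever landed; each 500 is just the INSERT that lost the race
psql "$DATABASE_URL" -c "
SELECT repository_id, digest, count(*) AS copies
FROM artifact
GROUP BY repository_id, digest
HAVING count(*) > 1;"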
> According to the logs, the images beat/repo_name:phobos and beat/repo_name:deimos have the same digest sha256:e1aee8bb773a4615023387b261d1e93feef5e6ed2d903004813032a01464bad1. When pushing images with the same digest into the same repository, it may hit a bug.
> @phoinixgrr Thanks for your input. I'll create a PR to fix it.
Awesome 🚀
Let me know if you end up building and publishing a Harbor image containing the hotfix. We can deploy it on our side to confirm that it is indeed fixed. We hit this 3-4 times per day.
Best
PR https://github.com/goharbor/harbor/pull/15104 is fixing this.
> PR #15104 is fixing this.
Awesome @bitsf, thanks for the update.
There were two (2) issues pointed out:
1) harbor@core_db:[21602]:LOG: duration: 13985.656 ms execute <unnamed>: SELECT "id", "reference", "reference_id", "hard", "creation_time", "update_time" FROM "quota" WHERE "reference" = $1 AND "reference_id" = $2 FOR UPDATE
This was worked around with https://github.com/goharbor/harbor/issues/15048#issuecomment-855767153
2) ERROR: duplicate key value violates unique constraint "unique_artifact"
Correct me if I am wrong: the PR addresses issue 2. Are there any plans to improve/work on 1?
Running Harbor v2.2.1, and using an AWS RDS as the database. Some specs for our Harbor deployment:
We are observing high database usage generated by Harbor. We've enabled the PostgreSQL slow_log to gather some insights on the database usage. The following queries are reported with a duration of over 10 seconds:
These are FOR UPDATE queries, which cause the rows retrieved by the SELECT statement to be locked as though for update. The database locking graph confirms the DB locking issue, showing 400+ locked tables.
When this happens, we are sporadically getting HTTP.5xx errors thrown by Harbor on PUT requests, so image uploads fail for clients:
And at the same time Harbor throws errors that suggest issues with the database:
Database specs have been upgraded to cope with this. We are currently running on 16 vCPUs and 122 GB RAM, dedicated to Harbor.
Could this suggest an issue with the way that Harbor is managing the database?
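The slow log mentioned above is typically driven by log_min_duration_statement; to check the active threshold (on RDS this is set through the DB parameter group), a quick sketch, with DATABASE_URL as a placeholder connection string:
psql "$DATABASE_URL" -c "SHOW log_min_duration_statement;"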