goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0

The database CPU load is high #19681

Closed buxiaomo closed 11 months ago

buxiaomo commented 11 months ago

If you are reporting a problem, please make sure the following information is provided:

Expected behavior and actual behavior: I use an external database on AWS. Database type: PostgreSQL, version 12.14, instance type db.m5.xlarge (4 vCPUs, 16 GB RAM).

The SQL statement causing the high load:

SELECT T0."id", T0."vendor_type", T0."vendor_id", T0."status", T0."status_message", T0."trigger", T0."extra_attrs", T0."start_time", T0."update_time", T0."end_time", T0."revision" FROM "execution" T0 WHERE T0."vendor_type" = $1 AND T0."vendor_id" = $2 ORDER BY T0."start_time" DESC LIMIT ? OFFSET ?
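One way to investigate a query of this shape is to compare its execution plan with and without an index covering the filter and sort columns. The sketch below is only an illustration: it uses SQLite from the Python standard library as a stand-in for PostgreSQL, a trimmed-down `execution` table, and a hypothetical index name (`idx_execution_vendor_start`). It does not show what Harbor's own schema or patch releases do, and on PostgreSQL the equivalent check would be `EXPLAIN` run against the real table.

```python
import sqlite3

# SQLite stands in for PostgreSQL; only the columns the query filters,
# sorts on, or needs for illustration are kept (an assumption, not Harbor's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE execution (
           id INTEGER PRIMARY KEY,
           vendor_type TEXT,
           vendor_id INTEGER,
           status TEXT,
           start_time TEXT
       )"""
)

QUERY = (
    "SELECT id, vendor_type, vendor_id, status, start_time FROM execution "
    "WHERE vendor_type = ? AND vendor_id = ? "
    "ORDER BY start_time DESC LIMIT ? OFFSET ?"
)
PARAMS = ("IMAGE_SCAN", 1, 10, 0)  # illustrative values only


def plan(connection):
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)

# Hypothetical composite index: equality columns first, then the sort column.
conn.execute(
    "CREATE INDEX idx_execution_vendor_start "
    "ON execution (vendor_type, vendor_id, start_time)"
)
plan_after = plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the planner has to scan the table and sort the matches; with it, the planner can look up the matching rows through the index. Whether an index like this helps on a real Harbor database depends on the actual schema and data, so treat it as a diagnostic idea rather than a recommended change.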

Steps to reproduce the problem: Please provide the steps to reproduce this problem.

Versions: Please specify the versions of following systems.

Additional context:

The IP address or hostname to access admin UI and registry service.

DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.

hostname: xxxxxxxx

http related config

http:

port for http, default is 80. If https enabled, this port will redirect to https port

port: 80

https related config

https:

https port for harbor, default is 443

port: 443

The path of cert and key files for nginx

certificate: /data/harbor/cert/xxxxxxxx.crt
private_key: /data/harbor/cert/xxxxxxxx.key

Uncomment following will enable tls communication between all harbor components

internal_tls:

set enabled to true means internal tls is enabled

enabled: true

put your cert and key files on dir

dir: /etc/harbor/tls/internal

Uncomment external_url if you want to enable external proxy

And when it enabled the hostname will no longer used

external_url: https://reg.mydomain.com:8433

The initial password of Harbor admin

It only works in first time to install harbor

Remember Change the admin password from UI after launching Harbor.

harbor_admin_password: xxxxxxxx

Harbor DB configuration

database:

The password for the root user of Harbor DB. Change this before any production use.

password: xxxxxxxx

The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.

max_idle_conns: 50

The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.

Note: the default number of connections is 1024 for postgres of harbor.

max_open_conns: 1000

The default data volume

data_volume: /data/harbor

Harbor Storage settings by default is using /data dir on local filesystem

Uncomment storage_service setting If you want to using external storage

storage_service:

ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore

of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.

ca_bundle:

storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss

for more info about this configuration please refer https://docs.docker.com/registry/configuration/

filesystem:

maxthreads: 100

set disable to true when you want to disable registry redirect

redirect:

disabled: false

Trivy configuration

#

Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.

It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached

in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it

should download a newer version from the Internet or use the cached one. Currently, the database is updated every

12 hours and published as a new release to GitHub.

trivy:

ignoreUnfixed The flag to display only fixed vulnerabilities

ignore_unfixed: false

skipUpdate The flag to enable or disable Trivy DB downloads from GitHub

#

You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.

If the flag is enabled you have to download the trivy-offline.tar.gz archive manually, extract trivy.db and

metadata.json files and mount them in the /home/scanner/.cache/trivy/db path.

skip_update: false #

insecure The flag to skip verifying registry certificate

insecure: false

github_token The GitHub access token to download Trivy DB

#

Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough

for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000

requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult

https://developer.github.com/v3/#rate-limiting

#

You can create a GitHub token by following the instructions in

https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line

#

github_token: xxx

jobservice:

Maximum number of job workers in job service

max_job_workers: 10

notification:

Maximum retry count for webhook job

webhook_job_max_retry: 10

chart:

Change the value of absolute_url to enabled can enable absolute url in chart

absolute_url: disabled

Log configurations

log:

options are debug, info, warning, error, fatal

level: info

configs for logs in local storage

local:

Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.

rotate_count: 50
# Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
# If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
# are all valid.
rotate_size: 200M
# The directory on your host that store log
location: /var/log/harbor

Uncomment following lines to enable external syslog endpoint.

external_endpoint:

protocol used to transmit log to external endpoint, options is tcp or udp

protocol: tcp

The host of external endpoint

host: localhost

Port of external endpoint

port: 5140

This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!

_version: 2.2.0

Uncomment external_database if using external database.

ssl_mode "enable"; only "require" (default), "verify-full", "verify-ca", and "disable" supported

external_database:
  harbor:
    host: xxxxxxxxxx-north-1.amazonaws.com.cn
    port: 5432
    db_name: xxxxxxxx
    username: xxxxxxxx
    password: xxxxxxxx
    ssl_mode: require
    max_idle_conns: 2
    max_open_conns: 0
  notary_signer:
    host: xxxxxxxxxx-north-1.amazonaws.com.cn
    port: 5432
    db_name: xxxxxxxx
    username: xxxxxxx
    password: xxxxxxxx
    ssl_mode: require
  notary_server:
    host: xxxxxxxxxx-north-1.amazonaws.com.cn
    port: 5432
    db_name: xxxxxxxx
    username: xxxxxxx
    password: xxxxxxxx
    ssl_mode: require

Uncomment external_redis if using external Redis server

external_redis:
  host: redis:6379
  password: 'xxxxxxxx'

db_index 0 is for core, it's unchangeable

  registry_db_index: 1
  jobservice_db_index: 2
  chartmuseum_db_index: 3
  trivy_db_index: 5
  idle_timeout_seconds: 30

Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.

uaa:

ca_file: /path/to/ca

Global proxy

Config http proxy for components, e.g. http://my.proxy.com:3128

Components doesn't need to connect to each others via http proxy.

Remove component from components array if want disable proxy

for it. If you want use proxy for replication, MUST enable proxy

for core and jobservice, and set http_proxy and https_proxy.

Add domain to the no_proxy field, when you want disable proxy

for some special registry.

proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:

metric:
  enabled: true
  port: 9090
  path: /metrics


- **Log files:** You can get them by packaging the `/var/log/harbor/` directory.
buxiaomo commented 11 months ago
[Screenshot: iShot_2023-12-07_10 25 52]
buxiaomo commented 11 months ago
[Screenshot: iShot_2023-12-07_10 27 03]
buxiaomo commented 11 months ago
[Screenshot: iShot_2023-12-07_10 27 30]
wy65701436 commented 11 months ago

How many execution records do you have? To resolve this problem, you need to upgrade to the latest patch release, such as v2.9.1.
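A rough way to answer the "how many execution records" question is to group the `execution` table by `vendor_type` and `status`. The sketch below is a minimal illustration under stated assumptions: it runs against an in-memory SQLite table with made-up rows, and the status strings (`Success`, `Error`) and the `REPLICATION` vendor type are assumed values. The `SELECT ... GROUP BY` statement itself is plain SQL that should also run on PostgreSQL.

```python
import sqlite3

# In-memory stand-in for the Harbor database; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution (id INTEGER PRIMARY KEY, vendor_type TEXT, status TEXT)"
)
sample_rows = (
    [("IMAGE_SCAN", "Success")] * 5
    + [("IMAGE_SCAN", "Error")]
    + [("REPLICATION", "Success")] * 2
)
conn.executemany(
    "INSERT INTO execution (vendor_type, status) VALUES (?, ?)", sample_rows
)

# Count records per vendor type and status, largest groups first.
COUNT_SQL = (
    "SELECT vendor_type, status, COUNT(*) AS n "
    "FROM execution GROUP BY vendor_type, status ORDER BY n DESC"
)
counts = conn.execute(COUNT_SQL).fetchall()

for vendor_type, status, n in counts:
    print(f"{vendor_type:<12} {status:<8} {n}")
```

On a real deployment the largest groups show which kind of execution dominates the table.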

buxiaomo commented 11 months ago
[Screenshot: iShot_2023-12-07_11 09 43]
stonezdj commented 11 months ago

Were all these execution records with success status completed this week? You could reduce the number of records in the execution table by deleting the successful IMAGE_SCAN records.
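A cleanup like this could be sketched as a batched delete, so that no single statement holds locks on a very large number of rows. This is a hedged illustration only: it uses an in-memory SQLite table, the helper `delete_in_batches` is hypothetical, and the status string `Success` is an assumption. On a real Harbor database, take a backup first and check whether rows in other tables reference the executions being removed.

```python
import sqlite3


def delete_in_batches(conn, vendor_type, status, batch_size=1000):
    """Delete matching execution rows in small batches; return the total removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM execution WHERE id IN ("
            "SELECT id FROM execution "
            "WHERE vendor_type = ? AND status = ? LIMIT ?)",
            (vendor_type, status, batch_size),
        )
        conn.commit()  # commit each batch so locks are released between rounds
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Demo data: 2500 successful scans and 3 failed ones (made-up numbers).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution (id INTEGER PRIMARY KEY, vendor_type TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO execution (vendor_type, status) VALUES (?, ?)",
    [("IMAGE_SCAN", "Success")] * 2500 + [("IMAGE_SCAN", "Error")] * 3,
)

removed = delete_in_batches(conn, "IMAGE_SCAN", "Success")
remaining = conn.execute("SELECT COUNT(*) FROM execution").fetchone()[0]
print(f"removed {removed} rows, {remaining} remaining")
```

The batch size is a tunable assumption; smaller batches put less pressure on a busy database at the cost of more round trips.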

buxiaomo commented 11 months ago

> Were all these execution records with success status completed this week? You could reduce the number of records in the execution table by deleting the successful IMAGE_SCAN records.

No, these are all the records from when Harbor was deployed up to the present.

stonezdj commented 11 months ago

How many artifacts are in your environment?

buxiaomo commented 11 months ago

> How many artifacts are in your environment?

What are artifacts? We have 178 projects and 2,938 repositories, using 7 TB of storage.

buxiaomo commented 11 months ago

After the upgrade to 2.3.0, the CPU stabilized at around 40%