greenbone / gvmd

Greenbone Vulnerability Manager - The database backend for the Greenbone Community Edition
GNU Affero General Public License v3.0
286 stars 153 forks source link

[9.0.0] Invalid UTF-8 characters in database cause migration to fail #900

Closed mettatu closed 3 years ago

mettatu commented 4 years ago

Expected behavior

During upgrade from GVMD 8.0.2 to 9.0.0, the database migration would succeed from revision 205 to 221.

Actual behavior

An error "sql_exec_internal: PQexec failed: ERROR: invalid byte sequence for encoding "UTF8": 0xb8" was logged during upgrade from revision 213 to 214.

Steps to reproduce

  1. results and public.report_host_details table contain a certificate with invalid characters such as "Â" or "¼"

GVM versions

gvm: gvmd 8.0.1 -> gvmd 9.0.0

Logfiles

md manage:MESSAGE:2019-12-10 10h33.20 utc:45677: check_db_versions: database version of database: 205
md manage:MESSAGE:2019-12-10 10h33.20 utc:45677: check_db_versions: database version supported by manager: 221
md   main:CRITICAL:2019-12-10 10h33.20 utc:45677: gvmd: database is wrong version
md   main:MESSAGE:2019-12-10 10h35.12 utc:45701:    Greenbone Vulnerability Manager version 9.0.0 (DB revision 221)
md   main:   INFO:2019-12-10 10h35.12 utc:45701:    Migrating database.
md   main:   INFO:2019-12-10 10h35.12 utc:45701:    Migrating to 206
md   main:   INFO:2019-12-10 10h35.13 utc:45701:    Migrating to 207
md   main:   INFO:2019-12-10 10h35.13 utc:45701:    Migrating to 208
md   main:   INFO:2019-12-10 10h35.23 utc:45701:    Migrating to 209
md   main:   INFO:2019-12-10 10h35.23 utc:45701:    Migrating to 210
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 211
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 212
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 213
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 214
md manage:WARNING:2019-12-10 10h38.51 utc:45701: sql_exec_internal: PQexec failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xb8

Strace from the migration script:

sendto(4, “P\0\0\0\373\0SELECT rhd.value FROM report_host_details AS rhd JOIN report_hosts ON report_hosts.id = rhd.report_host WHERE name = ‘SSLDetails:C84C5DA49D7FAE53EECFA72DBDA3140004DA036BF7E7A1D79E278FDA1D56F619’ ORDER BY report_hosts.start_time DESC LIMIT 1;\0\0\0B\0\0\0”…, 289, MSG_NOSIGNAL, NULL, 0) = 289
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “1\0\0\0\0042\0\0\0\4T\0\0\0\36\0\1value\0\3\310`9\0\7\0\0\0\31\377\377\377\377\377\377\0\0D\0\0\0\221\0\1\0\0\0\207issuer:O=ADVUNI\303\202\302\270\303\203\302\274,OU=Storage,CN=ADVUNI\303\202\302\270\303\203\302\274 Storage Root CA 1|serial:669518|notBefore:20020701T000000|notAfter:20220701T000000C\0\0\0\rSELECT 1\0Z\0\0\0\5T”, 16384, 0, NULL, NULL) = 207
sendto(4, “P\0\0\0\225\0SELECT id FROM tls_certificates WHERE sha256_fingerprint = ‘C84C5DA49D7FAE53EECFA72DBDA3140004DA036BF7E7A1D79E278FDA1D56F619’ AND owner = 1\0\0\0B\0\0\0\16\0\0\0\0\0\0\0\1\0\0D\0\0\0\6P\0E\0\0\0\t\0\0\0\0\0S\0\0\0\4”, 187, MSG_NOSIGNAL, NULL, 0) = 187
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “1\0\0\0\0042\0\0\0\4T\0\0\0\33\0\1id\0\3\310a\\0\1\0\0\0\27\0\4\377\377\377\377\0\0C\0\0\0\rSELECT 0\0Z\0\0\0\5T”, 16384, 0, NULL, NULL) = 58
sendto(4, “P\0\0\5\307\0INSERT INTO tls_certificates (uuid, owner, name, comment, creation_time, modification_time, certificate, subject_dn, issuer_dn, trust, activation_time, expiration_time, md5_fingerprint, sha256_fingerprint, serial, certificate_format) SELECT make_”…, 1517, MSG_NOSIGNAL, NULL, 0) = 1517
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “E\0\0\0sSERROR\0VERROR\0C22021\0Minvalid byte sequence for encoding “UTF8”: 0xb8\0Fwchar.c\0L2017\0Rreport_invalid_encoding\0\0Z\0\0\0\5E”, 16384, 0, NULL, NULL) = 122
getpid() = 49282
open("/usr/local/var/log/gvm/gvmd.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 6
fstat(6, {st_mode=S_IFREG|0600, st_size=257, …}) = 0
write(6, “md manage:WARNING:2019-12-10 11h33.05 utc:49282: sql_exec_internal: PQexec failed: ERROR: invalid byte sequence for encoding “UTF8”: 0xb8\n (7)\n”, 144) = 144

I checked the database and found out that the entry contained characters that caused the migration script to fail:

Certificate details:
subject …: O=ADVUNI¸ü,OU=Storage,CN=XXXX,CN=10.10.10.10

I ended up removing the results with problematic characters (from results and public.report_host_details table) and the database migration worked. Certificate is also base64 encoded in the public.report_host_details, it also needs to be removed to make it work.

timopollmeier commented 4 years ago

Was the database also migrated from SQLite to PostgreSQL at some point either during the 8.0.1 to 9.0.0 migration or earlier? Other things to check would be the character encoding of the database (which you can by running psql gvmd -c 'SHOW SERVER_ENCODING' as the postgres user) and the locale used to run gvmd.

mettatu commented 4 years ago

Yes, the database was migrated from SQLite when upgrading to 8.0.1 earlier this year. I can confirm that the problematic entries were written originally to the SQLite database.

Database encoding:

 server_encoding 
-----------------
 UTF8
(1 row)

System locale for the process:

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
timopollmeier commented 4 years ago

I can confirm that the problematic entries were written originally to the SQLite database.

Do you still have a backup of the SQLite database to tell if the entries were already invalid UTF-8 there or if the migration to PostgreSQL may have broken them?

mettatu commented 4 years ago

Unfortunately the original SQLite3 database is already scrapped with the old VMs. I think you are right, something went south during that time.

I tried to reproduce with PostgreSQL but everything worked ok.

1) I dropped the original database and created a fresh empty postgresql with: Greenbone Vulnerability Manager version 8.0.2 (DB revision 205)

2) I scanned the target and confirmed that the database is written with: Subject: O=ADVUNI¸ü,OU=Storage,CN=XXX,CN=10.10.10.10

When checking with OpenSSL, everything is nicely escaped: subject=/CN=10.10.10.10/CN=XXX/OU=Storage/O=ADVUNI\xB8\xFC

3) Database migrated successfully with:

Greenbone Vulnerability Manager 9.0.0
Manager DB revision 221
md   main:MESSAGE:2019-12-12 11h02.43 utc:18082:    Greenbone Vulnerability Manager version 9.0.0 (DB revision 221)
md   main:   INFO:2019-12-12 11h02.43 utc:18082:    Migrating database.
md   main:   INFO:2019-12-12 11h02.43 utc:18082:    Migrating to 206
md   main:   INFO:2019-12-12 11h02.43 utc:18082:    Migrating to 207
md   main:   INFO:2019-12-12 11h02.43 utc:18082:    Migrating to 208
md   main:   INFO:2019-12-12 11h02.47 utc:18082:    Migrating to 209
md   main:   INFO:2019-12-12 11h02.47 utc:18082:    Migrating to 210
md   main:   INFO:2019-12-12 11h02.47 utc:18082:    Migrating to 211
md   main:   INFO:2019-12-12 11h02.47 utc:18082:    Migrating to 212
md   main:   INFO:2019-12-12 11h02.47 utc:18082:    Migrating to 213
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 214
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 215
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 216
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 217
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 218
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 219
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 220
md   main:   INFO:2019-12-12 11h02.48 utc:18082:    Migrating to 221