Isn't it better to store the hash_id as a number?
> Isn't it better to store the hash_id as a number?
TBH I've never thought about that. This uses the same datatype as in the `test_results` table. Maybe we could consider updating the `hash_id` datatype.
Both MySQL (MariaDB) and PostgreSQL have good support for JSON, so how much gain is it? Are there any performance differences in saving the result in a table instead of in JSON? Are there any performance differences when reading the result?
@pnax, have you estimated how much time it will take to migrate the database on the zonemaster.net server?
> Have you estimated how much time it will take to migrate the database on the zonemaster.net server?
Yes but I can't find the figure again. If I recall correctly, this was around 2 to 3 hours.
> Are there any performance differences when reading the result?
@matsstralbergiis will try to measure the difference.
@pnax, have you compared the write load between the old and the new code?
> Both MySQL (MariaDB) and PostgreSQL have good support for JSON, so how much gain is it?
See https://github.com/zonemaster/zonemaster-backend/pull/856#issuecomment-1457814137 and https://github.com/zonemaster/zonemaster-backend/pull/1092#pullrequestreview-1526628097. Another gain would be to ease the use of another database engine (for instance Clickhouse, see https://github.com/zonemaster/zonemaster-backend/pull/1094).
Regarding the performance gain, I need to redo some tests. (I thought I had done some a few months back but I can't find my results anymore.)
> See #856 (comment) and #1092 (review). Another gain would be to ease the use of another database engine (for instance Clickhouse, see #1094).
The comment in #856 was mine, but today I am better informed and understand that it actually works to let the database engine look into JSON. But it still might be a good idea to use a separate table. It is conceptually simpler. And possibly better support for other engines.
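For example, MySQL can filter on values inside the blob directly; a hypothetical query (using the same CAST idiom as in the measurements below, since `results` is not stored as a native JSON column):

```sql
-- Hypothetical: find tests whose results contain at least one ERROR message,
-- without unpacking the JSON into a separate table.
SELECT hash_id
FROM test_results
WHERE JSON_CONTAINS(
    CAST(CONVERT(results USING utf8) AS JSON),
    '{"level": "ERROR"}'
);
```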
> Regarding the performance gain, I need to redo some tests. (I thought I had done some a few months back but I can't find my results anymore.)
Thanks!
I have copied the `test_results` table to a MySQL instance running on my Mac. The database contains 208 149 tests.

Then I created the new `result_entries` table (I copied the create command from the MySQL.pm in this PR) and extracted the data from the JSON blob in the `test_results` table into the new table. It now contains 14 518 570 records.
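For reference, the extraction can be done in a single statement; a sketch of the idea (the column list here is an assumption based on this thread, including `args`; adjust it to the actual DDL):

```sql
-- Sketch: unpack each JSON message into one row of result_entries.
-- The column list is an assumption; adjust it to the actual DDL.
INSERT INTO result_entries (hash_id, level, module, testcase, tag, args)
SELECT tr.hash_id, r.level, r.module, r.testcase, r.tag, r.args
FROM test_results tr,
     JSON_TABLE(
         CAST(CONVERT(tr.results USING utf8) AS JSON),
         '$[*]' COLUMNS(
             tag      VARCHAR(40) PATH '$.tag',
             level    VARCHAR(10) PATH '$.level',
             module   VARCHAR(20) PATH '$.module',
             testcase VARCHAR(20) PATH '$.testcase',
             args     JSON        PATH '$.args'
         )
     ) AS r;
```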
After that I performed two queries that I believe get the same result. (Batch 10 contains 81689 tests.)

One against the 'old' format:
```sql
SELECT tr.batch_id, tr.hash_id, tr.domain, result.tag, count(result.tag),
       result.level, result.module, result.testcase
FROM test_results tr,
     JSON_TABLE(
         CAST(CONVERT(results USING utf8) AS JSON),
         "$[*]" COLUMNS(
             tag      VARCHAR(40) PATH '$.tag'      ERROR ON ERROR,
             level    VARCHAR(10) PATH '$.level'    ERROR ON ERROR,
             module   VARCHAR(20) PATH '$.module'   ERROR ON ERROR,
             testcase VARCHAR(20) PATH '$.testcase' ERROR ON ERROR
         )
     ) AS result
WHERE batch_id = 10
GROUP BY tr.hash_id, tr.domain, result.tag, result.level, result.module, result.testcase
ORDER BY tr.hash_id
LIMIT 10;
```

```
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
| batch_id | hash_id          | domain               | tag                        | count(result.tag) | level | module       | testcase       |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ADDRESSES_MATCH            |                 1 | INFO  | CONSISTENCY  | CONSISTENCY05  |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ARE_AUTHORITATIVE          |                 1 | INFO  | DELEGATION   | DELEGATION04   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | AXFR_FAILURE               |                 8 | INFO  | NAMESERVER   | NAMESERVER03   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_CHILD_FOUND            |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_PARENT_FOUND           |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B02_AUTH_RESPONSE_SOA      |                 1 | INFO  | BASIC        | BASIC02        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CAN_BE_RESOLVED            |                 1 | INFO  | NAMESERVER   | NAMESERVER06   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CASE_QUERIES_RESULTS_OK    |                 1 | INFO  | NAMESERVER   | NAMESERVER09   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CHILD_DISTINCT_NS_IP       |                 1 | INFO  | DELEGATION   | DELEGATION02   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CN04_IPV4_DIFFERENT_PREFIX |                 1 | INFO  | CONNECTIVITY | CONNECTIVITY04 |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
10 rows in set, 1 warning (52.8186 sec)
```
And one against the new format:
```sql
SELECT tr.batch_id, tr.hash_id, tr.domain, result.tag, count(result.tag),
       result.level, result.module, result.testcase
FROM result_entries result
LEFT JOIN test_results tr ON result.hash_id = tr.hash_id
WHERE batch_id = 10
GROUP BY tr.hash_id, tr.domain, result.tag, result.level, result.module, result.testcase
ORDER BY tr.hash_id
LIMIT 10;
```

```
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
| batch_id | hash_id          | domain               | tag                        | count(result.tag) | level | module       | testcase       |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ADDRESSES_MATCH            |                 1 | INFO  | CONSISTENCY  | CONSISTENCY05  |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ARE_AUTHORITATIVE          |                 1 | INFO  | DELEGATION   | DELEGATION04   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | AXFR_FAILURE               |                 8 | INFO  | NAMESERVER   | NAMESERVER03   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_CHILD_FOUND            |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_PARENT_FOUND           |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B02_AUTH_RESPONSE_SOA      |                 1 | INFO  | BASIC        | BASIC02        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CAN_BE_RESOLVED            |                 1 | INFO  | NAMESERVER   | NAMESERVER06   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CASE_QUERIES_RESULTS_OK    |                 1 | INFO  | NAMESERVER   | NAMESERVER09   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CHILD_DISTINCT_NS_IP       |                 1 | INFO  | DELEGATION   | DELEGATION02   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CN04_IPV4_DIFFERENT_PREFIX |                 1 | INFO  | CONNECTIVITY | CONNECTIVITY04 |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
10 rows in set (25.3378 sec)
```
The results are consistent when repeating the queries.
Maybe it is possible to optimize the queries to speed them up. Feel free to suggest improvements.
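One thing I have not verified, for instance, is whether my copy of `result_entries` has an index on the join column; if it lacks one, something like this might change the picture (a sketch; the index name is made up, and the PR's foreign key may already create an equivalent index in InnoDB):

```sql
-- Add an index on the join column used by the new-format query.
-- Check what already exists first: SHOW INDEX FROM result_entries;
CREATE INDEX idx_result_entries_hash_id ON result_entries (hash_id);
```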
If databases are better at handling integer keys, wouldn't it be better to convert the hash_id into an integer instead?
Any x86_64 CPU with the SSE4.2 instruction set extensions (and running a DBMS that has been compiled to support these instructions) can compare strings with the same efficiency as integers (and strings can be seen as arrays of large integers, in a way). It seems to me that discussing the data type of primary keys is a case of premature optimization. There is evidence showing that, for MySQL at least, the performance penalty is quite negligible. I suggest we keep the primary key as `VARCHAR(16)` for now.
I've updated the PR since the last review:

- rebased on top of `develop`
- removed the `id` field in `result_entries`
- added a foreign key on `hash_id` in `result_entries`
- renamed `log_level` to `level` in `result_entries`
- changed `args` to JSONb in PostgreSQL

Please re-review.
~TODO: Travis is failing with the following error on Ubuntu focal (20.04):~
```
DBD::mysql::db do failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF EXISTS result_entries DROP FOREIGN KEY IF EXISTS fk_hash_id' at line 1 at /home/travis/build/zonemaster/zonemaster-backend/build_dir/blib/lib/Zonemaster/Backend/DB/MySQL.pm line 221.
```

https://app.travis-ci.com/github/zonemaster/zonemaster-backend/jobs/610602246
edit: fixed with commit "Support MariaDB <10.4 and MySQL" (7ead49faee6a722a69e10da05f0a2472f816fa64)
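For reference, plain MySQL has no `DROP FOREIGN KEY IF EXISTS`; one possible workaround (a sketch, not necessarily what the fix commit does) is to consult `information_schema` and only run the `ALTER` when the constraint exists:

```sql
-- Build the statement dynamically: drop the constraint if present,
-- otherwise substitute a harmless no-op.
SET @stmt = (
    SELECT IF(COUNT(*) > 0,
              'ALTER TABLE result_entries DROP FOREIGN KEY fk_hash_id',
              'SELECT 1')
    FROM information_schema.TABLE_CONSTRAINTS
    WHERE CONSTRAINT_SCHEMA = DATABASE()
      AND TABLE_NAME = 'result_entries'
      AND CONSTRAINT_NAME = 'fk_hash_id'
      AND CONSTRAINT_TYPE = 'FOREIGN KEY'
);
PREPARE drop_fk FROM @stmt;
EXECUTE drop_fk;
DEALLOCATE PREPARE drop_fk;
```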
Is field "hash_id" in table "result_entries" the same "hash_id" that identifies the test? If so, does that mean that the table has no unique identifier?
Do you have an estimation of how long it will take to patch the zonemaster.net database?
Is field "hash_id" in table "result_entries" the same "hash_id" that identifies the test?
Yes.
> If so, does that mean that the table has no unique identifier?
Yes, there will be several rows with the same `hash_id` in this table (one per msg_tag).
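For instance, the fan-out is easy to see with a query like this (hypothetical, for illustration):

```sql
-- Count how many result_entries rows (messages) each test produced.
SELECT hash_id, COUNT(*) AS message_count
FROM result_entries
GROUP BY hash_id
ORDER BY message_count DESC
LIMIT 5;
```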
> Do you have an estimation of how long it will take to patch the zonemaster.net database?
I ran the migration script on a copy of the current database (zonemaster.net). It took around 4 hours 15 minutes on my setup.
Some figures on the database:

- date: 2023-09-27
- database: PostgreSQL
- number of tests: 1886151
- db size (before migration): 7.3 GB
- db size (after migration): 22 GB
> Do you have an estimation of how long it will take to patch the zonemaster.net database?

> I ran the migration script on a copy of the current database (zonemaster.net). It took around 4 hours 15 minutes on my setup.
Thanks. That sounds manageable.
I've rebased on top of `develop` to integrate the changes from #1121 and address a comment from @mattias-p. With that I believe I have addressed all comments/remarks. Please re-review.
The "How to test this PR" section is not very explicit and doesn't concern itself with the database migration at all, so I made up the following procedure.
1. Set `age_reuse_previous_test = 60` for good measure.
2. `zmtest nu > $HOME/$db-nu.old.result` and save the test id as `$test_id_nu`.
3. `zmtest SE. > $HOME/$db-se.old.result` and save the test id as `$test_id_se`.
4. `sudo systemctl stop zm-rpcapi zm-testagent`.
5. `sudo cpanm --notest $HOME/new/*`.
6. `cd $(perl -MFile::ShareDir=dist_dir -E 'say dist_dir("Zonemaster-Backend")')`.
7. `sudo perl patch/patch_db_zonemaster_backend_ver_11.0.3.pl`.
8. `sudo systemctl start zm-rpcapi zm-testagent`.
9. `zmb get_test_results --lang en --test-id $test_id_nu | jq -S > $HOME/$db-nu.migrated.result`.
10. `zmb get_test_results --lang en --test-id $test_id_se | jq -S > $HOME/$db-se.migrated.result`.
11. `zmtest nu > $HOME/$db-nu.new.result`.
12. `zmtest SE. > $HOME/$db-se.new.result`.

When following this procedure I found a couple of problems.
For SQLite < 3.32.0 the current version of the schema migration script (`develop`) is only able to migrate tests with 142 messages or fewer. This is an unreasonably small limit. For SQLite ≥ 3.32.0 the limit is 4680. On Rocky Linux 8.9 we're using SQLite 3.26.0. (Presumably these figures come from SQLite's default limit on bind variables, 999 before 3.32.0 and 32766 from 3.32.0, divided by the number of columns inserted per message.)

I'm not sure 4680 is a sufficient limit. The limit could be set dynamically to allow any number of messages that isn't too high for SQLite itself, of course. You can set the limit with something like this: `$dbh->sqlite_limit( DBD::SQLite::Constants::SQLITE_LIMIT_SQL_LENGTH, $new_value )`
This is bad but I don't think it's a blocker. People who want to keep their test results are probably using MariaDB or PostgreSQL anyway.
We ought to create a tool for migrating databases from one DB engine to another within the same version of Backend.
If you upgrade the database but forget to restart testagent, you get tests with empty results.
This is bad and confusing but I don't think it's a blocker.
After migrating a database from the previous schema version to the new one, `get_test_results` no longer reports properly translated message strings. Instead the message strings are returned in "raw" format.

This is because the schema migration script and the RPCAPI daemon disagree on the format of the `result_entries.module` and `result_entries.testcase` columns. The daemon reads them case sensitively and expects most of their values to be capitalized. The script writes them in upper case and seems to assume they are case insensitive.

Even though reports of migrated test results are more cryptic to the human reader, there is no data loss, and proper order can be restored using an additional migration script in a fix release. Or we could treat this as a blocker and fix the migration script before releasing.
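If we go the fix-release route, the repair could be as small as rewriting the two columns in place; a sketch in MySQL syntax, assuming the daemon expects values such as `Basic` and `Basic01` (the expected casing should be verified against the daemon first):

```sql
-- Sketch: recapitalize migrated rows, e.g. BASIC -> Basic, BASIC01 -> Basic01.
-- The target casing is an assumption; check it before running anything.
UPDATE result_entries
SET module   = CONCAT(UPPER(LEFT(module, 1)),   LOWER(SUBSTRING(module, 2))),
    testcase = CONCAT(UPPER(LEFT(testcase, 1)), LOWER(SUBSTRING(testcase, 2)));
```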
Purpose

Move the `test_results.results` json array into a dedicated table `result_entries`.

Context

Replaces #856

Changes

- new tables `result_entries` and `log_level`
- new foreign keys `result_entries(hash_id) -> test_results(hash_id)` and `result_entries(level) -> log_level(level)`
- the `get_test_result` and `get_test_history` methods are modified to use the new table
- `DB::test_result` is now only a getter; all write operations use the new database methods `add_result_entry` and `add_result_entries`
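Roughly, the new table looks like this (a sketch reconstructed from this thread, MySQL flavour; the authoritative definitions, including exact types and the `log_level` table, live in the per-engine DB modules):

```sql
-- Reconstruction for illustration only; column types are guesses.
CREATE TABLE result_entries (
    hash_id  VARCHAR(16)  NOT NULL,  -- one row per message, several per test
    level    VARCHAR(15)  NOT NULL,
    module   VARCHAR(255) NOT NULL,
    testcase VARCHAR(255) NOT NULL,
    tag      VARCHAR(255) NOT NULL,
    args     BLOB         NOT NULL,  -- JSONb in PostgreSQL
    CONSTRAINT fk_hash_id FOREIGN KEY (hash_id) REFERENCES test_results (hash_id),
    CONSTRAINT fk_level   FOREIGN KEY (level)   REFERENCES log_level (level)
);
```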
How to test this PR
Credits to @blacksponge for most of the work.