mumble-voip / mumble

Mumble is an open-source, low-latency, high quality voice chat software.
https://www.mumble.info
Other
6.9k stars 1.17k forks source link

Mysql schema error #6782

Open Challado opened 1 month ago

Challado commented 1 month ago

The issue

I have upgraded from latest version on git and ocurring the follow error: /usr/local/bin/mumble-server -v -fg -ini /etc/mumble-server.ini

2025-04-14 09:21:48.664 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-04-14 09:21:48.664 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-04-14 09:21:48.665 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-04-14 09:21:48.666 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-04-14 09:21:48.666 Successfully switched to uid 1008 2025-04-14 09:21:48.666 Resource limits were 0 0 2025-04-14 09:21:48.666 Successfully dropped capabilities [ERROR]: Unknown column 'meta_value' in 'SELECT' while executing "SELECT "meta_value" FROM meta WHERE "meta_key" = :key" with :key="scheme_version". I have dropped and recreate database to see what's happening, and, as I see, have a substantial change in mysql schema, but no one tool to migrate to these new schema. How I can migrate to these new schema? Version is the same: 1.6.0 but these change is unusual. My version is compiled from source, and never ocurr these error in past. ### Mumble version 1.6.0 ### Mumble component Server ### OS Linux ### Additional information _No response_
Krzmbrzl commented 1 month ago

Seems like I forgot to implement migration functionality for the meta table itself. The keys used to be called keystring and value: https://github.com/mumble-voip/mumble/blob/88f25595b669899d8096f8624273f27142614359/src/murmur/ServerDB.cpp#L293-L300

For reference, what's missing is something like https://github.com/mumble-voip/mumble/blob/26a266c3a990f2159ef4f59a5767f71df07dbdaf/src/murmur/database/ChannelPropertyTable.cpp#L176-L198

Challado commented 1 month ago

Some observations: 1 - The value in actual table isn't "scheme_version", is "version" 2 - Another tables (as I see in code) have some changes, like server as virtual_server, etc

These changes impact enormously on migration from older to these new version.

Krzmbrzl commented 1 month ago

These changes impact enormously on migration from older to these new version.

Of course. But users don't have to worry about any of that as the server will self-migrate the database (as it always has).

dexgs commented 2 weeks ago

Even after manually renaming the columns in the meta table, I get [ERROR]: sqlite3_statement_backend::prepare: table "config" already exists while preparing "CREATE TABLE "config" ("server_id" INTEGER NOT NULL, "config_name" VARCHAR(255) NOT NULL, "config_value" TEXT NOT NULL, PRIMARY KEY ("server_id", "config_name"), FOREIGN KEY ("server_id") REFERENCES "virtual_servers"("server_id") ON UPDATE CASCADE ON DELETE CASCADE)".

It seems like migrations to the latest database schema have some issues.

dexgs commented 2 weeks ago

If I go on to remove the config table (it can be re-populated from my config file, and I don't have any virtual server configuration I'm worried about losing), then I get an error about the channels table already existing. It looks like the migrations aren't running at all.

dexgs commented 2 weeks ago

To be more specific, It looks like the mumble server isn't trying to run the migrations and is instead trying to populate the database as if it were empty, but this is failing because many of the tables already exist.

Challado commented 1 week ago

To be more specific, It looks like the mumble server isn't trying to run the migrations and is instead trying to populate the database as if it were empty, but this is failing because many of the tables already exist.

Yep, exactly, and I have a server with lots of config, it's too expensive re-create / re-migrate all to new installation. Waiting a viable solution to migrate to these new version.

hamilton5 commented 1 week ago

Renaming the columns in the meta table, and changing version to scheme_version I get this now:

[ERROR]: Failed at renaming table: sqlite3_statement_backend::prepare: table sqlite_sequence may not be altered while preparing "ALTER TABLE "sqlite_sequence" RENAME TO "sqlite_sequence_old"".

I thought it had to do with using autoincrement. I found these two files interesting. Notice there is no autoincrement on server_id and group_id anymore.

https://raw.githubusercontent.com/mumble-voip/mumble/9ce0bce62206de69edc9f7513770f8e671621f3a/src/murmur/Tables https://raw.githubusercontent.com/mumble-voip/mumble/9ce0bce62206de69edc9f7513770f8e671621f3a/src/murmur/NewTables

I tried unsetting autoincrement on both, and running delete from sqlite_sequence where name= on the two tables, but not successful I'm still getting the same error. I read "application code cannot drop the sqlite_sequence table". So this is probably a separate issue but also related to migrating tables: https://github.com/mumble-voip/mumble/blob/4c60c33e60658a1f0b85f40677907029edea621b/src/database/Database.cpp#L722

I did this in that code and now I get:

[ERROR]: Failed at migrating table "channel_properties" from scheme version 9 to 10

const std::string mytest = "sqlite_sequence";

if ((currentTableName.compare(mytest)) == 0)
    continue;
Krzmbrzl commented 5 days ago

FYI: I am now actively investigating this issue and working on a fix.

EDIT: I mistyped not instead of now which completely changed the meaning of what I wanted to say. I really mean that I do work on it though πŸ‘†

Krzmbrzl commented 5 days ago

Please check out https://github.com/mumble-voip/mumble/pull/6806 and let me know whether this fixes the database migration issues you are seeing.

Challado commented 5 days ago

Please check out #6806 and let me know whether this fixes the database migration issues you are seeing.

Error still continue, but without details. ./mumble-server -fg -v

2025-05-10 17:47:55.013 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-05-10 17:47:55.013 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-05-10 17:47:55.015 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-05-10 17:47:55.015 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-05-10 17:47:55.016 Successfully switched to uid 1008 2025-05-10 17:47:55.016 Resource limits were 0 0 2025-05-10 17:47:55.016 Successfully dropped capabilities [ERROR]: Failed at migrating table "channel_properties" from scheme version 9 to 10
Krzmbrzl commented 4 days ago

~~Could you send me an excerpt of your channel properties table? That would facilitate debugging. ~~

Krzmbrzl commented 4 days ago

Scratch that - I have updated #6806 and now the underlying error should be printed as well. So please pull the latest changes and run the migration again. This time, the error message should be more informative.

hamilton5 commented 4 days ago

[ERROR]: Exiting due to unhandled exception: Failed at migrating table "channel_properties" from scheme version 9 to 10 sqlite3_statement_backend::loadOne: database disk image is malformed while executing "INSERT INTO "channel_properties" ("server_id", "channel_id", "property_key", "property_value") SELECT "server_id", "channel_id", "key", value FROM "channel_info_old"".

my channel_info table: Image

Image

Krzmbrzl commented 4 days ago

Hm weird. Could you try the steps described in https://stackoverflow.com/q/5274202 to see if that fixes the issue? :thinking:

hamilton5 commented 4 days ago

Hm weird. Could you try the steps described in https://stackoverflow.com/q/5274202 to see if that fixes the issue? πŸ€”

I did run a pragma integrity_check; and it found page 23 to be empty, so then I ran vaccum, now the integrity_check comes back OK. And it looks like the migration issue is solved. Thanks.

I am getting this now Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings. My config did not change, and when I tried to set the names as shown (no caps) I got Not registering server as public. It has something to do with the qurlRegWeb.isValid() check and setting registerUrl=http://www.mumble.info/ still failed. Ignoring that check resulted in: Registration failed: Error transferring https://publist-registration.mumble.info/v1/register - server replied: Bad Request

2nd thing that's different is the server is listening on port + 1 now - I figured this one out but it was unexpected: https://github.com/mumble-voip/mumble/commit/128f4ec3397325b8c657cf9fdd7335ffc94e305d

Challado commented 2 days ago

Scratch that - I have updated #6806 and now the underlying error should be printed as well. So please pull the latest changes and run the migration again. This time, the error message should be more informative.

My database is mysql. Upgraded to PR specified doesn't increase output. Output still continues as follow: ./mumble-server -fg -v

2025-05-13 09:50:03.532 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-05-13 09:50:03.533 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-05-13 09:50:03.534 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-05-13 09:50:03.534 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-05-13 09:50:03.535 Successfully switched to uid 1008 2025-05-13 09:50:03.535 Resource limits were 0 0 2025-05-13 09:50:03.535 Successfully dropped capabilities [ERROR]: Failed at migrating table "channel_properties" from scheme version 9 to 10
Krzmbrzl commented 2 days ago

Upgraded to PR specified doesn't increase output

@Challado dumb question but did you also recompile? 'cause I can't find any code path that outputs [ERROR]: that would result in the message you are seeing πŸ‘€

$ find -type f -exec fgrep -n "[ERROR]" {} /dev/null \;
./database/ConversionUtils.cpp:37:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to string: "
./database/ConversionUtils.cpp:91:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to JSON: " << props.get_data_type()
./mumble/API_v_1_x_x.cpp:101:           printf("[ERROR]: Plugin with ID %d leaked memory from a call to API function \"%s\"\n", entry.m_pluginID,
./murmur/DBWrapper.cpp:99:              std::cerr << "[ERROR]: Encountered database error:" << std::endl; \
./murmur/main.cpp:629:          std::cerr << "[ERROR]: Exiting due to unhandled exception:\n";
./murmur/main.cpp:633:          std::cerr << "[ERROR]: Caught unknown error (this is a bug, please report it)" << std::endl;

Just to be sure though, I extended the try block that should now catch more errors.


@hamilton5

2nd thing that's different is the server is listening on port + 1 now

yeah, I noticed that as well. I am currently working on a fix for that.

I am getting this now Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings. My config did not change, and when I tried to set the names as shown (no caps) I got Not registering server as public.

Alright, thanks for letting me know. I'll look into that.

Challado commented 2 days ago

Upgraded to PR specified doesn't increase output

@Challado dumb question but did you also recompile? 'cause I can't find any code path that outputs [ERROR]: that would result in the message you are seeing πŸ‘€

$ find -type f -exec fgrep -n "[ERROR]" {} /dev/null \;
./database/ConversionUtils.cpp:37:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to string: "
./database/ConversionUtils.cpp:91:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to JSON: " << props.get_data_type()
./mumble/API_v_1_x_x.cpp:101:           printf("[ERROR]: Plugin with ID %d leaked memory from a call to API function \"%s\"\n", entry.m_pluginID,
./murmur/DBWrapper.cpp:99:              std::cerr << "[ERROR]: Encountered database error:" << std::endl; \
./murmur/main.cpp:629:          std::cerr << "[ERROR]: Exiting due to unhandled exception:\n";
./murmur/main.cpp:633:          std::cerr << "[ERROR]: Caught unknown error (this is a bug, please report it)" << std::endl;

Just to be sure though, I extended the try block that should now catch more errors.

@hamilton5

2nd thing that's different is the server is listening on port + 1 now

yeah, I noticed that as well. I am currently working on a fix for that.

I am getting this now Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings. My config did not change, and when I tried to set the names as shown (no caps) I got Not registering server as public.

Alright, thanks for letting me know. I'll look into that.

Really, my fault. ./mumble-server -fg -v

2025-05-13 14:12:20.282 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-05-13 14:12:20.282 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-05-13 14:12:20.284 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-05-13 14:12:20.285 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-05-13 14:12:20.285 Successfully switched to uid 1008 2025-05-13 14:12:20.285 Resource limits were 0 0 2025-05-13 14:12:20.285 Successfully dropped capabilities 2025-05-13 14:12:21.873 Performed initial PBKDF2 benchmark. Will use 32000 iterations as default 2025-05-13 14:12:21.874 Murmur 1.6.0 running on Linux: Debian GNU/Linux trixie/sid [x64]: Booting servers 2025-05-13 14:12:21.902 Created new server default instance 2025-05-13 14:12:21.913 0 => Server listening on 0.0.0.0:64738 2025-05-13 14:12:21.916 0 => Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings. Success on migrate from old schema now.
hamilton5 commented 2 days ago

Same error for me as before, still needs manual intervention; A vacuum to clean up the empty page.

[ERROR]: Exiting due to unhandled exception: Failed at migrating table "channel_properties" from scheme version 9 to 10 sqlite3_statement_backend::loadOne: database disk image is malformed while executing "INSERT INTO "channel_properties" ("server_id", "channel_id", "property_key", "property_value") SELECT "server_id", "channel_id", "key", value FROM "channel_info_old"".

This is what is shown when I do pragma integrity_check; before the vacuum:

in database main Freelist: size is 538 but should be 539 Page 23: never used

Challado commented 2 days ago

Upgraded to PR specified doesn't increase output

@Challado dumb question but did you also recompile? 'cause I can't find any code path that outputs [ERROR]: that would result in the message you are seeing πŸ‘€

$ find -type f -exec fgrep -n "[ERROR]" {} /dev/null \;
./database/ConversionUtils.cpp:37:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to string: "
./database/ConversionUtils.cpp:91:                                      std::cerr << "[ERROR]: Tried to convert unsupported DB datatype to JSON: " << props.get_data_type()
./mumble/API_v_1_x_x.cpp:101:           printf("[ERROR]: Plugin with ID %d leaked memory from a call to API function \"%s\"\n", entry.m_pluginID,
./murmur/DBWrapper.cpp:99:              std::cerr << "[ERROR]: Encountered database error:" << std::endl; \
./murmur/main.cpp:629:          std::cerr << "[ERROR]: Exiting due to unhandled exception:\n";
./murmur/main.cpp:633:          std::cerr << "[ERROR]: Caught unknown error (this is a bug, please report it)" << std::endl;

Just to be sure though, I extended the try block that should now catch more errors. @hamilton5

2nd thing that's different is the server is listening on port + 1 now

yeah, I noticed that as well. I am currently working on a fix for that.

I am getting this now Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings. My config did not change, and when I tried to set the names as shown (no caps) I got Not registering server as public.

Alright, thanks for letting me know. I'll look into that.

Really, my fault. ./mumble-server -fg -v 2025-05-13 14:12:20.282 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-05-13 14:12:20.282 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-05-13 14:12:20.284 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-05-13 14:12:20.285 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-05-13 14:12:20.285 Successfully switched to uid 1008 2025-05-13 14:12:20.285 Resource limits were 0 0 2025-05-13 14:12:20.285 Successfully dropped capabilities 2025-05-13 14:12:21.873 Performed initial PBKDF2 benchmark. Will use 32000 iterations as default 2025-05-13 14:12:21.874 Murmur 1.6.0 running on Linux: Debian GNU/Linux trixie/sid [x64]: Booting servers 2025-05-13 14:12:21.902 Created new server default instance 2025-05-13 14:12:21.913 0 => Server listening on 0.0.0.0:64738 2025-05-13 14:12:21.916 0 => Registration needs nonempty 'registername', 'registerpassword' and 'registerurl', must have an empty 'password' and allowed pings.

Success on migrate from old schema now.

Sorry for my mistake. I won't see that config is wrong and system have created a new database.

I have dropped new DB, import info again from old backup and error still occurs, as follows. ./mumble-server -fg -v

2025-05-13 15:05:13.735 SSL: OpenSSL version is 'OpenSSL 3.5.0 8 Apr 2025' 2025-05-13 15:05:13.735 Initializing settings from /etc/mumble-server.ini (basepath /etc) 2025-05-13 15:05:13.737 MetaParams: Adding 1 intermediate certificates from certificate file. 2025-05-13 15:05:13.737 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA" 2025-05-13 15:05:13.738 Successfully switched to uid 1008 2025-05-13 15:05:13.738 Resource limits were 0 0 2025-05-13 15:05:13.738 Successfully dropped capabilities [ERROR]: Exiting due to unhandled exception: Failed at migrating table "channel_properties" from scheme version 9 to 10 Column 'property_value' cannot be null while executing "INSERT INTO "channel_properties" ("server_id", "channel_id", "property_key", "property_value") SELECT "server_id", "channel_id", "key", value FROM "channel_info_old"".
Krzmbrzl commented 2 days ago

@Challado you seem to be having a NULL entry in the channel_info table. You'll have to clean that up manually before migrating.

@hamilton5 The integrity error is something that is outside of Mumble's scope, I am afraid. No idea how this can happen but I don't think Mumble can do anything about it.

hamilton5 commented 2 days ago

Oh, I thought it would be as easy as running the vacuum call first. Could it break something maybe, If you can check for NULL entries, only do it then? My db file is very old, It went from 2289 to 1732 KB. I assume @Challado has the same problem.

Challado commented 2 days ago

@Challado you seem to be having a NULL entry in the channel_info table. You'll have to clean that up manually before migrating.

@hamilton5 The integrity error is something that is outside of Mumble's scope, I am afraid. No idea how this can happen but I don't think Mumble can do anything about it.

Really I have a NULL on field. Solved and migration is now consistent, BUT, server refuses to listen on port defined on /etc/mumble-server.ini (64378). Only listening on 64379.

Krzmbrzl commented 2 days ago

I thought it would be as easy as running the vacuum call first. Could it break something maybe, If you can check for NULL entries, only do it then?

Maybe. But truth is, I have no idea and I prefer to err on the side of caution. I don't want to automatically issue commands that may remove data.

BUT, server refuses to listen on port defined on /etc/mumble-server.ini (64378). Only listening on 64379.

Yeah, that's the issue with server IDs now starting at zero instead of one. I'm working on a fix for that.