Somehow some user shortcuts with non-UTF-8 text got into your user database.
One of your user shortcuts seems to expand to
�
3
I wonder how this is possible and how that got into the database.
Can you remember which user shortcuts you defined and how?
I think I will add a workaround to ignore non-UTF-8 data in the database as described in:
https://stackoverflow.com/questions/22751363/sqlite3-operationalerror-could-not-decode-to-utf-8-column https://stackoverflow.com/questions/23508153/python-encoding-could-not-decode-to-utf8/23509002#23509002
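For reference, the workaround from those answers might look roughly like this in Python. It is only a minimal sketch, not the actual typing booster code: the helper name open_tolerant is made up for illustration, the table is a simplified stand-in, and replacing undecodable bytes with U+FFFD (instead of dropping them) is an assumption.

```python
import sqlite3


def open_tolerant(path):
    """Open an SQLite database whose TEXT columns may hold invalid UTF-8."""
    connection = sqlite3.connect(path)
    # text_factory receives the raw bytes of every TEXT value. Undecodable
    # bytes become U+FFFD here instead of triggering a decode error.
    connection.text_factory = lambda raw: raw.decode("utf-8", errors="replace")
    return connection


connection = open_tolerant(":memory:")
connection.execute("CREATE TABLE phrases (input_phrase TEXT, phrase TEXT)")
# CAST(blob AS TEXT) stores the bytes unchanged, so this row holds broken UTF-8.
connection.execute(
    "INSERT INTO phrases VALUES (?, CAST(? AS TEXT))", ("\\bad", b"\xff3"))
rows = connection.execute("SELECT input_phrase, phrase FROM phrases").fetchall()
print(rows)
```

With errors="ignore" instead of errors="replace" the broken bytes would be dropped silently, which hides the problem but also hides the evidence.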
But it would be interesting to find out how it was possible to write that broken text into the database.
Is it possible that your database contains any secret, private stuff? If not, I would be interested in getting a copy of your database to investigate it.
~/.local/share/ibus-typing-booster/user.db
You could make a backup copy of your “broken” user database like this:
cp ~/.local/share/ibus-typing-booster/user.db ~/.local/share/ibus-typing-booster/user.db-backup
and then delete it
rm ~/.local/share/ibus-typing-booster/user.db
and start with an empty database again.
But of course you will lose all the data ibus typing booster learned from your typing that way.
If it is OK for you, you could send me the backup copy of the broken database.
If you don’t want to send me the database, it might also give us some insights to list the user shortcuts with sqlite3, as in this example:
$ sqlite3 ~/.local/share/ibus-typing-booster/user.db
SQLite version 3.40.0 2022-11-16 12:10:08
Enter ".help" for usage hints.
sqlite> SELECT input_phrase, phrase FROM phrases WHERE user_freq >= 1000000;
\em|something@somewhere.com
\test|first Line
Second Line
Last line End
\test|foobar
\asdf|hello√world
sqlite>
Although I could not reproduce this yet, I attempted a fix.
Here are test builds for 2.19.11 for Fedora 36, 37, and rawhide:
https://copr.fedorainfracloud.org/coprs/mfabian/ibus-typing-booster/builds/
You wrote that you are using Fedora. Can you please try these test builds and check whether they fix the problem for you?
That didn't work. I got nearly the same error message, but it said that my disk image is broken or something like that, so I removed my old user.db file and now it works! Thank you! Here is my old user.db: user-db.zip
Thank you for sending me the database! It doesn’t seem to contain any secret stuff and it seems to be very short and is a nice test case for a broken database!
When I try to look at it with the sqlite3 command line tool I see problems already:
mfabian@hathi:/tmp
$ sqlite3 user.db
SQLite version 3.40.0 2022-11-16 12:10:08
Enter ".help" for usage hints.
sqlite> select * from phrases;
...
51||||||
4384||�
3||||1.0
101|0|| ha|behab|eich|hal
32|habe einen|habe einen|eni|chhal|1|2.10453995531668e+214
33|hal|Hallo|||1|1654891573.39778
...
45|bin|bin|ich|hallo|1|1654891665.37563
46|m|Mensch|ein|bin|1|1654891668.83022
Runtime error: database disk image is malformed (11)
sqlite>
The line with id 4384 is the line with the broken UTF-8 encoding. The improvement I made in the 2.19.11 test builds keeps typing booster from choking on that line.
But that database has more problems:
The line with id 51 is completely empty, even the final timestamp is not there.
The line with id 4384 with the broken UTF-8 encoding has a timestamp of 1.0, which is weird: the timestamp should be the number of seconds since January 1st, 1970.
The line with id 32 has a timestamp which is impossibly big.
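The timestamp oddities listed above could be detected with a simple range check. The following is a hedged sketch only: the function name is_plausible_timestamp, the lower bound of January 1st, 2000 and the one-day tolerance for clock differences are all assumptions chosen for illustration, not anything typing booster is known to use.

```python
import time
from datetime import datetime, timezone

# Assumed lower bound: no record can have been learned before this date.
EARLIEST = datetime(2000, 1, 1, tzinfo=timezone.utc).timestamp()
# Assumed tolerance for small clock differences, in seconds.
ONE_DAY = 86400.0


def is_plausible_timestamp(value, now=None):
    """Return True if value looks like seconds since January 1st, 1970."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False  # Missing or non-numeric, like the row with id 51.
    if now is None:
        now = time.time()
    return EARLIEST <= value <= now + ONE_DAY


for candidate in (None, 1.0, 2.10453995531668e+214, 1654891573.39778):
    print(candidate, is_plausible_timestamp(candidate))
```

Only the last value, taken from the row with id 33, passes the check.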
Some lines cannot be selected:
sqlite> select * from phrases where id=32;
sqlite> select * from phrases where id=33;
sqlite>
The line with id 33 looks fine but cannot be selected anyway.
Other lines can be selected:
sqlite> select * from phrases where id=46;
46|m|Mensch|ein|bin|1|1654891668.83022
sqlite>
So yes, this database is broken.
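Damage like this can also be confirmed without selecting rows by hand, using SQLite's built-in PRAGMA integrity_check. The sketch below is an illustration under assumptions: the helper name integrity_problems is made up, and this is not necessarily the check typing booster performs.

```python
import sqlite3


def integrity_problems(path):
    """Return a list of problem descriptions; an empty list means healthy."""
    try:
        connection = sqlite3.connect(path)
        try:
            rows = connection.execute("PRAGMA integrity_check").fetchall()
        finally:
            connection.close()
    except sqlite3.Error as error:
        # Badly damaged files can make the check itself fail.
        return [str(error)]
    messages = [row[0] for row in rows]
    # A healthy database reports exactly one row containing "ok".
    return [] if messages == ["ok"] else messages


print(integrity_problems(":memory:"))
```

Calling it with the path of the damaged user.db should return a non-empty list, either the findings of the check or the error raised while running it.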
I wonder how this could happen; this has never happened to me so far.
Maybe I should automatically check the database for such errors when typing booster or the typing booster setup tool starts and, if the database is found to be broken, print a message to the log file that the database is broken, delete it and start with a new empty database? Or maybe not delete it but move it to user.db.corrupted-2022-12-24 to keep it for investigating what happened?
That would “fix” such problems automatically, but of course starting with a new, empty database means that all the data typing booster learned from the user’s typing is lost.
But it is probably much better than failing completely and not starting at all.
I did this now and included it in release 2.19.12:
https://github.com/mike-fabian/ibus-typing-booster/releases/tag/2.19.12
It still contains the code to work around database entries which have invalid UTF-8 encoding like the test build 2.19.11 had.
But on top of that it also checks whether all records in the database are readable.
If trying to read all records raises an exception, then the database is probably damaged.
In that case I create a backup copy of the damaged database and create a new empty database.
This has the same effect you achieved by deleting user.db manually, but it should make this automatic.
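The recovery steps just described could be sketched as follows. This is not the code from release 2.19.12: the function names, the backup file name pattern and the minimal table schema are assumptions for illustration, and the real phrases table has more columns.

```python
import os
import sqlite3
import tempfile
import time


def all_records_readable(path):
    """Return True if every row of the phrases table can be fetched."""
    try:
        connection = sqlite3.connect(path)
        try:
            # Fetch TEXT as raw bytes so that invalid UTF-8 alone does not
            # count as damage; only failures to read a record do.
            connection.text_factory = bytes
            for _row in connection.execute("SELECT * FROM phrases"):
                pass
        finally:
            connection.close()
    except sqlite3.Error:
        return False
    return True


def recover_if_damaged(path):
    """Move an unreadable database aside and create an empty one.

    Returns the backup path, or None if nothing had to be done.
    """
    if not os.path.exists(path) or all_records_readable(path):
        return None
    backup = f"{path}.corrupted-{time.strftime('%Y-%m-%d')}"
    os.replace(path, backup)
    connection = sqlite3.connect(path)
    # Hypothetical minimal schema, only for this sketch.
    connection.execute(
        "CREATE TABLE phrases (id INTEGER PRIMARY KEY, input_phrase TEXT, "
        "phrase TEXT, user_freq INTEGER)")
    connection.commit()
    connection.close()
    return backup


# Demonstration on a throwaway file that is not a valid database at all.
with tempfile.TemporaryDirectory() as directory:
    demo_path = os.path.join(directory, "user.db")
    with open(demo_path, "wb") as handle:
        handle.write(b"this is not an SQLite database\n" * 200)
    backup_path = recover_if_damaged(demo_path)
    backup_exists = backup_path is not None and os.path.exists(backup_path)
    readable_after = all_records_readable(demo_path)
    second_run = recover_if_damaged(demo_path)
```

Note that os.replace overwrites an existing backup with the same name, so a second corruption on the same day would replace the first backup in this sketch.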
I can't open the preferences for ibus-typing-booster in GNOME Settings and from
python3 /usr/share/ibus-typing-booster/setup/main.py
I get this error:
I'm using Fedora and version 2.19.10