digitalmethodsinitiative / dmi-tcat

Digital Methods Initiative - Twitter Capture and Analysis Toolset
Apache License 2.0

Remove TokuDB Storage Engine #397

Open niczem opened 4 years ago

niczem commented 4 years ago

The TokuDB storage engine has been deprecated by upstream:

TokuDB is deprecated in the 8.0 series and will be supported through the 8.0 series until further notice. This storage engine will not be included in the next major release of Percona Server for MySQL. We recommend MyRocks as a long-term migration path.

Because MariaDB Server includes MyRocks as well, it makes sense to remove TokuDB in favor of MyRocks.

https://jira.mariadb.org/browse/MDEV-19780

chris1010010 commented 3 years ago

I'm getting this error message on Ubuntu 18, and TCAT has stopped working:

PHP Fatal error: Uncaught PDOException: SQLSTATE[42000]: Syntax error or access violation: 1286 Unknown storage engine 'TokuDB' in /var/www/dmi-tcat/capture/common/functions.php:391
Stack trace:
#0 /var/www/dmi-tcat/capture/common/functions.php(391): PDOStatement->execute()
#1 /var/www/dmi-tcat/capture/index.php(14): create_admin()
#2 {main}
 thrown in /var/www/dmi-tcat/capture/common/functions.php on line 391

And this is apparently caused by:

Transparent huge pages are enabled, according to /sys/kernel/mm/transparent_hugepage/enabled
[ERROR] TokuDB: Huge pages are enabled, disable them before continuing

A solution is described here: https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
Just replace mongodb.service with mysql.service.
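
To confirm whether the setting is what the error complains about, the status file can be inspected before and after applying the fix. Below is a minimal Python sketch, assuming the usual kernel format for that file, where the active mode is the word in square brackets (for example `[always] madvise never`). The function names are illustrative and not part of TCAT.

```python
import re
from pathlib import Path

# Path taken from the error message above.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode, i.e. the word wrapped in square brackets."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unrecognised THP status: {text!r}")
    return match.group(1)


def thp_is_disabled(text: str) -> bool:
    """True when the active mode is 'never', meaning huge pages are off."""
    return parse_thp_mode(text) == "never"


if __name__ == "__main__":
    if THP_PATH.exists():
        status = THP_PATH.read_text()
        print("active mode:", parse_thp_mode(status))
        print("disabled:", thp_is_disabled(status))
    else:
        print("no transparent huge page status file on this system")
```

If the script still reports a mode other than `never` after a reboot, the service override was probably not applied.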

alexandreteles commented 3 years ago

TokuDB was removed from MariaDB in version 10.6, and Percona is phasing it out entirely as well. I would +1 MyRocks, as it's widely adopted and is the recommended replacement for TokuDB on both MariaDB and Percona servers.

One thing I would like to point out: when MyRocks fails in a disk-full scenario, it can be as bad as losing all data touched by the query, the "good luck if you don't have a backup" kind of bad. Some query work might therefore be required to make sure that TCAT doesn't keep writing when the disk is full.
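
A rough sketch of what such a guard could look like, written in Python for brevity even though TCAT itself is PHP. It assumes the check happens in application code before each batch of writes; the threshold, the function names, and the `write_fn` callback are all hypothetical.

```python
import shutil

# Hypothetical safety margin: refuse to write with less than 5 GiB free.
MIN_FREE_BYTES = 5 * 1024**3


def has_enough_space(path: str, min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Check free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes


def guarded_write(path: str, write_fn, min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Run `write_fn` only if the disk has room; report whether it ran."""
    if not has_enough_space(path, min_free_bytes):
        return False
    write_fn()
    return True
```

A check like this only narrows the window, since the disk can still fill up between the check and the write, so it complements backups rather than replacing them.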

To be completely honest, I would prefer to see TCAT migrate away from MySQL and its derivatives entirely and move to Cassandra, which would make it easier to connect other analytics tools into analytics workflows and would give better write performance, considering how write-heavy TCAT can be. If running analytics and data mining is the top priority, and SQL must stay, MonetDB or Ingres/Actian X might be viable options.

I would love to help migrate to whatever the project decides, but the way queries are scattered all over the code makes it hard to follow if you don't have at least some previous knowledge of the codebase. Contributors might want to investigate consolidating the queries into a single file or class containing functions that define and run the queries and return their results.
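
To make the consolidation idea concrete, here is a small sketch of the pattern in Python, using the standard-library `sqlite3` module as a stand-in database so that it runs anywhere. TCAT is PHP on MySQL, so this only illustrates the shape; the class, table, and method names are hypothetical and do not reflect TCAT's schema.

```python
import sqlite3


class TweetQueries:
    """Hypothetical single home for SQL: statements are defined once here,
    and each method runs one statement and returns its result."""

    CREATE_TABLE = (
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    INSERT_TWEET = "INSERT INTO tweets (id, text) VALUES (?, ?)"
    COUNT_TWEETS = "SELECT COUNT(*) FROM tweets"

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def create_schema(self) -> None:
        self.conn.execute(self.CREATE_TABLE)

    def insert_tweet(self, tweet_id: int, text: str) -> None:
        # Parameters are bound by the driver, never interpolated into SQL.
        self.conn.execute(self.INSERT_TWEET, (tweet_id, text))
        self.conn.commit()

    def count_tweets(self) -> int:
        return self.conn.execute(self.COUNT_TWEETS).fetchone()[0]
```

With every statement in one place, a storage engine or database change means editing a single class instead of hunting through the codebase.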

alexandreteles commented 2 years ago

Do we have news on this from contributors?

P.S.: #348 might be relevant to this discussion. MyRocks is now available by default in modern MariaDB installations, and it's a much easier solution than rewriting the codebase to adopt something like MonetDB. Of course it would break older installations, but I think it's important to move forward with these changes now that TokuDB is deprecated.
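
For existing installations, a migration would roughly amount to finding the tables that still use TokuDB and converting each one. The Python sketch below only builds the SQL text and does not connect to a server. It assumes MariaDB, where the MyRocks engine is named `ROCKSDB`, and the helper and table names are made up for illustration.

```python
# Query to list tables still on TokuDB; the schema name is bound as a
# parameter by whichever MySQL driver runs it.
FIND_TOKUDB_TABLES = (
    "SELECT TABLE_NAME FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = %s AND ENGINE = 'TokuDB'"
)

# Engine name as used by MariaDB for MyRocks.
TARGET_ENGINE = "ROCKSDB"


def quote_identifier(name: str) -> str:
    """Backtick-quote a MySQL identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_engine_migration(tables: list[str]) -> list[str]:
    """Return one ALTER TABLE statement per table name."""
    return [
        f"ALTER TABLE {quote_identifier(table)} ENGINE={TARGET_ENGINE}"
        for table in tables
    ]


if __name__ == "__main__":
    # 'example_tweets' is a made-up table name.
    for statement in build_engine_migration(["example_tweets"]):
        print(statement)
```

Converting the engine rewrites each table, so on large captures it would need a backup first and enough free disk space for the copy. Any TokuDB-specific table options would have to be reviewed separately.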