Forum is very slow - Githubissues

danielb987 commented 1 year ago

Thank you for your work on MyLittleForum!

We have a MyLittleForum with version 20220803.1 at https://www.jvmv2.se/forum/index.php

The forum is very slow so I did a couple of things to identify why:

I added this code in the beginning of index.php:

// Daniel
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$start_time = hrtime(true);
// Daniel

I added this code at the end of index.php:

echo "<p><hr><p>\n";

function cmp($a, $b)
{
    if ($a->time == $b->time) {
        return 0;
    }
    return ($a->time < $b->time) ? -1 : 1;
}

usort($queries, "cmp");

foreach ($queries as $query) {
    $t = round($query->time/1000/1000,0);
    if ($t >= 10) {
        echo "Time: ".$t."<br>".$query->query."<hr>\n";
    }
}

$time = hrtime(true) - $start_time;
echo "<p>Total time: ".round($time/1000/1000,0)."<p>\n";

I added this code to the end of config/db_settings.php:

$queries = array();

class DanielQuery {
    public $query;
    public $time;
}

function daniel_mysqli_query(mysqli $mysql, string $query, int $result_mode = MYSQLI_STORE_RESULT) {
    global $queries;

//  echo $query."<br>\n";
    $start_time = hrtime(true);
    $result = mysqli_query($mysql, $query, $result_mode);
    $time = hrtime(true) - $start_time;

    $theQuery = new DanielQuery();
    $theQuery->query = $query;
    $theQuery->time = $time;
    $queries[] = $theQuery;

    return $result;
}

And I did a search and replace of all occurrences of @mysqli_query to daniel_mysqli_query.

The result is this. The time is shown in the number of milliseconds and only queries that take 10 seconds or more are shown.

Time: 100
SELECT id, tid FROM mlf2_entries LEFT JOIN mlf2_akismet_rating ON mlf2_akismet_rating.eid = mlf2_entries.id LEFT JOIN mlf2_b8_rating ON mlf2_b8_rating.eid = mlf2_entries.id WHERE ((pid = 0 AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)) OR (pid = 0 AND sticky = 2)) AND (mlf2_akismet_rating.spam = 0 AND mlf2_b8_rating.spam = 0) ORDER BY sticky DESC, time DESC LIMIT 0, 30

Time: 222
SELECT COUNT(*) FROM mlf2_entries LEFT JOIN mlf2_akismet_rating ON mlf2_akismet_rating.eid = mlf2_entries.id LEFT JOIN mlf2_b8_rating ON mlf2_b8_rating.eid = mlf2_entries.id WHERE pid = 0 AND (mlf2_akismet_rating.spam = 0 AND mlf2_b8_rating.spam = 0) AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)

Time: 788
SELECT COUNT(*) FROM mlf2_entries LEFT JOIN mlf2_akismet_rating ON mlf2_akismet_rating.eid = mlf2_entries.id LEFT JOIN mlf2_b8_rating ON mlf2_b8_rating.eid = mlf2_entries.id WHERE (mlf2_akismet_rating.spam = 1 OR mlf2_b8_rating.spam = 1)

Time: 966
SELECT COUNT(*) FROM mlf2_entries LEFT JOIN mlf2_akismet_rating ON mlf2_akismet_rating.eid = mlf2_entries.id LEFT JOIN mlf2_b8_rating ON mlf2_b8_rating.eid = mlf2_entries.id WHERE category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27) AND (mlf2_akismet_rating.spam = 0 AND mlf2_b8_rating.spam = 0)

Time: 3356
SELECT id, pid, tid, name, user_name, ft.user_id, UNIX_TIMESTAMP(ft.time) AS time, UNIX_TIMESTAMP(ft.time + INTERVAL 0 MINUTE) AS timestamp, UNIX_TIMESTAMP(last_reply) AS last_reply, subject, category, rst.user_id AS req_user FROM mlf2_entries AS ft LEFT JOIN mlf2_userdata ON mlf2_userdata.user_id = ft.user_id LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id AND rst.user_id = 0 LEFT JOIN mlf2_akismet_rating ON mlf2_akismet_rating.`eid` = `ft`.`id` LEFT JOIN mlf2_b8_rating ON mlf2_b8_rating.`eid` = `ft`.`id` WHERE (mlf2_akismet_rating.spam = 0 AND mlf2_b8_rating.spam = 0) AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27) ORDER BY ft.time DESC LIMIT 40

Total time: 5484

You can with the result on this forum: https://www.jvmv2.se/daniel/forum/ The result is on the bottom of the page.

The conclusion is that several queries take 100 milliseconds or more. And that one query takes several seconds. The solution is probably to add one or more indexes to one or more tables.

auge8472 commented 1 year ago

Thank you for your investigation. I'll take a look at it.

loesler commented 1 year ago

The solution is probably to add one or more indexes to one or more tables.

That is true. However, if you check the table definitions, almost all (relevant) columns are already indexed.

auge8472 commented 1 year ago

I compared all of the mentioned queries with the table definitions. Literally all columns, that are involved in any JOIN and any WHERE clause is actually indexed. So there is nothing we can do at this "front".

In your measurement protocol, three queries stand out as very slow. The outcome is from your forum and a measurement in another forum will result in different numbers but I expect the ratio of the numbers to be at least similar.

`COUNT()` (two affected queries)

First one is the count of spam entries in the selection of accessible categories (788ms). Second one is the corresponding count of non-spam entries with the further contraint of being in the list of accessible categories (966ms). Even the first count query for threads in the category selection of the current user –which I do not want to take into account here– takes 222ms in your measurement.

All my investigations came to one conclusion: With table type InnoDB, count queries takes the time because of checks for transaction safety even there is no other transaction at the time of execution the count query. Table type MyISAM would perform this task much faster but would came at the cost of missing features elsewhere.

All sources recommend to waive the count queries during the normal operations and to use a dedicated table that got filled with the necessary data by INSERT and DELETE triggers on the entries table. We would need additionally UPDATE triggers for the case of spam or ham classification (because the count of remarkable entries/threads could change). The consequence would be to count the number of threads and entries for the different user ranks every time, a thread or entry got created, (re)-classified or deleted but not every time, a main view or thread gets requested.

Additionally there are reports about OR being time consuming. This is in example the case in the 788ms query, where the query limits the data to count having a corresponding value 1 in one table or value 1 in another one (WHERE (mlf2_akismet_rating.spam = 1 OR mlf2_b8_rating.spam = 1)).

A complex task here would be the different user ranks (unregistered visitors, registered users (possibly with different selections of displayed categories), moderators, administrators and (possibly in the future) user groups with access to different category selections) with their different views. This are to many differing preconditions for simple rules if it should work exact. It would be possible to simplify it with counting the items for unregistered visitors, registered users as well as moderators and administrators without taking the users own category selections into account.

Triggers was introduced in MySQL with version 5.0, but with a limited feature set. I never ever worked with triggers in MySQL until now (I only worked with them in MS SQL). So I have absolutely no experience with syntax and performance of MySQL-triggers at the moment.

Entries listings (for thread tree building)

This one is with 3356ms (in your measurement) and with this nearly half of the execution time for database queries obviously the elephant in the room. Even in this query all involved columns are indexed. I would expect this to be (at least in part) a result of the four LEFT JOINs. I have currently no idea for optimisations in this case. 😟

danielb987 commented 1 year ago

@auge8472 Thank you for your investigation! It seems that there is not much to do about it.

auge8472 commented 1 year ago

It seems that there is not much to do about it.

We'll see. In other software products one does not see such blatant performance issues so we have to search for better solutions than we have now.

auge8472 commented 1 year ago

I checked the longest lasting query in phpMyAdmin (not the most convinient tool for this purpose) with EXPLAIN putting EXPLAIN FORMAT=JSON before the keyword SELECT and STRAIGHT_JOIN behind it. I have only my testing and development forums, the largest with only 87 entries, so the performance in itself is satisfying.

EXPLAIN FORMAT=JSON SELECT STRAIGHT_JOIN id, pid, tid …

I had to change the category selection and to adapt the table names to mine but apart from these modifications the query remained the same. Even with a satisfying performance I found a serious issue.

With only three category IDs (two exists in my forum, the third is not existing here) the explain was inconspicuous. The query selected 17 out of the 87 rows from mlf2_entries, joined 17 rows from mlf_userdata, mlf2_read_entries, mlf2_b8_rating and mlf2_akismet_rating. Per scan (selected main row) only one row of the joined tables got "examined". That's what I would expect. The indexes worked, the query found the matching entries from the other tables with the existing indexes.

        {
          "table": {
            "table_name": "mlf24_akismet_rating",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "eid"
            ],
            "key_length": "4",
            "ref": [
              "db53644.ft.id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 1,
            "filtered": "10.00",
            "cost_info": {
              "read_cost": "17.00",
              "eval_cost": "0.17",
              "prefix_cost": "69.80",
              "data_read_per_join": "13"
            },
            "used_columns": [
              "eid",
              "spam"
            ],
            "attached_condition": "(`db53644`.`mlf24_akismet_rating`.`spam` = 0)"
          }
        }

Changing the category selection to all categories of my forum by adding all IDs to the IN() group changed the situation completely. Now the query selected 86 out of 87 rows from mlf2_entries (there is one row without a category), joined 86 rows from mlf_userdata, mlf2_read_entries, mlf2_b8_rating with only one examined row per scan. But now the query was not able to use the index e_id in the table mlf2_akismet_rating. Because of that the query performed a long lasting full table scan instead.

        {
          "table": {
            "table_name": "mlf24_akismet_rating",
            "access_type": "ALL",
            "possible_keys": [
              "PRIMARY"
            ],
            "rows_examined_per_scan": 86,
            "rows_produced_per_join": 85,
            "filtered": "1.16",
            "using_join_buffer": "hash join",
            "cost_info": {
              "read_cost": "16.20",
              "eval_cost": "8.60",
              "prefix_cost": "293.06",
              "data_read_per_join": "687"
            },
            "used_columns": [
              "eid",
              "spam"
            ],
            "attached_condition": "((`db_name`.`mlf24_akismet_rating`.`eid` = `db_name`.`ft`.`id`) and (`db_name`.`mlf24_akismet_rating`.`spam` = 0))"
          }
        }

For comparision the corresponding block for the join with table mlf24_b8_rating in the query with the 86 resulting lines.

        {
          "table": {
            "table_name": "mlf24_b8_rating",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "B8_spam"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "eid"
            ],
            "key_length": "4",
            "ref": [
              "db53644.ft.id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 85,
            "filtered": "100.00",
            "cost_info": {
              "read_cost": "1.00",
              "eval_cost": "8.60",
              "prefix_cost": "302.66",
              "data_read_per_join": "687"
            },
            "used_columns": [
              "eid",
              "spam"
            ],
            "attached_condition": "(`db53644`.`mlf24_b8_rating`.`spam` = 0)"
          }
        }

The big differences are "rows_examined_per_scan": 1 in the table mlf24_b8_rating versus "rows_examined_per_scan": 86 in the table mlf24_akismet_rating and (presumably a result of this difference) the read cost of 1.00 for the table mlf24_b8_rating versus 16.20 for the table mlf24_akismet_rating. Additionally there is a difference between the first run with WHERE categories IN(1, 2, 3) (category 2 does not exist) versus the second run with all categories in the IN-clause. In the first case we can see "attached_condition": "(db_name.mlf24_akismet_rating.spam = 0)", in the second and slow case we see "attached_condition": "((db_name.mlf24_akismet_rating.eid = db_name.ft.id) and (db_name.mlf24_akismet_rating.spam = 0))" (removed the backticks around the table and column names for readability).

Further observations can be read in the project forum.

Can anyone confirm what I have observed? Or does someone sees a completely different outcome?

loesler commented 1 year ago

Hi,

no, I cannot confirm this behavior. Did you check the keys of both tables? If I interchange the JOIN statements b8 <--> akismet, the result changes.

What is happend, if we restrict the JOIN statements instead of using WHERE, i.e.,

SELECT id, pid, tid, name, user_name, ft.user_id, UNIX_TIMESTAMP(ft.time) AS time, UNIX_TIMESTAMP(ft.time + INTERVAL 0 MINUTE) AS timestamp, 
UNIX_TIMESTAMP(last_reply) AS last_reply, subject, category, rst.user_id AS req_user FROM mlf2_entries AS ft 

LEFT JOIN mlf2_userdata            ON mlf2_userdata.user_id = ft.user_id 
LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id AND rst.user_id = 0 

LEFT JOIN mlf2_b8_rating           ON mlf2_b8_rating.`eid` = `ft`.`id`      AND mlf2_b8_rating.spam = 0
LEFT JOIN mlf2_akismet_rating      ON mlf2_akismet_rating.`eid` = `ft`.`id` AND mlf2_akismet_rating.spam = 0

WHERE category IN
(0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27) ORDER BY ft.time

Is there a noticeable effect?

/MIcha

auge8472 commented 7 months ago

Long time ago ...

no, I cannot confirm this behavior. Did you check the keys of both tables? If I interchange the JOIN statements b8 <--> akismet, the result changes.

Due to the tests of the reorganised upgrade script, we became aware of missing indexes for the table mlf2_akismet_rating in case of an upgrade. The indexes exists in a forum, that was installed with the current stable release. I expect the missing indexes to be the cause of this observation I made.

auge8472 commented 7 months ago

The reorganisation of the upgrade process[^1] and the corrections to it[^2], which were made possible to a large extent by @joeiacoponi1's tests[^3] - I might not have noticed some things - and @joeiacoponi1's work on improving SQL performance[^4] should at least partially eliminate the cause of this report.

The corretions have not yet been added. I am therefore leaving the issue open, but I would like to state that we are close to a solution.

[^1]: in #680 [^2]: in #707 [^3]: in the project forum and in #708 and #709 [^4]: in #713

auge8472 commented 3 months ago

@danielb987 Even you have decided to move to phpBB with your forum after we wasn't able to solve the issue within a reasonable time, I would like to ask you to upload the new release to the archived MLF instance. It contains noticeable performance improvements. I am curious whether the changes will solve your performance problem of if it is related to another section of the several queries.

There was several changes regarding the way to join the spam-prevention related tables. All of the queries you have listed in your report are joining these tables. So it looks as there could be improvements. And should it only be for those who want to look into the archive, it cound be worth the work.

auge8472 commented 2 days ago

I'm closing here because I no longer expect a response.

My-Little-Forum / mylittleforum

Forum is very slow #673

`COUNT()` (two affected queries)

Entries listings (for thread tree building)

My-Little-Forum / mylittleforum

Forum is very slow #673

COUNT() (two affected queries)

Entries listings (for thread tree building)

`COUNT()` (two affected queries)