nette / database

💾 A database layer with a familiar PDO-like API but much more powerful. Building queries, advanced joins, drivers for MySQL, PostgreSQL, SQLite, MS SQL Server and Oracle.
https://doc.nette.org/database
Other
512 stars 108 forks source link

Performance of ActiveRow::related() #312

Open MichaelPavlista opened 3 months ago

MichaelPavlista commented 3 months ago

During application performance analysis, I discovered that using the ActiveRow::related() method significantly slows down the application's execution time. The slowdown increases exponentially with the number of calls to this method.

Problem Description

The issue can be replicated with the following simple example, which compares the same logic using ActiveRow::related() versus manually constructed SQL.

Select batch of main and related rows using ActiveRow:

/** @var Explorer $databaseExplorer */
$databaseExplorer = $container->getByType(type: Explorer::class);

// Load batch rows from database.
$companyIdToRow = $databaseExplorer->table(table: 'company')
    ->select('*')
    ->order('id DESC')
    ->limit(BATCH_SIZE)
    ->fetchPairs('id');

/** @var ActiveRow $row */
foreach ($companyIdToRow as $row)
{
    // Load related rows.
    $relatedRows = $row->related('company_member')->fetchAll();
}

Select batch of main and related rows using manually constructed SQL:

// Select main rows.
$companyIdToRow = $databaseExplorer
    ->query('SELECT * FROM `company` ORDER BY `id` DESC LIMIT ?', BATCH_SIZE)
    ->fetchPairs('id');

// Create main rows ID list.
$companyIdList = array_keys($companyIdToRow);

// Select rows from the related table.
$companyIdToMemberRowList = $databaseExplorer
    ->query('SELECT `id`, `company_id` FROM `company_member` WHERE `company_id` IN (?)', $companyIdList)
    ->fetchAssoc('company_id[]->');

foreach ($companyIdToRow as $row)
{
    // Load related rows.
    $relatedRows = $companyIdToMemberRowList[$row['id']] ?? [];
}
Comparison of the variants Number of rows (=BATCH_SIZE) Execution Time [related] Execution Time [manual SQL] How much faster is manual SQL?
1 000 206 ms 130 ms 1.6x
2 500 480 ms 142 ms 3.4x
5 000 1 330 ms 160 ms 8.3x
10 000 4 560 ms 200 ms 22.8x
25 000 26 800 ms 328 ms 81.7x
50 000 134 120 ms 550 ms 243.8x

General Notes

Affected Versions

The issue is present in the latest version nette/database@3.2.2 and also exists in older versions.

Problematic Area

An analysis of the problematic area using Xdebug suggests that the majority of the script's runtime is spent repeatedly calling the SqlBuilder::getConditionHash method.

Expected Behavior

Using ActiveRow::related() should not significantly increase the script's runtime, at least in cases where no modifications are made to the selection before fetching.

# this code should always be fast
$relatedRows = $row->related('company_member')->fetchAll();

Docker Demo

I have prepared a demo of the issue (including a database), which can be easily run in Docker.

Setup

Test URLs:

You can change settings, such as the number of rows (BATCH_SIZE), in bootstrap.local.php.