Kreyu / data-table-bundle

Streamlines creation process of the data tables in Symfony applications. NOT PRODUCTION READY.
https://data-table-bundle.swroblewski.pl
MIT License
75 stars 14 forks source link

Weird pagination behaviour on aggregated columns #120

Closed davidromani closed 3 months ago

davidromani commented 3 months ago

Hi.

I mention my workmate @JSortGarcia because he found a possible cause of this unexpected result. Following the same way to explain the problem I'll copy some code parts & screenshots.

// Controller

public function timeRegisterAnalysisAction(Request $request, TimeRegisterRepository $trr): Response
{
    $dataTable = $this->createDataTable(TimeRegisterAnalysisDataTableType::class, $trr->timeRegisterAnalysisQB());
    $dataTable->handleRequest($request);
    if ($dataTable->isExporting()) {
        return $this->file($dataTable->export());
    }

    return $this->render('@App/admin/reports/time_register_analysis.html.twig', [
        'timeRegisterAnalysisTable' => $dataTable->createView(),
    ]);
}
// Repository

public function timeRegisterAnalysisQB(): QueryBuilder
{
    return $this->createQueryBuilder('tr')
        ->select('b.code')
        ->addSelect('CAST(SUM(tr.durationSeconds) AS UNSIGNED) AS sumTotalDurationSeconds')
        ->join('tr.batch', 'b')
        ->groupBy('b.code')
    ;
}
// DataTableType

public function buildDataTable(DataTableBuilderInterface $builder, array $options): void
{
    $builder
        ->addColumn(
            'code',
            TextColumnType::class,
            [
                'label' => 'code',
                'property_path' => '[code]',
                'sort' => 'code',
                'export' => true,
            ]
        )
        ->addColumn(
            'sumTotalDurationSeconds',
            TextColumnType::class,
            [
                'label' => 'sumTotalDurationSeconds',
                'property_path' => '[sumTotalDurationSeconds]',
                'sort' => 'sumTotalDurationSeconds',
                'export' => true,
            ]
        )
        ->addFilter(
            'code',
            StringFilterType::class,
            [
                'query_path' => 'b.code',
            ]
        )
        ->addFilter(
            'sumTotalDurationSeconds',
            SumTotalWorkstationSecondsFilterType::class,
            [
                'query_path' => 'tr.sumTotalDurationSeconds',
            ]
        )
        ->addExporter('csv', CsvExporterType::class)
        ->addExporter('xlsx', XlsxExporterType::class)
        ->setDefaultSortingData(SortingData::fromArray([
            'b.code' => strtolower(SortOrderEnum::ASC->value),
        ]))
        ->setDefaultPaginationData(new PaginationData(
            page: 1,
            perPage: 25,
        ))
    ;
}

If I put a dd($trr->timeRegisterAnalysisQB()->getQuery()->getResult()) into the controller, just before the DataTable creation, I'm getting this results:

Captura de pantalla 2024-08-01 a les 10 29 38

These results are the expected results, and are fully according to the following Doctrine query:

SELECT b0_.code AS code_0, CAST(SUM(t1_.duration_seconds) AS UNSIGNED) AS sclr_1
FROM time_register t1_ INNER JOIN batch b0_ ON t1_.batch_id = b0_.id AND (b0_.company_id = '5') 
WHERE (t1_.workstation_id IN (SELECT id FROM workstation WHERE workstation.company_id = '5')) 
GROUP BY b0_.code

The problem is the datatable list view is showing a lot of results (more than the 88 expected) and is executing the follwing query:

SELECT b0_.code AS code_0, CAST(SUM(t1_.duration_seconds) AS UNSIGNED) AS sclr_1 
FROM time_register t1_ INNER JOIN batch b0_ ON t1_.batch_id = b0_.id AND (b0_.company_id = '5') 
WHERE (t1_.id IN (92911, 106909, 96845, 105334, 107506, 111142, 101106, 123257, 146432, 123250, 111176, 126683, 134177, 115478, 144833, 115582, 140146, 123359, 124845, 155271, 128890, 127740, 155434, 136840, 133253)) AND (t1_.workstation_id IN (SELECT id FROM workstation WHERE workstation.company_id = '5')) 
GROUP BY b0_.code;

Keep in mind that SELECT id FROM workstation WHERE workstation.company_id = '5' is due to a custom Doctrine filter that we are applying.

We think that the t1_.id IN (92911, 106909, 96845, 105334, 107506, 111142, 101106, 123257, 146432, 123250, 111176, 126683, 134177, 115478, 144833, 115582, 140146, 123359, 124845, 155271, 128890, 127740, 155434, 136840, 133253) part is related with a problem with a cloned query and ONLY_FULL_GROUP_BY SQL mode

JSortGarcia commented 3 months ago

Hello!

Adding to what @davidromani has already exposed, the query that throws error when ONLY_FULL_GROUP_BY sql mode is on, is the following:

SELECT COUNT(DISTINCT t0_.id) AS sclr_0 FROM time_register t0_ INNER JOIN batch b1_ ON t0_.batch_id = b1_.id AND (b1_.company_id = '5') WHERE (t0_.workstation_id IN (SELECT id FROM workstation WHERE workstation.company_id = '5')) GROUP BY b1_.code

And the trace that leads to this query is this:

10  Doctrine\ORM\AbstractQuery->getScalarResult (line 120)
11  Doctrine\ORM\Tools\Pagination\Paginator->count (line 17)
12  Kreyu\Bundle\DataTableBundle\Bridge\Doctrine\Orm\Query\DoctrineOrmResultSetFactory->create (line 72)
13  Kreyu\Bundle\DataTableBundle\Bridge\Doctrine\Orm\Query\DoctrineOrmProxyQuery->getResult (line 615)
14  Kreyu\Bundle\DataTableBundle\DataTable->getResultSet (line 620)
15  Kreyu\Bundle\DataTableBundle\DataTable->createPagination (line 610)
16  Kreyu\Bundle\DataTableBundle\DataTable->getPagination (line 232)
17  Kreyu\Bundle\DataTableBundle\Type\DataTableType->createPaginationView (line 111)
18  Kreyu\Bundle\DataTableBundle\Type\DataTableType->buildView (line 91)
19  Kreyu\Bundle\DataTableBundle\Type\ResolvedDataTableType->buildView (line 89)
20  Kreyu\Bundle\DataTableBundle\Type\ResolvedDataTableType->buildView (line 788)
21  Kreyu\Bundle\DataTableBundle\DataTable->createView (line 80)
22  App\Controller\Admin\Production\TimeRegisterAdminController->timeRegisterAnalysisAction (line 181) 

And the error thrown is this:

An exception occurred while executing a query: SQLSTATE[42000]: Syntax error or access violation: 1055 Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'infont.t0_.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

As we do not add in the queryBuilder this id field, it seems that the Paginator uses the index of the entity the query builder is related to (in this case, ìd in the entity Time Register). Also, the paginator totals seem to relate with the total number of the non aggregated registers.

I hope this may give some useful context! 😃

davidromani commented 3 months ago

@Kreyu can you give it a look, please?

Kreyu commented 3 months ago

Hey @davidromani @JSortGarcia.

This would require creating a small app with similar entities and configuration to check it out, but I don't have enough time for this right now, and have no idea what to do based on the snippets alone.

Currently, I suspect the issue lies on the Doctrine paginator side, and I would suggest checking whether using it directly with your query builder (without data table) creates the same output. Maybe it requires setting some query hints to produce proper results.

For reference, this is how the bundle creates an instance of Doctrine paginator: https://github.com/Kreyu/data-table-bundle/blob/06933bf4345b7c4377b725254f4806a7db3c8754/src/Bridge/Doctrine/Orm/Paginator/PaginatorFactory.php

JSortGarcia commented 3 months ago

Hey @Kreyu,

I created a small app where we can reproduce this behaviour. The repo is this.

To reproduce, you have to configure the connection to a mysql database and then run:

$ php bin/console doctrine:schema:update --force
$ php bin/console doctrine:fixtures:load
$ symfony serve

Here we have a similar case: an entity Category, with a name, and an entity Product, with a sumarizable field salesAmount. The aim of this case is to sumarize salesAmount by category (as is done in the query buider):

App\Repository\ProductRepository;
....
   public function salesAmountSumByCategoryQB(): QueryBuilder
    {
        return $this->createQueryBuilder('p')
            ->select('SUM(p.salesAmount) as salesAmountSum, c.name as categoryName')
            ->join('p.category', 'c')
            ->groupBy('c.name')
            ->orderBy('salesAmountSum', 'DESC');
    }
....

After serving the app, you can find the the route in https://localhost:8000/example_datatable.

There you can see the difference between what is show in the rendered dataTable (above) and the actual values it shoud have (below). Also, there should be only four registers renderd, but the pagination seems to reflect the total number of registers in the Product table.

Note also that in the doctrine configuration I have disabled the only_full_group_by option:

config\packages\doctrine.yaml
...
    dbal:
        options:
            1002: 'SET sql_mode=(SELECT REPLACE(@@sql_mode, "ONLY_FULL_GROUP_BY", ""))'
...

If this option is not set, then this error is thrown:

An exception occurred while executing a query: SQLSTATE[42000]: Syntax error or access violation: 1055 Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'dataTableQueryBuilder.p0_.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I hope this helps reproducing the error, if we could give you any hand please tell us 😃

Kreyu commented 3 months ago

@JSortGarcia thank you for the repository, I'll let you know if I find a solution

Kreyu commented 3 months ago

Okay, let's start with something relatively easy - why are we grouping on c.name instead of c.id? Could there be multiple categories with the same name? Is that purposeful, or am I just missing the context?

I suggest using the following query instead - works properly, and doesn't require using ONLY_FULL_GROUP_BY at all:

return $this->getEntityManager()->createQueryBuilder()
    ->select('SUM(p.salesAmount) AS salesAmountSum', 'c.name as categoryName')
    ->from(Category::class, 'c')
    ->join('c.products', 'p')
    ->groupBy('c.id')
;

As you can see, now we're querying for categories instead of products, and joining category products instead of product category.

I assume that those entities are just for the example, and they correspond to the entities you're aliasing to b and tr in examples above:

// If this:
return $this->createQueryBuilder('p')
    ->select('SUM(p.salesAmount) as salesAmountSum, c.name as categoryName')
    ->join('p.category', 'c')
    ->groupBy('c.name')
    ->orderBy('salesAmountSum', 'DESC');

// Corresponds to this:
return $this->createQueryBuilder('tr')
    ->select('b.code')
    ->addSelect('CAST(SUM(tr.durationSeconds) AS UNSIGNED) AS sumTotalDurationSeconds')
    ->join('tr.batch', 'b')
    ->groupBy('b.code');

// Can you do something like this instead?
return $this->createQueryBuilder('b')
    ->addSelect('CAST(SUM(tr.durationSeconds) AS UNSIGNED) AS sumTotalDurationSeconds', 'b.code')
    ->join('b.timeRegistries', 'tr')    // Can you join "tr" from "b", like products from category?
    ->groupBy('b.id');    // I assume "b" has "id" since the paginator is using it in the DISTINCT COUNT
;
JSortGarcia commented 3 months ago

Yes, that makes it work! Now the pagination works properly and the aggregation results are coherent.

Thank you very much for your time! 😄

davidromani commented 3 months ago

Thanks for your help @Kreyu