aimeos / ai-controller-jobs

Aimeos e-commerce job controllers
https://aimeos.org
GNU Lesser General Public License v3.0
28 stars 17 forks source link

Slicing issue in sitemaps #13

Closed aimeos closed 5 years ago

aimeos commented 5 years ago

While testing I did find some oddities with the logic which creates multiple files based on 'max-items' and 'max-query' in \Aimeos\Controller\Jobs\Product\Export\Sitemap\Standard.php::export(). The test exports 8 items with 5 in the first sitemap file, 3 in the second, as expected, but also another third empty sitemap file. When you change 'max-items' to 6, the first file still only contains 5 items. My class is based on this so the behaviour is the same.

The last empty file is probably due to creating a content item if there are no results any more: https://github.com/aimeos/ai-controller-jobs/pull/12/files#diff-6e026c9e50bd1a0a33f93cc4651ef1dfR362

If 'max-items' is 6, the first file should contain 6 items and the second 2 (maybe with an empty third file too). Maybe this was still the content of the old file or the configuration is still cached somewhere?

Regarding the urls per file and number of files. Adding to the already discussed problems controller/jobs/product/export/sitemap/max-items can never be smaller then controller/jobs/product/export/sitemap/max-query as only after all queried items are written at once through addItems() a check is executed an possibly a new file is created.

That's usually not a problem so we can keep it as is.

aimeos commented 5 years ago

@s-diez Do you have an elegant solution for the emtpy file problem?

s-diez commented 5 years ago

I would execute the first searchItems with a query for the total count and compare the exported items to it. I think this could simplify the logic. Additionally I would add a note about the relationship between 'max-items' and 'max-query' to the config description.

aimeos commented 5 years ago

Maybe this is a good solution:

        do
        {
            $items = $manager->searchItems( $search, $domains );
            $free = $maxItems * $filenum - $start;
            $count = count( $items );

            if( $free < $count )
            {
                $this->addItems( $content, array_slice( $items, 0, $free, true ) );
                $items = array_slice( $items, $free, null, true );

                $this->closeContent( $content );
                $content = $this->createContent( $container, ++$filenum );
                $names[] = $content->getResource();
            }

            $this->addItems( $content, $items );

            $start += $count;
            $search->setSlice( $start, $maxQuery );
        }
        while( $count >= $search->getSliceSize() );

Yes, pointing readers in the docs to the fact that max-query must be smaller than max-items is a good idea :-)

aimeos commented 5 years ago

@s-diez What do you think about the suggested code? Does it work for you?

s-diez commented 5 years ago

I did run the unit tests with a couple of different 'max-query' and 'max-items' combinations. Seems to work fine.

aimeos commented 5 years ago

Cool! Changes are added to the repo: https://github.com/aimeos/ai-controller-jobs/commit/30b2d49f7baaa06194204fd209eb1bebf8937dd6