TYPO3-Solr / ext-solr

A TYPO3 extension that integrates the Apache Solr search server with TYPO3 CMS. dkd Internet Service GmbH is developing the extension. Community contributions are welcome. See CONTRIBUTING.md for details.
GNU General Public License v3.0

[BUG] PersistenceEventListener slows down the system since 12.0.4 #4139

Open Raidri opened 3 months ago

Raidri commented 3 months ago

Describe the bug Since the update from 12.0.3 to 12.0.4, I have had a massive performance issue when saving an entity with many child entities. A process that used to take about 5-10 seconds now takes more than 40-50 seconds.

I found out that the problem is the DataUpdateHandler in combination with the PersistenceEventListener.

In 12.0.3, this method was used to remove an item via the garbage collector:

    protected function removeFromIndexAndQueueWhenItemInQueue(string $recordTable, int $recordUid): void
    {
        trigger_error(
            'DataUpdateHandler->removeFromIndexAndQueueWhenItemInQueue is deprecated and will be removed in v13.'
            . ' Use DataUpdateHandler->removeFromIndexAndQueue instead.',
            E_USER_DEPRECATED
        );

        if (!$this->indexQueue->containsItem($recordTable, $recordUid)) {
            return;
        }

        $this->removeFromIndexAndQueue($recordTable, $recordUid);
    }

This method first checked whether the record is in the index queue and returned early if it was not.

In 12.0.4, this method is used instead:

    /**
     * Removes record from the index queue and from the solr index
     */
    protected function removeFromIndexAndQueue(string $recordTable, int $recordUid): void
    {
        $this->getGarbageHandler()->collectGarbage($recordTable, $recordUid);
    }

As a result, every child entity of my main entity now triggers a request to the Solr server when it is deleted, even though those child entities are not indexed.

For example, I have an article model that is indexed in Solr. The article model has many filters as child entities, which are not in Solr. If I remove 20 filters, each of them triggers a Solr request, because the DataUpdateHandler calls the collectGarbage method for the filters too.
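The difference between the two code paths can be illustrated with a minimal, self-contained sketch. It does not use the real ext-solr classes: InMemoryQueue, CountingGarbageHandler, removeGuarded, removeUnguarded and the tx_example_* table names are hypothetical stand-ins, and each collectGarbage call is assumed to cost one Solr request, as described above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the index queue: it only knows which records are queued.
final class InMemoryQueue
{
    /** @var array<string, true> */
    private array $items = [];

    public function add(string $table, int $uid): void
    {
        $this->items[$table . ':' . $uid] = true;
    }

    public function containsItem(string $table, int $uid): bool
    {
        return isset($this->items[$table . ':' . $uid]);
    }
}

// Hypothetical stand-in for the garbage handler: it counts simulated Solr requests.
final class CountingGarbageHandler
{
    public int $solrRequests = 0;

    public function collectGarbage(string $table, int $uid): void
    {
        $this->solrRequests++;
    }
}

// 12.0.4-style path: garbage collection runs for every record.
function removeUnguarded(CountingGarbageHandler $handler, string $table, int $uid): void
{
    $handler->collectGarbage($table, $uid);
}

// 12.0.3-style path: records that are not in the queue are skipped.
function removeGuarded(InMemoryQueue $queue, CountingGarbageHandler $handler, string $table, int $uid): void
{
    if (!$queue->containsItem($table, $uid)) {
        return;
    }
    $handler->collectGarbage($table, $uid);
}

$queue = new InMemoryQueue();
$queue->add('tx_example_article', 1); // only the article is indexed

$unguarded = new CountingGarbageHandler();
$guarded = new CountingGarbageHandler();

// Delete 20 filter records that were never indexed.
for ($uid = 1; $uid <= 20; $uid++) {
    removeUnguarded($unguarded, 'tx_example_filter', $uid);
    removeGuarded($queue, $guarded, 'tx_example_filter', $uid);
}

echo $unguarded->solrRequests . ' vs ' . $guarded->solrRequests . PHP_EOL; // prints "20 vs 0"
```

In this toy model, the unguarded path issues one request per deleted filter, while the guarded path issues none, which matches the slowdown reported here.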

Raidri commented 3 months ago

I found out that the problem is that my child entities do not resolve to any rootPageIds, which leads processRecord to try to delete every single item. My storage folder is not located below a root page. The buildSiteRootsByObservedPageIds method also does not find a rootPageId, because the child entity has no index configuration.

    public function handleRecordUpdate(int $uid, string $table): void
    {
        $rootPageIds = $this->getRecordRootPageIds($table, $uid);
        $this->processRecord($table, $uid, $rootPageIds);
    }

Raidri commented 1 month ago

I have new information about this problem. To get rid of it, I have to set the useConfigurationMonitorTables setting and define which tables should be monitored and which should not.

That is a bit inconvenient.
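For readers who want to try the workaround, a configuration fragment might look roughly like the following. This is a sketch under assumptions: it assumes the setting is stored as regular TYPO3 extension configuration under the solr key, that the value is a comma-separated list of table names, and that the file is config/system/additional.php; the table name is hypothetical. Check the extension settings in the backend for the exact format before relying on it.

```php
<?php
// config/system/additional.php (assumed location for a TYPO3 v12 project)

// Assumption: only the tables listed here are monitored for record changes.
// 'tx_example_domain_model_article' is a hypothetical table name.
$GLOBALS['TYPO3_CONF_VARS']['EXTENSIONS']['solr']['useConfigurationMonitorTables']
    = 'tx_example_domain_model_article';
```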

It would be better if only the tables that are to be indexed by Solr were monitored. In my case, that is the article table, and not the sys_file_reference or filter tables, which are child tables of article and have no index configuration.
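The suggestion above could be sketched as follows. This is not ext-solr code: tablesWithIndexConfiguration and shouldMonitor are hypothetical helpers, and the array shape (configuration name mapped to an optional 'table' entry, falling back to the name itself) is an assumption made for illustration only.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: collects the tables that have an index configuration.
 *
 * @param array<string, array{table?: string}> $indexingConfigurations
 * @return string[]
 */
function tablesWithIndexConfiguration(array $indexingConfigurations): array
{
    $tables = [];
    foreach ($indexingConfigurations as $name => $configuration) {
        // Assumption: without an explicit 'table', the configuration name is the table.
        $tables[] = $configuration['table'] ?? (string)$name;
    }
    return array_values(array_unique($tables));
}

/**
 * Hypothetical helper: a record update is only worth processing for monitored tables.
 *
 * @param string[] $monitoredTables
 */
function shouldMonitor(string $table, array $monitoredTables): bool
{
    return in_array($table, $monitoredTables, true);
}

$monitored = tablesWithIndexConfiguration([
    'articles' => ['table' => 'tx_example_domain_model_article'],
    'pages' => [],
]);

var_dump(shouldMonitor('tx_example_domain_model_article', $monitored)); // bool(true)
var_dump(shouldMonitor('sys_file_reference', $monitored));              // bool(false)
```

With a check like this in front of the update handling, changes to child tables without an index configuration would be skipped before any Solr request is made.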