joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0

[5.2] Smart Search: Fix grouping of fields #44492

Open Hackwar opened 2 days ago

Hackwar commented 2 days ago

Summary of Changes

Smart Search can currently fail when two terms that the database collation treats as identical are, because of a bug, handled as two separate terms. Inserting both then violates the unique index on (term, language) in the terms table. This PR fixes the GROUP BY statement.

Testing Instructions

Unfortunately, this is rather difficult to reproduce. I had content containing the words messsystem and meßsystem, which triggered the problem on one server, but I can't reproduce it locally. So... code review?

Actual result BEFORE applying this Pull Request

Expected result AFTER applying this Pull Request

Link to documentations

Hackwar commented 1 day ago

Some more explanation: This problem results in an exception which at least aborts mass-indexing in Smart Search and potentially also causes a fatal error while saving the content in question. Although the content has already been saved at that point, a fatal error during that process is still rather bad. I've encountered this on several occasions in the past, especially in combination with Falang (probably mainly because sites with Falang have more non-English content). Unfortunately, it isn't as simple as putting those similar words into an article and saving it. I couldn't find out how to reproduce the problem or how to create a minimal example.

The main problem seems to be that the GROUP BY differentiates the terms based on (among other things) the weight, and thus treats two terms as different even though they are identical by the collation's standards.
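
To illustrate the mechanism for reviewers, here is a minimal sketch in Python with SQLite. It is not the Smart Search code: the `tokens` and `terms` tables are hypothetical and heavily simplified, SQLite's built-in `NOCASE` collation stands in for the MySQL collation, and the two spellings differ only in case rather than in ß/ss. Grouping only by term and language is an assumption about the shape of the fix, not a copy of this PR's diff.

```python
import sqlite3


def build_db():
    """Create hypothetical, simplified tables (not the real Smart Search schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tokens ("
        " term TEXT COLLATE NOCASE, language TEXT, weight REAL)"
    )
    conn.execute(
        "CREATE TABLE terms ("
        " term TEXT COLLATE NOCASE, language TEXT, weight REAL,"
        " UNIQUE (term, language))"
    )
    # Two spellings that the NOCASE collation treats as the same term,
    # but that carry different weights.
    conn.executemany(
        "INSERT INTO tokens VALUES (?, ?, ?)",
        [("Messsystem", "de", 1.0), ("messsystem", "de", 2.0)],
    )
    conn.commit()
    return conn


def aggregate(conn, group_by):
    """Copy grouped tokens into terms; return False if the unique index is violated."""
    sql = (
        "INSERT INTO terms (term, language, weight) "
        "SELECT term, language, SUM(weight) FROM tokens "
        "GROUP BY " + group_by
    )
    try:
        conn.execute(sql)
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Both groups collapse to the same (term, language) key on insert.
        conn.rollback()
        return False


conn = build_db()
# Grouping by weight as well splits the collation-equal terms into two groups.
with_weight = aggregate(conn, "term, language, weight")
# Grouping only by the columns of the unique index yields a single row.
without_weight = aggregate(conn, "term, language")
print(with_weight, without_weight)
```

In this sketch the first call fails on the unique index and the second succeeds, leaving one row whose weight is the sum of both tokens. Whether the real query should sum, take the maximum, or otherwise combine the weights is left open here.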