dmitryd / typo3-realurl

**Vintage** RealURL extension for TYPO3 CMS. Read the wiki if you have questions!

Dysfunctional RealURL cache entries with path segments as parameters (and the fix) #615

Open 0bserver opened 6 years ago

0bserver commented 6 years ago

Problem:

We have tt_news entries that fail with the error "no news_id given" in SINGLE view. Debugging shows that TYPO3 requests tx_ttnews[tt_news]=the-news-title-as-path-segment instead of tx_ttnews[tt_news]=123. That happens when UrlDecoder->convertAliasToId() tries a reverse lookup on tt_news.title and fails. As a result, UrlDecoder->decodeUrlParameterBlockUseAsIs() is executed and creates the dysfunctional GET variable, which contains the path segment instead of an ID.

Further down the line, a cache entry is created in which the-news-title-as-path-segment points to a URL containing tx_ttnews[tt_news]=the-news-title-as-path-segment (created when the SINGLE view is loaded). Other cache entries for the-news-title-as-path-segment can exist that point to a URL with tx_ttnews[tt_news]=123 (created when the LIST view is loaded). In that case RealURL chooses the first entry it finds and considers valid, which can be the one with the dysfunctional GET variable.
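To illustrate the conflict, here is a minimal sketch in Python (not RealURL's actual code). The cache rows, the page id 42, and the helper names `first_entry` and `is_functional` are hypothetical; the sketch only assumes that two rows can share one speaking path and that the first match wins.

```python
from urllib.parse import parse_qs

# Hypothetical cache rows: (speaking path, original URL parameters).
# The first row was written from the SINGLE view, the second from the LIST view.
cache = [
    ("the-news-title-as-path-segment",
     "id=42&tx_ttnews[tt_news]=the-news-title-as-path-segment"),
    ("the-news-title-as-path-segment",
     "id=42&tx_ttnews[tt_news]=123"),
]


def first_entry(path):
    """Return the parameters of the first cache row matching the path."""
    for cached_path, params in cache:
        if cached_path == path:
            return params
    return None


def is_functional(params):
    """A usable entry carries a numeric news id, not a path segment."""
    value = parse_qs(params).get("tx_ttnews[tt_news]", [""])[0]
    return value.isdigit()


chosen = first_entry("the-news-title-as-path-segment")
print(chosen, is_functional(chosen))
```

Depending on which view was requested first, the row that wins may be the one that tt_news cannot resolve.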

Solution:

Back to UrlDecoder->convertAliasToId(): we can in fact use a more tolerant lookup against the table. $value is split on - and a WHERE clause of the form alias_field LIKE '%the%%news%%title%%as%%path%%segment%' is built.

That takes care of all the commas, hyphens, and other characters that might be in alias_field. Our problem has been solved by that.
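The pattern construction can be sketched as follows. This is an illustration in Python, not the PHP from the pull request; the function name `build_like_pattern` is made up for the example, and a real query would still need to pass the pattern as a bound parameter and escape any `%` or `_` in the segment.

```python
def build_like_pattern(segment: str) -> str:
    """Wrap every hyphen-separated word of a path segment in % wildcards."""
    # Drop empty pieces so doubled hyphens do not add bare wildcards.
    words = [word for word in segment.split("-") if word]
    return "".join(f"%{word}%" for word in words)


print(build_like_pattern("the-news-title-as-path-segment"))
```

Because every word is surrounded by wildcards, any characters between the words in alias_field are skipped over, as long as the words appear in the same order.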

While we are much more likely to find a database row this way, we might run into false positives when the LIKE pattern matches more than one entry. In that case we can assume that the shortest hit is the right one.
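A small sketch of the "shortest hit wins" rule, using an in-memory SQLite database instead of the TYPO3 database layer. The `uid` column, the sample titles, and the function name `find_shortest_match` are assumptions for the example; only tt_news.title comes from the description above.

```python
import sqlite3


def find_shortest_match(conn, segment):
    """Return the uid of the shortest title matching the segment, or None."""
    words = [word for word in segment.split("-") if word]
    pattern = "".join(f"%{word}%" for word in words)
    row = conn.execute(
        # Shortest title first, so extra words in longer titles lose.
        "SELECT uid FROM tt_news WHERE title LIKE ? "
        "ORDER BY LENGTH(title) ASC LIMIT 1",
        (pattern,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tt_news (uid INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO tt_news (uid, title) VALUES (?, ?)",
    [
        (456, "The News: Title, as Path-Segment (Updated Edition)"),
        (123, "The News: Title, as Path-Segment"),
    ],
)

print(find_shortest_match(conn, "the-news-title-as-path-segment"))
```

Both rows match the pattern here; ordering by title length picks the record whose title adds the least beyond the words in the path segment.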

I have created pull request #614 to take care of the issue.

dmitryd commented 6 years ago

Do you know the performance of LIKE?

dmitryd commented 6 years ago

The solution to this problem is simple: do not clear realurl data. tx_ttnews[tt_news]=the-news-title-as-path-segment will only happen in one case: you removed data that realurl requires to decode URLs correctly. So you get consequences for your own actions.

Most likely I will not integrate this solution because performance of LIKE is terrible and this will lower realurl decoding speed. Sorry.

0bserver commented 6 years ago

I'm aware that LIKE has a performance hit. However, this is not something that happens on every page load – only if no cache entry can be found. In my example it only happens if somebody opens the SINGLE view before somebody opens the LIST view (because in the latter case the UrlEncoder for all the links to the SINGLE views will run first – correct me if I'm wrong).

Since, as you say, you don't normally clear realurl data, this LIKE solution will not be in use most of the time. But if you happen to debug a live website and you clear realurl data for any reason, you don't have control over which pages the visitors will open first: will it be LIST or SINGLE? Will they navigate the main menu or will they follow a link to a news item from Facebook?

And that's where we need a safety net.