dmitryd / typo3-realurl

**Vintage** RealURL extension for TYPO3 CMS. Read the wiki if you have questions!

UrlDecoder problem with multibyte languages #578

Open franzkugelmann opened 6 years ago

franzkugelmann commented 6 years ago

If the corresponding cache entry in tx_realurl_pathdata is still empty or was deleted, realurl is currently unable to reverse-engineer the page id from a multibyte URL, for example one containing Chinese characters. The reason is that UrlDecoder (line 1061) uses the PHP function pathinfo(), which needs the correct locale to handle multibyte strings. Unfortunately, TYPO3 sets the locale (via \TYPO3\CMS\Frontend\Controller\TypoScriptFrontendController::settingLocale()) only after the URL decoding is done.

dmitryd commented 6 years ago

Yes, this may happen. This is why it is recommended never to touch the realurl tables. The decoder's reverse lookup for URLs is not a full solution; it is an emergency solution for cases when the cache is missing. It can never replace properly encoded URLs from the database entries.

jdoubleu commented 6 years ago

I noticed a similar problem when dealing with Chinese characters.

RealURL was able to find the correct page uid for the corresponding (Chinese) page title when the caching tables were empty.

But this caused a slightly different issue: the UrlEncoder percent-encodes the URLs, so Chinese characters end up encoded (e.g. %E9%9A%90) in the generated links, and these links are then written to the tx_realurl_urldata table. On the decoding side (UrlDecoder), if there is no cache entry yet, the URL is not encoded; the original characters are written to the database instead. A later lookup by the UrlEncoder then finds no result for the given page uid. This caused inconsistent database entries and buggy URL routing.
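
To illustrate the mismatch: the same path ends up stored under two different spellings, one percent-encoded and one raw. A minimal sketch in plain PHP, assuming a hypothetical helper `normalizeSpeakingPath()` (not part of RealURL, and not what the extension actually does) that brings both spellings to the percent-encoded form before anything is stored or looked up:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of RealURL: returns the percent-encoded
 * form of a path, whether the input was raw or already encoded.
 */
function normalizeSpeakingPath(string $path): string
{
    $segments = explode('/', $path);
    foreach ($segments as $i => $segment) {
        // Decode first so an already encoded segment is not encoded twice.
        $segments[$i] = rawurlencode(rawurldecode($segment));
    }
    return implode('/', $segments);
}

$raw = 'zh/隐私政策.html';
$encoded = 'zh/%E9%9A%90%E7%A7%81%E6%94%BF%E7%AD%96.html';

// Both spellings collapse to the same lookup key.
var_dump(normalizeSpeakingPath($raw) === normalizeSpeakingPath($encoded));
```

One caveat of this sketch: a raw title that contains a literal `%` followed by two hex digits would be decoded by mistake, so a real fix would need to know which form each side produces.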

dmitryd commented 6 years ago

Could you check if branch bugfix/578 fixes the issue for Chinese?

franzkugelmann commented 6 years ago

Hi Dmitry, thanks for your time and efforts! The branch does not fix the case where there is no entry in the realurl caches (for whatever reason) when the URL is called, as https://github.com/dmitryd/typo3-realurl/pull/579/commits/c171d1d3b2bc7c8f5de4b1556a53ca29fb24bea8 would. I realize that you say "it is recommended to never touch realurl tables", so maybe there is no intention to fix that kind of situation.

s2b commented 6 years ago

I'm currently experiencing the same problem. From my point of view, a solution could be to set the locale to $TYPO3_CONF_VARS['SYS']['systemLocale'] before calling pathinfo() and to reset it afterwards. In fact, there is even a wrapper function provided by the core that does exactly that:


TYPO3\CMS\Core\Utility\PathUtility::pathinfo()
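
For readers without the core source at hand, the set-and-restore pattern can be sketched in plain PHP. This is an illustration only, not the core implementation: the function name `pathinfoWithLocale()` and the locale string `en_US.UTF-8` are assumptions, and the locale has to be installed on the server for `setlocale()` to accept it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper (illustration only): runs pathinfo() under a given
 * LC_CTYPE locale and restores the previous locale afterwards.
 */
function pathinfoWithLocale(string $path, string $locale): array
{
    // Passing '0' only queries the current setting without changing it.
    $previous = setlocale(LC_CTYPE, '0');
    // Returns false and leaves the locale untouched if $locale is unavailable.
    setlocale(LC_CTYPE, $locale);
    try {
        return pathinfo($path);
    } finally {
        // Restore whatever was active before the call.
        if ($previous !== false) {
            setlocale(LC_CTYPE, $previous);
        }
    }
}

$parts = pathinfoWithLocale('zh/隐私政策.html', 'en_US.UTF-8');
var_dump($parts);
```

In a TYPO3 context the second argument would presumably be the configured `$TYPO3_CONF_VARS['SYS']['systemLocale']` value rather than a hard-coded string.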

franzkugelmann commented 6 years ago

Hi s2b, that looks like a great solution to me. Tested and fixes my issue. Adapted my pull request accordingly.

felixrupp commented 5 years ago

Hi, I think I have a problem related to what @jdoubleu described, in a TYPO3 8.7 LTS installation with Chinese translations. The Chinese URL data in the table is generated incorrectly. I either get urlencoded URL-data entries like "zh/%E9%9A%90%E7%A7%81%E6%94%BF%E7%AD%96.html" or simply the fallback-language (L=0, German in this case) name of the page, like "zh/datenschutz.html".

Because of that, the rendered menu entries (I tried both the TS menu and the VHS menu viewhelper) link to the default L=0 page, "/datenschutz.html" in this case, when I am on the Chinese language version. A frontend user cannot reach the correct Chinese version of the pages.

For English (L=1 in my case) everything works fine.

@franzkugelmann Your fix did not help. I patched the UrlDecoder as you did in your pull request and deleted the wrong url-data entries. Is there something I forgot about?

The 'languageExceptionUids' option is set and seems to work, but the links in my navigation are still rendered incorrectly.

EDIT: Fixed my issue with wrong nav links. The TS config.linkVars was not set up correctly.

Thanks and kind regards