TYPO3-Solr / ext-solr

A TYPO3 extension that integrates the Apache Solr search server with TYPO3 CMS. dkd Internet Service GmbH is developing the extension. Community contributions are welcome. See CONTRIBUTING.md for details.
GNU General Public License v3.0

Access restricted pages disclosed in sitemaps/menus #540

Open timohund opened 7 years ago

timohund commented 7 years ago

Problem: When Solr indexes pages, it sets a user group of "0" by default and fakes a login. Usually this is not a problem, because access-restricted content elements are handled properly. But consider the following case:

Root Page
|-- Page A
|-- Page B
|-- Sitemap
|-- "Lorien" 

Here, "Lorien" is an access-restricted page (-2), which allows access for any logged-in user.

If the page "Sitemap" gets indexed, it generates the page tree with the mentioned user group "0" and a faked login. Because of the faked login and this line: https://git.typo3.org/Packages/TYPO3.CMS.git/blob/HEAD:/typo3/sysext/frontend/Classes/Controller/TypoScriptFrontendController.php#l970 the user group "-2" is added as well. The "Sitemap" page then also renders the access-restricted page "Lorien", and if you type "Lorien" into the Solr search, you find the page "Sitemap", which shows any (possibly confidential) highlighted content around "Lorien".
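The effect can be reproduced with a small toy model. The following Python sketch is not TYPO3 code; it assumes a simplified rule in which a logged-in session always receives the groups 0 and -2 in addition to its assigned groups, while an anonymous session receives 0 and -1. The function names and the page list are hypothetical and only mirror the example tree above.

```python
def effective_groups(assigned_groups, logged_in):
    """Return the group set used for access checks (simplified assumption)."""
    if logged_in:
        # Assumption: any login adds the pseudo group -2 ("show at any login").
        return {0, -2} | set(assigned_groups)
    # Assumption: anonymous visitors get the pseudo group -1 ("hide at login").
    return {0, -1}


def visible_pages(pages, groups):
    """List the titles a menu would render for the given group set."""
    # A page with no required groups is public; otherwise one match is enough.
    return [title for title, required in pages if not required or groups & required]


# Hypothetical page list; "Lorien" requires the pseudo group -2.
PAGES = [
    ("Page A", set()),
    ("Page B", set()),
    ("Sitemap", set()),
    ("Lorien", {-2}),
]

# What an anonymous visitor sees in the sitemap.
anonymous_menu = visible_pages(PAGES, effective_groups([], logged_in=False))

# What the indexer sees with group 0 and a faked login.
indexer_menu = visible_pages(PAGES, effective_groups([0], logged_in=True))
```

In this model `anonymous_menu` omits "Lorien" while `indexer_menu` contains it, which is how the restricted page title would end up in the indexed content of "Sitemap".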

Solution: Maybe the faked login should be disabled if the page currently being rendered is known not to contain any access-restricted content elements (setting the fe_group to "0" is insufficient).

I could track the problem down to the following issues:

This if statement: https://git.typo3.org/TYPO3CMS/Extensions/solr.git/blob/HEAD:/Classes/IndexQueue/FrontendHelper/PageIndexer.php#l111 will never be true because "stringAccessRootline" will always contain at least "c:0". This is caused by the following statement: https://git.typo3.org/TYPO3CMS/Extensions/solr.git/blob/HEAD:/Classes/IndexQueue/PageIndexer.php#l381

It is not possible to filter out the content group "0" in "getAccessGroupsFromContent", because then this statement: https://git.typo3.org/TYPO3CMS/Extensions/solr.git/blob/HEAD:/Classes/IndexQueue/PageIndexer.php#l57 will always be true for non-restricted pages, which would result in those pages not being indexed at all.
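The dilemma described in the two points above can be illustrated with a toy model. This Python sketch does not reproduce the linked PHP code, which is not quoted in this thread. It assumes that the first check tests whether the access rootline string is empty and that the second check tests whether the list of content groups is empty. All function names are hypothetical.

```python
def access_rootline_string(content_groups):
    """Build only the content part of the rootline, e.g. "c:0" (assumed format)."""
    if not content_groups:
        return ""
    return "c:" + ",".join(str(group) for group in sorted(content_groups))


def skips_fake_login(rootline):
    """Assumed form of the first check: no fake login for an empty rootline."""
    return rootline == ""


def gets_indexed(content_groups):
    """Assumed form of the second check: pages without content groups are skipped."""
    return len(content_groups) > 0


# Current behaviour: an unrestricted page still reports content group 0.
current_groups = {0}
current = (
    skips_fake_login(access_rootline_string(current_groups)),
    gets_indexed(current_groups),
)

# Naive fix: filter out group 0, leaving no content groups at all.
filtered_groups = current_groups - {0}
filtered = (
    skips_fake_login(access_rootline_string(filtered_groups)),
    gets_indexed(filtered_groups),
)
```

Under these assumptions `current` is `(False, True)`: the page is indexed, but the faked login is never skipped. `filtered` is `(True, False)`: the faked login would be skipped, but the page is no longer indexed.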

See attached patch for proposed solution. I don't know if my change causes any unwanted side effects.

https://forge.typo3.org/issues/57011 https://review.typo3.org/#/c/28485/

timohund commented 7 years ago

Comment from ingo:

Needs more investigation for a proper solution. A possible solution might be to not render menu CEs when indexing (unless they are index/TOC menus of the current page): hook into the menu CE when indexing and disable its output.
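A rough sketch of that idea, written as a Python toy model and not as a TYPO3 hook: menu elements are dropped while indexing unless they are flagged as menus of the current page. The dictionary keys `type`, `current_page_only`, and `text` are hypothetical names chosen for this illustration.

```python
def render_page(elements, indexing):
    """Concatenate element output, suppressing foreign-page menus while indexing."""
    rendered = []
    for element in elements:
        is_menu = element["type"] == "menu"
        # Index/TOC menus of the current page only repeat that page's own content.
        current_page_only = element.get("current_page_only", False)
        if indexing and is_menu and not current_page_only:
            continue  # disable output, as a hook in the menu CE might do
        rendered.append(element["text"])
    return "\n".join(rendered)


# Hypothetical content of the "Sitemap" page.
SITEMAP_ELEMENTS = [
    {"type": "text", "text": "Overview of this site"},
    {"type": "menu", "text": "Page A | Page B | Lorien"},
    {"type": "menu", "current_page_only": True, "text": "Sections of this page"},
]

frontend_output = render_page(SITEMAP_ELEMENTS, indexing=False)
indexed_output = render_page(SITEMAP_ELEMENTS, indexing=True)
```

Here `frontend_output` still contains the full menu, while `indexed_output` no longer mentions "Lorien". Whether a real hook can distinguish the menu types this cleanly is exactly the part that needs investigation.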