tpwd / ke_search

Search Extension for TYPO3 Content Management System, including faceting search functions.
https://extensions.typo3.org/extension/ke_search/
GNU General Public License v3.0

Feature request: cachable search results #188

Closed lksnmnn closed 1 year ago

lksnmnn commented 1 year ago

Hi there,

thank you for your work so far!

I was wondering whether anything speaks against making the plugins cacheable. It would greatly improve the UX for repeated searches and for pages whose content rarely changes. If I see it correctly, you already added a cacheable result plugin to the premium version, albeit only for headless API responses?

I am trying to see if I can monkey-patch the code to make it cacheable, but so far I have had no luck getting the cHash parameter added.

With my limited insight into TYPO3 right now, I see the following tasks:

Or am I missing something? At the very least, I want to allow the page to be stored in the browser for a short while. Ideally, of course, the above would also make it work together with staticfilecache.

Happy to help.

mbrodala commented 1 year ago

A possible solution would be the PRG (Post/Redirect/Get) pattern. With it, the search form would initially be submitted via POST instead of GET:

https://github.com/tpwd/ke_search/blob/3fcfb412bdc67383b474b2ced32836ca60abca9c/Resources/Private/Templates/SearchForm.html#L11

After a server-side redirect, you'd end up at a URL that contains your search parameters and a cHash.

Bonus: if you add some routing configuration for this, you could even get a nicer URL without a cHash automatically and still have proper caching.
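
To make the flow above concrete, here is a minimal, framework-independent sketch of Post/Redirect/Get in Python. It is not TYPO3 or ke_search code: the function names, the `SECRET` key, and the HMAC-based hash are assumptions chosen for illustration, and TYPO3 computes its real cHash differently. The sketch only shows the idea that the POST handler normalises the parameters and redirects to a signed GET URL, which is the only URL variant that would be treated as cacheable.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Assumption: stands in for a site-wide secret; not a real TYPO3 setting.
SECRET = b"hypothetical-site-secret"


def cache_hash(params: dict) -> str:
    """Sign the sorted parameters so only server-generated URLs validate."""
    canonical = urlencode(sorted(params.items()))
    return hmac.new(SECRET, canonical.encode(), hashlib.sha256).hexdigest()[:16]


def handle_post(form: dict, result_path: str = "/search") -> tuple:
    """Accept the POSTed form and answer with a redirect to a GET URL."""
    # Drop empty fields and normalise values so equivalent searches share one URL.
    params = {k: v.strip().lower() for k, v in form.items() if v.strip()}
    query = urlencode(sorted(params.items()) + [("cHash", cache_hash(params))])
    return 303, f"{result_path}?{query}"


def is_cacheable_get(params: dict) -> bool:
    """A GET request counts as cacheable only if its hash matches its parameters."""
    params = dict(params)
    given = params.pop("cHash", "")
    return hmac.compare_digest(given.encode(), cache_hash(params).encode())
```

In this sketch, a request whose parameters were altered after the redirect no longer matches its hash, so it would have to be served uncached.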

lksnmnn commented 1 year ago

Thank you for your input. However, I do not see how PRG helps me here, since the search form from ke_search already uses GET requests, not POST requests. TYPO3 does not cache them because of the way the plugins are registered (USER_INT).

In my case right now, I want to have the form (a filter selection) and the search results on the same page.

As for the routing configuration: this confuses me the most, because if I understand correctly, one would need to add all possible filters and filter options there to get it to work. However, an editor could introduce new filters, which would break this or require a developer to update the config?

derhansen commented 1 year ago

I suggest not making search results cacheable, as this opens up denial-of-service scenarios against TYPO3's cache tables. Since arbitrary search terms can be passed to the extension, it would be easy to automatically generate a huge number of cache entries in the cache_pages table if search results were cached.

Removing the extension's query parameters from FE.cacheHash.excludedParameters would basically result in a similar scenario, since paginated links would then include a cHash for the given search term, filter, and page number. Depending on the size of the search index, this might already fill up the cache tables when regular search engines crawl websites using ext:ke_search.
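
The growth described in this comment can be illustrated with a small back-of-the-envelope sketch. The terms, filter values, and page count below are made up, and the set of tuples merely stands in for rows in a page cache; the point is only that every distinct combination of search term, filter, and page number yields its own entry.

```python
from itertools import product


def cache_keys(terms, filters, pages):
    """Every distinct (term, filter, page) URL becomes its own cache entry."""
    return {(t, f, p) for t, f, p in product(terms, filters, range(1, pages + 1))}


# A script can invent search terms at will; each one multiplies the entries.
terms = [f"term{i}" for i in range(1000)]
filters = ["", "news", "products"]  # hypothetical filter values
entries = cache_keys(terms, filters, pages=5)
print(len(entries))  # 1000 * 3 * 5 = 15000
```

Because the search term is free text, the first factor has no upper bound, which is what makes the scenario a denial-of-service risk rather than just a large cache.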

lksnmnn commented 1 year ago

Good point! In my case, I (mis)use the search without a search word. It simply provides a list of filter options used to narrow down a list of pages. In this case, the number of possible option combinations is reasonably small.

I guess the proper solution would be to write my own "filter tool for subpages" content element (CE), which could be completely cacheable.
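
How small "reasonably small" is depends on the filter setup. As a rough sketch with made-up filters and option counts (none of them taken from ke_search), the number of distinct filter states without a search word can be counted like this, both for single-select filters and for checkbox-style filters where every option toggles independently:

```python
from math import prod

# Hypothetical filters and their number of options.
filter_options = {"category": 4, "audience": 3, "year": 5}


def single_select_combinations(options: dict) -> int:
    """Each filter is either unset or set to exactly one option."""
    return prod(count + 1 for count in options.values())


def multi_select_combinations(options: dict) -> int:
    """Each option can be switched on or off independently."""
    return 2 ** sum(options.values())


print(single_select_combinations(filter_options))  # 5 * 4 * 6 = 120
print(multi_select_combinations(filter_options))   # 2 ** 12 = 4096
```

Either number still has to be multiplied by the number of result pages, but unlike free-text search terms it has a fixed upper bound.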

mbrodala commented 1 year ago

@lksnmnn Please read the linked explanation of the PRG pattern.

"the search form from ke_search already uses GET requests, not POST requests"

And that is the problem, which is why POST instead of GET would be necessary.

"TYPO3 does not cache them because of the way the plugins are registered (USER_INT)."

Which could be changed if the plugin used PRG to provide cacheable output.

"As for the routing configuration: this confuses me the most, because if I understand correctly, one would need to add all possible filters and filter options there to get it to work. However, an editor could introduce new filters, which would break this or require a developer to update the config?"

Exactly. If a filter is used which does not have a routing config, you will get a raw URL parameter for this filter and a cHash.

IMO search filters are not something editors should configure, but developers. In that case, adding routing for them is no problem.
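
The fallback described in this comment can be sketched generically. The following is not TYPO3's routing implementation: `ROUTED_FILTERS`, `build_url`, and the SHA-256 stand-in for the hash are assumptions for illustration. Filters that have a route definition become path segments, while any filter without one stays a raw query parameter and forces a hash onto the URL.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical route definition: these filters may appear as path segments.
ROUTED_FILTERS = ["category", "year"]


def build_url(base: str, filters: dict) -> str:
    """Map routed filters into the path; keep the rest as query parameters."""
    path = base.rstrip("/")
    for name in ROUTED_FILTERS:
        if name in filters:
            path += f"/{name}/{filters[name]}"
    # Filters unknown to the route definition fall back to raw parameters.
    leftovers = {k: v for k, v in sorted(filters.items()) if k not in ROUTED_FILTERS}
    if not leftovers:
        return path
    query = urlencode(leftovers)
    # Stand-in for a cache hash; a real one would also involve a secret key.
    chash = hashlib.sha256(query.encode()).hexdigest()[:16]
    return f"{path}?{query}&cHash={chash}"


print(build_url("/list", {"category": "news", "year": "2023"}))
# /list/category/news/year/2023
```

A filter added later by an editor, say a hypothetical `tag`, would therefore still work, but only as `?tag=...&cHash=...` until someone adds it to the route definition.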

lksnmnn commented 1 year ago

Thanks! I guess @derhansen's point about the potential cache DoS still stands, so it might not be a good idea to force caching here anyway.

My use case is that an editor might want to introduce new categories, tag the content, and make it available for filtering on the website. I do not believe this should require me to update code. However, as said above, ke_search (or search in general) might not be the right tool for the job :).