dokuwiki / dokuwiki

The DokuWiki Open Source Wiki Engine
http://www.dokuwiki.org
GNU General Public License v2.0
4.17k stars 853 forks source link

Proposed improvement to make sidebars cacheable #3496

Closed gerardnico closed 2 years ago

gerardnico commented 3 years ago

Hallo,

The problem Most of the sidebar requires to disallow the cache with the tag ~~CACHE~~ to keep the content relevant.

Why ?

This three points makes that:

Proposed solution

Can we have a logical key (ie the id) instead of the physical one (the file) ?

parent::__construct($id . $_SERVER['HTTP_HOST'] . $_SERVER['SERVER_PORT'], '.' . $mode);

It would then be possible to pass a different id (logical id):

and they would not share the same cache.

Thanks for reading.

Klap-in commented 2 years ago

Typical use if wiki pages are involved is that wikiFN() is used to get $file from $id. tpl_include_page() uses p_wiki_xhtml(), which also uses wikiFN().

e.g. $cachefile = new CacheRenderer($id, wikiFN($id), 'metadata');

So it looks for me there is no difference between using $id or $file, for this case where wiki pages were requested. —— But my feeling is the file vs id is not the problem, but decision when it should be used. You can fiddle with the cache event (i.e. PARSER_CACHE_USE) to influence the caching logic.

The Navi an Indexmenu plugins do certain manipulation of the cache by changing metadata in the syntax renderer, and this event. Also include and move plugin might be interesting. But there are a lot more: codesearch: PARSER_CACHE_USE

General explanation of this event is given here: https://www.dokuwiki.org/devel:caching#plugins —— Actual this should be part of the navigational plugin (or other source of change you have?) you are referring to. Which plugin do you use?

gerardnico commented 2 years ago

Woah, thanks for the reaction.

It's a problem of conflicting cache output. (ie disabling the cache or not would not resolve anything)

I give an example below.

Example:

With the physical sidebar that lists the content of the actual namespace of the requested page.

If you have a namespace ns, the sidebar will generate the list of all ns pages into the physical cache file sidebar If you have a namespace ns2, the sidebar will generate the list of all ns2 pages into the physical cache file sidebar and overwrite the first output.

We could avoid this output cache conflict if the cache key was logical (what you want a logical id) but not physical (not based on the physical markup file path). As of today, the cache key is unique for each physical markup wiki file. Hence, you need to delete the cache each time.

Ps: I already implemented it but outside dokuwiki here

Klap-in commented 2 years ago

What is in your sidebar page that generates a list of pages?

gerardnico commented 2 years ago

Page explorer: https://combostrap.com/navigation/page-explorer/adding-page-navigation-easily-to-your-website-i6zpgpwn

Every plugin that has an output that is dependent on the page requested will get a cache conflict.

gerardnico commented 2 years ago

I'm talking about the render cache, not the instructions cache by the way.

Klap-in commented 2 years ago

Of course, with output that varies based on external info (that is not in the page or it’s metadata), DokuWiki could have wrong cached rendered output. e.g. Indexmenu can generate cache files per user or group in the ACL. You like to have it per namespace of the set of wiki articles shown next to the sidebar.

The wiki page used in the sidebar cannot know these things. So the plugin used in the sidebar should arrange these decisions. A plugin is able to do this by using the PARSER_CACHE_USE event.

Via this event you can change the cache key/cache file name.

Klap-in commented 2 years ago

The PARSER_CACHE_USE event is a more general event. The mode parameter tells for which parser information a cache is made.

Klap-in commented 2 years ago

Where can I find the code of the combostrap system? Is it this: https://www.dokuwiki.org/plugin:combo ?

gerardnico commented 2 years ago

Yes. That's my point. As of today, as a plugin author, I can't use the cache class of dokuwiki because it's based on the physical path of the sidebar (and not the id).

Technically, I create a virtual sidebar for each namespace (that has for instance, the logical cache / output id 'ns:sidebar','ns2:sidebar'.

In the same vein, I'm working on other bar (header/footer of the main content) that are not requested namespace dependent but requested page dependent (share button, permalink, ...)

As of today, each syntax component advertise this dependency.

gerardnico commented 2 years ago

Where can I find the code of the combostrap system? Is it this: https://www.dokuwiki.org/plugin:combo ?

Yes.

https://github.com/ComboStrap/combo

gerardnico commented 2 years ago

Logical Id code for the page (sidebar or any other bar) is here:

https://github.com/ComboStrap/combo/blob/main/ComboStrap/Page.php#L349

And is dependent on the scope variable (the value advertise the dependency - namespace or page). The value is set by the syntax plugin to advertise this dependency.

Klap-in commented 2 years ago

Please note that you can extend the key $event->data->key .= … e.g. sidebar#namespace, or sidebar#namespace:page, etc.., call getCacheName($event->data->key, $event->data->ext), and store it at $event->data->cache in a BEFORE handler for the PARSER_CACHE_USE event.

example is easier than such a story: https://codesearch.dokuwiki.org/xref/plugin/structat/action/cache.php#35

gerardnico commented 2 years ago

Oh my god. You are telling me that even if the event wraps the makeDefaultCacheDecision function, you can change all constructed properties.

I didn't grasp that at all. I saw that you could change the dependencies or the validity but I never thought that you could influence the storage location.

Thanks !

gerardnico commented 2 years ago

From a plugin perspective, it means also that all plugins that influence the cache location are not compatible.

If there is a request list, it would be great to be able to influence the creation of the key (to add a key dependency properties for instance).

Klap-in commented 2 years ago

If more plugins change the key (they should extend it by adding some uniqueness, not overwrite) you will get a lot of extra cached variants of the same page. It is not conflicting, but if you have too many different plugins adding uniqueness, you will less likely get a cached version. If done right, plugins should not add more uniqueness then needed, but sometimes you will have overlap. e.g. if you add user to the key, to prevent ACL skipping.

gerardnico commented 2 years ago

Ah yeah... The actual cache key should also be extended and the corresponding file should be also recalculated. I wil do it.

Many thanks for the new perspectives on how it was designed.

michitux commented 2 years ago

The more I'm thinking about the whole xhtml cache in the case of dynamic content the less I'm sure it is the right approach to enable conditional caching as in the include plugin or dynamic cache keys. Why? The problem is: to ensure that what you are displaying in your tree of pages represents the current set of pages, you need to re-scan the directory to make sure no page has been added and no page changed its title (DokuWiki supports external edits). Further, you need to ensure that a) you are not listing a page a user has no read permissions for and b) you are not excluding a page that is not part of the cache because the user for whom the cache has been generated had no permission for it - that's exactly the cause of dokufreaks/plugin-blog#106 where I just forgot this case. Therefore, if you want to fully support everything DokuWiki supports, enabling caching for a sidebar with a tree view of all pages is not as simple as just including the current page's id in the cache key.

Of course, if you know more, like all pages have the same ACL rules and there are no external edits, you can start caching more aggressively. However, for the general case, such knowledge is not available.

At least in this general case, I think in the time that it took to verify that the cache can be used, you could have just re-rendered the content of the sidebar. Rendering is fast, at least if all disk accesses are cached. And the latter is maybe something where things could be improved. For example, you could create an internal cache with the list of pages including all metadata you need (but without ACL checks applied) and update that cache when the indexer is run and then just use this cache to generate the navigation tree. This cache could also be in an SQLite database, allowing you to quickly select relevant subsets of the pages even in large Wikis. I'm thinking such an approach with an extra cache could also be very useful for the blog plugin and the namespace include option of the include plugin (for the include plugin we could for example even cache pre-processed instructions that for example just contain the relevant section).

gerardnico commented 2 years ago

Thanks for the feedback.

For sure, a cache system can become pretty complicated.

The use case is for public publishing (open frontend) therefore there is no ACL and the page performance is pretty critical. A lot of read for very few write (almost static)

That's true that page metadata consistency (title, description) is still not perfect while editing. Dokuwiki miss a metadata store change event or I didn't see it.

Success with your dev.

gerardnico commented 2 years ago

By the way, for real dynamic application, you can get rid entirely from the cache by making javascript ajax call :)