joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0
4.73k stars 3.64k forks source link

Module caching breaks WebAssetsManager #37122

Closed nikosdion closed 2 years ago

nikosdion commented 2 years ago

Making the bat-signal to @bembelimen @richard67

WARNING! This has some moderate security implications which can lead to partial Denial of Service on Joomla 4 sites using caching under certain circumstances.

Steps to reproduce the issue

Go to the frontend of your site and click through IN THIS EXACT ORDER:

Expected result

I can see the contacts

Actual result

Unsatisfied dependency "com_categories.shared-categories-accordion.es5" for an asset "com_categories.shared-categories-accordion" of type "script"

System information (as much as possible)

Completely irrelevant. I have reproduced this issue on a plethora of local and live environments with PHP 7.4, 8.0, 8.1 and the latest versions of Safari, Chrome and Firefox. This is an architectural Joomla issue.

Additional comments

You can confirm this only happens with conservative caching enabled by turning off the cache OR setting it to Progressive and trying the same click through path. You will see that the expected result of showing the contacts does happen. Therefore we are 100% sure this is a caching issue and ONLY with the Conservative caching option.

This is far from being an edge case. Real world sites are affected:

The problem has to do with the way Conservative caching works with mod_menu and extensions having dependencies to assets from extensions other than themselves.

For example, com_contact's “List All Categories in a Contact category Tree” menu type corresponds to com_contact's categories view. The default.php view template for that view contains this code:

$wa = $this->document->getWebAssetManager();
$wa->getRegistry()->addExtensionRegistryFile('com_categories');
$wa->useScript('com_categories.shared-categories-accordion');

This is fairly normal. This com_contact view has a dependency to an asset from com_categories. As we all know, the WebAssetManager does not perform discovery of dependencies. It expects the PHP code running on the server to tell it which asset registry (joomla.asset.json) files to load. If the correct file is not loaded it will throw a Joomla\CMS\WebAsset\Exception\UnsatisfiedDependencyException. Therefore com_contact does exactly that in the categories view.

When you access the Contacts menu item in the reproduction steps above Joomla caches com_contacts' output including the external dependency and the fact that it needs to load the com_categories assets file. That's why revisiting the Contacts page itself works even when the page is cached and the code in the view template does not run.

You know what else gets cached? The menu module (mod_menu). However, if you take a look at what is cached you will see that the only thing that gets cached in the assetsManager key is the assets array. The registryFiles array is NOT cached.

When you access a page under a menu item (which does not have a manu item of its own, like clicking the Uncategorised link in our example) Joomla loads the cached mod_menu for the active Itemid. This also applies the head data included in the cached module which, I repeat, DOES NOT include the registryFiles the WebAssetManager needs to load to find these assets. In our example it registers a dependency for com_categories.shared-categories-accordion WITHOUT telling the WebAssetManager that the media/com_categories/joomla.asset.json registry file has to be loaded.

Since our page is for a different view its view template page does not try to load the com_categories registry file; it has no use for it, that's pretty normal. Yet, we have the darned com_categories.shared-categories-accordion dependency injected by mod_menu.

When Joomla tries to render the HtmlDocument's head it calls Joomla\CMS\Document\Renderer\Html\MetasRenderer::render which calls WebAssetManager::getAssets() which in its turn calls the enableDependencies method in WebAssetManager. As soon as we hit the com_categories.shared-categories-accordion asset added by the cached menu module the WebAssetManager cannot find that asset's com_categories.shared-categories-accordion.es5 dependency (since we have not loaded the com_categories' registry file) and throws an exception.

Now, what happens if you follow a different click through path on our site after deleting all cache? If we try to visit the internal page (index.php/contacts/uncategorised) BEFORE the top-level menu item page (index.php/contacts) we see that BOTH pages work. In fact they work before and after they get cached. Why, you may ask. Let me enlighten you.

When you access the internal page first there are no dependencies to com_categories. The menu module (which does not itself use such a dependency, mind you!) gets cached without the com_categories.shared-categories-accordion dependency.

When you then visit the to-level menu item's page the cached mod_menu (without a com_categories dependency!) is loaded. However, the com_contact page does introduce this dependency and it's correctly cached along with the registry file where it is defined. Therefore trying to load this page after it's cached works.

Wait. What just happened here?! Joomla behaves completely differently depending on the click through path! This is where things have gone TERRIBLY BAD.

The cache can expire at any time, right? And visitors — including search engines — can follow whichever click path through the site, right? Yes, people, you got it right. If a visitor or a search engine follows the “wrong” click through path when the cache is expired they might trigger this caching bug.

Now, you are thinking, darned, this is pretty bad already. And yet, it gets worse.

Not content with reproducing an issue in lab conditions, I tried to divine how the heck did I see the homepage of my site which does not use com_contact at all to display an error linked to com_contact. And, what do you know, I hit the motherlode.

Which menu Itemid is used to cache modules when there is no actual menu item pointing to that component? The home page menu Itemid, of course! Therefore a bad URL (from a stale search engine link or someone being a malicious son of a gun) can cause your site's homepage to throw an error.

I am not going to say in a public issue how trivial it is to trigger this to cause a Denial of Service attack on the homepage of any site which uses Joomla 4 and has caching enabled. I already gave enough away. Core maintainers who don't know can contact me privately (use my site's contact page if you don't have my email) to give them the information.

Your only recurse? Turn off caching globally; OR set all of your modules to not cache.

Technically speaking, the problem is not triggered by mod_menu specifically. It's triggered by the caching of modules. That's because Joomla\CMS\Cache\Cache::setWorkarounds tries to divine which dependencies were added by the module but does not in fact process the assets correctly. If there are ANY asset dependencies registered in the WebAssetManager — even those added by a different component or module previously rendered on the page, or even a plugin — it will cache them as though they belong to the module being currently rendered. This is bad.

Of course this is even more serious than our already serious issue! Yeah, this is the bug that keeps on giving, ladies, gentlemen and non-binary folks! Modules cache the dependencies of whichever component view they were rendered with when they got cached. If the Itemid remains the same but the component or view changes these dependencies may be invalid (immediate exception, your site is dead) OR not apply to the current page (possible JavaScript error with unknown consequences).

Well, you might ask, what about Progressive caching. It seems to be unaffected. Ah, my sweet summer children! Yes, it does seem to be unaffected but, no, it is affected all the same. The difference is how Joomla works under the hood with each cache mode. With Conservative caching each module is cached individually. When you ask Joomla to render a module position it goes and retrieves each and every module from the cache to render it on the page. As a result we hit the caching workarounds bug pretty consistently.

With Progressive caching each module position for each page of your site is cached as a separate com_modules cache entry. However, components do not go through the Cache::setWorkarounds and Cache::getWorkarounds, sidestepping the bug in Cache::setWorkarounds. In fact, they cache the entire WebAssetsManager state including which registry files to load. That's why Progressive caching seems to be unaffected: caching the modules and the com_modules entry happened at the same time, therefore you're served the cached com_modules entry which registers the (wrong) dependencies AND the registry files where they can be found. Ah!

However, I did say that Progressive caching is affected, didn't I? Well, yeah. If the module cache has not expired BUT the com_modules has expires Joomla will try to reconstruct the com_modules cache entry. This will do exactly the same thing Conservative cache is doing and your site will be broken. Normally you won't see that happening because the module cache time is divisible by the global Cache Time, therefore both caches expire simultaneously and you don't hit the bug. However, if you have at least one module with a cache time that's not divisible by the global Cache Time the two caches will get out of sync and you'll hit the issue as long as this desynchronisation happens when you are visiting the inner page (not the top level menu item page). This is extremely unlikely but not impossible.

Therefore, modules caching is broken and can cause a serious issue that may result in a partial Denial of Service for your site. So, for me, this is a security issue as much as an architectural issue.

In its core, it's a simple bug though; a mere oversight in Cache::setWorkarounds which is not doing what it is supposed to do as we've seen before on Joomla 3 and 4. All it needs is one more special treatment case for the WebAssetManager's assets.

PR incoming!

I am filing this issue just to link to my PR coming in a few minutes.

bembelimen commented 2 years ago

Good catch, thanks for the explainations and the fix!

Fedik commented 2 years ago

I did not read whole thing, will do it later. Can you please test #35776, we already have for Module caching

Fedik commented 2 years ago

The registryFiles array is NOT cached.

If it is true then that something went wrong somewhere. Because it should be already cached with: https://github.com/joomla/joomla-cms/blob/d5cb4132d346f1b83321a917cbb209bbfc3f9346/libraries/src/Document/HtmlDocument.php#L166

Exactly for a reason to keep all dependencies.

nikosdion commented 2 years ago

Nope, the other PR will not work. They’ve missed the fact that the assets key is an array of arrays and that the INTERNAL arrays is where the problem happens (you can’t array_diff them because web asset items can’t be converted to string).

Moreover they fudge the registry files which is unnecessary, these are already handled correctly (and is part of the problem).

In other words, that PR is broken.

I was there about an hour into my debugging. I decided to do it differently because I spent the extra time to understand exactly WHAT is broken and HOW it affects Joomla.

nikosdion commented 2 years ago

Also not that the REAL PROBLEM has nothing to do with registry files being cached.

It has everything to do with modules caching dependency assets THEY DID NOT INTRODUCE.

Sorry for caps, I’m on mobile now, can’t do proper italics.

Fedik commented 2 years ago

Okay, thanks for explanation

Fedik commented 2 years ago

I tested both, your and @Denitz fixes, both works good, for current issue. Keeping registry files is necessary, because some dependency may be inactive while call of $wa->getManagerState(). This why I would prefer second PR, also he was first ;)


And I did little check also.

The registryFiles array is NOT cached.

You are right this actually one of main cause of the problem, setWorkaround() ignores it.

Can check with easy fix:

if ($now === 'assetManager')
{
    $newvalue['registryFiles'] = $value['registryFiles'];
}

Between this lines https://github.com/joomla/joomla-cms/blob/74a225ecddb3c4db82965b83833fc2625236fef1/libraries/src/Cache/Cache.php#L702-L703

It has everything to do with modules caching dependency assets THEY DID NOT INTRODUCE.

This also true. setWorkaround() ignores all assets for module.

alikon commented 2 years ago

missed the relation with #35776 btw Nomen omen setWorkaround :smile:

nikosdion commented 2 years ago

@Fedik The PR from @Denitz does NOT work. True, it makes the symptoms disappear but it DOES NOT FIX THE REAL ISSUE. This is the same as my second attempt at fixing it today (my PR is the third and final attempt). All three worked insofar they solved the symptoms but only the third and last which ended in this PR is actually correct. Let me explain.

  1. Auto-discover dependencies (around 10:40 EST) My first attempt was to patch the WebAssetManager to auto-discover the extension for an unresolvable dependency. Explode the asset name by dots, assume the first one is the extension name per Joomla best practices and load that extension's registry. However, this begged the question why the heck do I have an unresolvable dependency.
  2. Cache registryFiles (what @Denitz is doing; around 13:20 EST — I had to do some paid work in the meantime). Okay, we have unresolvable dependencies because the module caches assets without caching the registryFiles. So let's do that and Bob's your uncle. It's great, just two lines of a code: an if-block and a line to force the registryFiles keys when $now === 'assetManager'. Or so I thought.
  3. Actually parse assets (about 14:30 EST). As I was writing my reasoning behind this I realised that the second solution IS A BUNCH OF BULLSHIT. Like, dude, why the fsck does mod_menu cache a dependency for com_categories even if it comes from com_content?! THIS IS THE MENU. IT DOES NOT MAKE USE OF THESE DEPENDENCIES. ALERT ALERT ALERT! The ACTUAL problem is, therefore, that the module caching is adding assets the module HAS NOT inserted into the document. This is because the code in setWorkarounds only works for SINGLE LEVEL ARRAYS, not arrays nested THREE LEVELS DEEP like the state returned by the WebAssetsManager.

Again, all three solutions relieved the symptoms. However, the first two ONLY treat the symptoms, they do not fix the real bug: module caching caches assets which were not added by the module itself.

Therefore the PR from @Denitz is wrong as it does not fix this problem. With his PR the module caching STILL caches assets not added by the module, it just adds the registry files to go with them.

Yes, caching random assets in a module is wrong.

First of all, it means that if the same cached module is used across two or more components which load different JavaScript or CSS assets with conflicting code you'll get a conflict on your site. An extension will appear broken even though its developer did NOTHING WRONG.

It's not a matter of “being first”. It's a matter of “being correct”.

If I may also add something to your “being first” comment: you are immature and dangerously reckless for sitting on that issue FOR FIVE MONTHS. Let me say this again. Even though you were fully aware that module caching does not work since September 2021 you did nothing to fix it in Joomla 4.0 and let Joomla 4.1 be released with the same grave issue.

You see, it's not just about modules being cached. This is a major security issue.

Here's how to bring down the home page of any Joomla 4.0 and 4.1 site which has Conservative caching enabled. Keep accessing the URL /index.php/component/contact?view=categories every 10 seconds. This works even if you don't even use com_contact on your site! This is what happened to me in October and I'd been trying to figure out why ever since. The way this works is that Joomla uses the default Itemid (Home item) when the URL being accessed does not itself resolve to any menu item. Once you hit that URL as soon as the cache of any of the modules on the home page has expired it gets cached again with the assets set by the com_contact categories view. When you try to visit the legitimate homepage this wrong dependency will try to get loaded and the site breaks. Use a botnet to hit it every 7–12 seconds and you have a persistent Denial of Service to the homepage of the site that the site owner cannot defend against except by turning off the cache — in which case the additional processing required per page will make the site slower to the point that the site might actually be unable to serve content (again a DoS). Boom! Joomla, the CMS trusted by millions who are let down by the one person who dickered about this issue for five months.

But, yeah, please go on and tell me why "being first" is important and I'll show you someone who doesn't understand site security. This issue is why Akeeba's site has not been migrated to Joomla 4 yet. I can't have a company making security extensions for Joomla when we know that there's a bug in Joomla which can randomly bring down our site (like it happened with my blog) — let alone now that we know there is a MAJOR SECURITY ISSUE in Joomla which could bring down our site's homepage by simply accessing a URL over and over! What kind of shitshow is this?! I bloody warned you this is a security issue but no. You wanted to play the “you were not first” card implying I don't know how to do a proper issue analysis. Okay. Enjoy your zero day public disclosure.

Fedik commented 2 years ago

I am sorry, maybe I not understood something. Your and Denitz compare Assets and add missed. However you compare objects but Denitz compare keys/names, that all difference. (Well, probably compare objects would be more safe)

However both works.

As I was writing my reasoning behind this I realised that the second solution IS A BUNCH OF BULLSHIT. Like, dude, why the fsck does mod_menu cache a dependency for com_categories even if it comes from com_content?!

I tell you something, but keep it secret, okay? I wrote this bullshit :wink:

You always should keep registry files which was used when you added specific asset. Your second step was actually also good one. However, the bug here is that the module should cache only own registry files, instead of all.

if you add registry files diff to your PR, that would be nice.

For info: the registry file parsed only once per file, even though it may be added 100+ times in runtime.

nikosdion commented 2 years ago

@Fedik With the current state of the PR I was seeing the mod_menu JS being loaded without a problem in all pages where the cached module was served. This makes perfect sense since the extension's registry file is loaded when the extension is rendered, be it a module or a component.

Did I miss something? If you can give me reproduction steps and what happens vs what should happen I can debug it in the morning and amend the PR.

That said, I still call bullshit on your proposed solution 😛 What you want to do is cache ALL of the registry files already loaded on the page when we are caching a module. Here's a practical example (when I call bullshit on something the onus is on me to prove it — fair's fair!). When we reach that point of execution in the reproduction site (trying to cache mod_menu) I see a bazillion registry files including com_contact, com_content, core, bootstrap and so on and so forth. Clearly mod_menu has not added any of them on the page. Caching them with the module would be wrong.

If there's a case where a module adds a different extension's registry which is not already loaded and it doesn't get cached that would indeed be an issue that needs fixing. The only case which we could never catch, though, is if the module is adding a registry file for a different extension which is already loaded on the page. The solution to that is rewriting the module to load the registry files in its main entry point instead of its view template.

As a matter of fact, components should be doing the same: adding registry files of external extensions should happen in the Controller, not the View or view template.

If we want to talk about proper architecture, okay, let's do it. The way WebAssetManager is currently implemented is wrong and this is the root of all evil. It should perform auto-discovery of the registry files (the first solution I tried, per my comment above). It shouldn't expect the developer to explicitly declare the external extension when the extension's name is already present in the name of the dependency which will be loaded anyway.

If this change is applied we no longer need to have the WebAssetManager return a state which includes both assets per type AND registry files; we only need the assets per type. Restoring the active assets would automatically load the corresponding registry files since all assets follow the pattern extensionName.assetName.

This is a very trivial fix to implement. If you really want to fix the WebAssetManager architecture and be done with all the caching and asset loading issues once and for all I can make a separate PR. I stashed that code this morning thinking it might come in handy.

Fedik commented 2 years ago

Did I miss something?

This:

If there's a case where a module adds a different extension's registry which is not already loaded and it doesn't get cached that would indeed be an issue that needs fixing.

It is made to allow to include eg com_foobar registry in mod_blabla/tmpl. Therefore it should be cached within mod_blabla

And here nothing wrong with it. It like you would call addScript(blabla.js) in mod_blabla/tmpl, but with dependencies and stuff.

What you want to do is cache ALL of the registry files already loaded on the page when we are caching a module

No, see my previous point. Only registry added by module. Your PR missed this "registry diff". However even when module will cache all, nothing bad will happen, because as I wrote before, each registry file parsed only once, even when each of 100+ module will add it.

nikosdion commented 2 years ago

https://github.com/joomla/joomla-cms/issues/37122

@Fedik While I can add a check to compare the registryFiles this is still incorrect. The whole caching workarounds is centred around a blatant lie: that you can reliably conclude which assets and scripts were added by an extension by comparing the header before and after.

Comparing before and after document head states is wrong

You have a site with the following extensions:

The /menu.html page on the site is of type com_diner.menu, has a mod_eggs module and has a Custom HTML module with a plg_system_diner plugin code in it to promote Poppy's World Famous Raspberry Pie.

The menu page can be cached, alright. What happens when the /menu.html page is rendered by Joomla?

First, Joomla loads the plugins. Our plg_system_diner plugin anticipates displaying Add to Order buttons so it preloads its dependencies onAfterRoute.

WebAssetManager State 1 (after plg_system_diner):

Joomla then renders the component.

WebAssetManager State 2 (after com_diner):

What gets cached is the difference of WebAssetManager states 2 and 1, i.e.:

If at this point you haven't gone “oh, crap” it gets worse.

Joomla then renders the modules. So, first we have the mod_eggs module.

WebAssetManager State 3 (after mod_eggs):

What gets cached for the module is its difference between WebAssetManager states 3 and 2, i.e.

Then Joomla renders the custom HTML module mod_custom. It does not load any assets, no assets are cached for it.

And now, you disable plg_system_diner

KA-BOOM! YOUR SITE IS BROKEN

You see, when you try to access the /menu.html page Joomla tries to load com_diner's menu view from the cache.

However, when it got cached the com_diner.menu#style and com_diner.menu#script dependencies were NOT cached as they were already present in the document. So now com_diner appears to have ‘broken CSS‘ and ‘not work’ because neither its CSS nor its JavaScript is loaded.

(Yes, it also didn't register the registryFile but that doesn't matter because Joomla always loads the registry file of the extension being rendered for components, modules and templates, therefore media/com_diner/joomla.asset.json is already loaded).

Let's say that you solve that by moving the asset registation from the view template into the Controller. The Controller code is always executed up to the point you call display(). This is the point were the cache is checked. So anything before that, including registering assets, will always run.

And now, you remove the plg_diner_eggs.selection from com_diner

You decide that if you have the mod_eggs module anyway there's no need to have a separate eggs selection in com_diner. Therefore the plg_diner_eggs.selection dependency is removed from com_diner.

KA-BOOM! THE SITE IS BROKEN, AGAIN!

You see, when mod_eggs got cached we didn't register any dependencies or registry files because everything was already loaded before we got to the module. However, we now no longer have com_diner loading plg_diner_eggs.selection. Therefore this required dependency is no longer loaded for mod_eggs.

Neither the developer nor the site owner did anything wrong!

The developer is loading the correct dependencies through Joomla's WebAssetManager. They expect that Joomla will figure its shit out but it doesn't.

The site owner reasonably expects that unpublishing an optional plugin won't break a component not using the plugin and that toggling a switch in a component won't break a module which doesn't use the component. And yet, here we are.

You CAN NOT cache every single dependency already added on the page

I think it's obvious and why Joomla's setWorkarounds is already doing the diff. In case it's not obvious: you can't pull in random dependencies to pages OTHER than what has been already cached, it may cause things to appear wonky (CSS conflicts) or not work at all (JavaScript conflicts) without the developer doing anything wrong!

So what CAN we do?

Joomla does not have a way to know which dependencies are positively being added by a specific extension. This would require having some sort of WebAssetManager running operations log with checkpoints. We would set a checkpoint before rendering an extension, set another one after rendering an extension and log the actions taken on the WAM between these two checkpoints. This is the only way to ensure 100% accurate caching.

The second best option is to document that loading dependencies from a view template is UNSUPPORTED AND WILL BREAK CACHING. In this case the assets loading is moved into the Controller (components), entry point (modules) or helper classes (templates). However, this also prevents JLayouts from loading dependencies which is unrealistic.

What you asked me can be done but IT IS STILL BROKEN

So, yes, I can do a diff for registryFiles. However, caching will STILL be broken.

If you want everything to cache correctly we need to have the WebAssetManager AND the HtmlDocument set checkpoints and log actions: registry file added, dependency added, dependency removed, dependency set, dependency unset, script added, style added.

Anything else will ALWAYS break in the real world.

So, yes, I am going to address this ONE point you made but I am warning you that this WILL NOT WORK CORRECTLY in the real world because Joomla's caching is inherently broken, based on assumptions which are NOT TRUE in the real world.

If you are so strict about architecture, fix it, dammit!

Fedik commented 2 years ago

Clarification about your com_diner example. If com_diner has plugin dependecy (that strange but) then it should add registry file on it own, explicitly. $wa->getRegistry()->addExtensionRegistryFile('plg_diner_eggs'); Same for mod_eggs, it should explicitly add $wa->getRegistry()->addExtensionRegistryFile('plg_diner_eggs'); And if cache will work correctly, all will continue to work.

it may cause things to appear wonky (CSS conflicts) or not work at all (JavaScript conflicts) without the developer doing anything wrong!

Maaan, that why we have dependecy ;)

About breakpoints, I doubt it will help, but I am sure it will add much complication and no one really interesting in it ;)

Tip and tricks: If you do not want to rely on registry files, then you should cache all active and dependencies assets. For this, need update this: https://github.com/joomla/joomla-cms/blob/05d3b143887b4ef5286f80c388118b8e7b5b6a59/libraries/src/WebAsset/WebAssetManager.php#L748-L754 To this:

public function getManagerState(): array
{
  // Make sure that all dependencies are active
  if (!$this->dependenciesIsActual)
  {
    $this->enableDependencies();
  }
  return [
    'registryFiles' => $this->getRegistry()->getRegistryFiles(),
    'activeAssets'  => $this->activeAssets,
  ];
}

Then update getHeadFata, to make sure all active is loaded: https://github.com/joomla/joomla-cms/blob/d5cb4132d346f1b83321a917cbb209bbfc3f9346/libraries/src/Document/HtmlDocument.php#L174-L180 To this:

foreach ($assetNames as $assetName => $assetState)
{
  $waState['assets'][$assetType][] = $wa->getAsset($assetType, $assetName);
}

Then you do not need to keep registryFiles, all assets will be in head data (all active and their dependency).

nikosdion commented 2 years ago

@Fedik I seriously doubt your ability to understand how Joomla works and how it's used in the real world, i.e. outside of a lab environment using only core extensions.

If com_diner has plugin dependecy (that strange but) then it should add registry file on it own, explicitly.

Yes, the dependency to the the plugin's asset DOES MEAN that the component adds the plugin's registry file. I explicitly stated that this happens (see WebAssetManager State 2 and read CAREFULLY through the registry files loaded, dammit!). All you are proving is that you either didn't read what I wrote or do not understand how this works.

In any case, this was NOT the point I was making. My point is that setWorkarounds does a diff and that diff is UNRELIABLE in the real world.

IF any extension running before me has already registered a dependency / registry file that I am also registering THEN my cache document will NOT include this common dependency.

This is because setWorkarounds caches the diff between before-I-ran state and after-I-ran state.

Therefore Joomla's cache is UNRELIABLE in caching dependencies BY DEFINITION.

Try thinking about this. It became instantly clear to me when I started debugging setWorkarounds yesterday. It was a major “oh shit” moment because I realised that no matter what Joomla's caching of modules and extensions will ALWAYS be broken.

Maaan, that why we have dependecy ;)

Bzzzzt! WRONG!

Dependencies only work when: a. They are set in the WebAssetManager; and b. The registry files for them are loaded

The problem is, as I have been saying, that Joomla's setWorkarounds: a. Will not include all assets for the cached extension IF the same asset has been registered by ANOTHER extension which rendered before it b. Will not include all necessary registryFiles for the cached extension IF the same registryFile has been loaded by ANOTHER extension which rendered before it

So, yeah, I do get why we have WebAssetManager and I enthusiastically using it in my own extensions BUT the fact remains that caching in Joomla is still a rolling Dumpster fire which breaks the WebAssetManager.

Tip and tricks:

Jesus wept! THIS IS WRONG! You are trying to cache ALL DEPENDENCIES FROM ALL EXTENSIONS ALREADY RENDERED in the component's or module's cache! Yes, thank you, I know how to do this but I have already explicitly stated that it is really stupid.

Let me give you a simple, silly example.

Let's say option=com_diner&view=menu is on two Itemids, 42 and 69. And yes, real world sites do shit like that because you may want the same component and view but different modules (remember that Joomla controls display of modules by menu item ID!). Real world use case example: you want to have a home page for the in-store iPads which display the menu with all the trills and frills (modules designed to sell diners more stuff). You also want another instance of the menu without trills and frills for use on the host's counter for the host to enter to-go and pickup orders given over the telephone.

Itemid 42 includes a module which loads a piece of JavaScript which shows a popup, just to make this simple. A popup that must only be shown on Itemid 42.

Now, you are caching com_diner.menu with the popup dependency on Itemid 42.

When visiting Itemid 69 the cacheId for the component view is the same so instead of running com_diner.menu you get the cached document. The cached document injects the popup dependency so now Itemid 69 shows the same popup as Itemid 42 EVEN THOUGH the module that is meant to control this behaviour is only published on Itemid 42!

Are you seriously suggesting that this is desirable behaviour? I call this “Joomla is fucking broken”.

Moreover, your “tip” is doubly broken because it does not take into account that the dependency itself might change between the document being cached and the document being served. For example, someone visits Itemid 42 and the com_diner.menu is cached for the next 15 minutes. 2 minutes later an update to com_diner is made — in a way which does not clear the cache, e.g. a CI deployment pipeline instead of going through the extensions installer — and the com_diner.menu#style asset now depends on the brand new com_diner.menu_dark#style asset.

If you are caching the asset name (com_diner.menu) this is not a problem. The new registry file is loaded with the added dependency and asset and rendering the page now includes both com_diner.menu and com_diner.menu_dark.

With your approach of cachign the asset object only com_diner.menu#style is loaded, without com_diner.menu_dark.

That's why Joomla is caching asset NAMES, not asset OBJECTS.

Do I seriously need to explain how WebAssetManager works and why to you?! Go grab a coffee, man. You sound like you need one.

Joomla's caching is still broken

It doesn't matter if you use the WebAssetManager or go cowboy style and inject styles and scripts like it's 2012. Joomla's cache, namely its setWorkarounds, only works on a diff of the head data.

As far as I, as a 3PD, am concerned I will move all my WebAssetManager stuff from the view template into the Controller of my components and the entry point of my modules. This will mitigate whatever fuckery goes on with setWorkarounds as I can guarantee that the code which runs despite cache being enabled will inject the correct assets and registry files.

However, as I am a 3PD I also need to use Joomla's layouts which also inject stuff in WAM. I cannot control them and I cannot guarantee that Joomla won't end up in a situation where I have done nothing wrong but Joomla screwed up and broke my clients' sites. So all I can do is tell them “if you change anything in the backend of the site, clear cache because Joomla is broken”.

I already explained how Joomla can be fixed to not have this problem but you explicitly denied it's something the Joomla project is interested in because you guys don't understand how your product is used in the real world:

About breakpoints, I doubt it will help, but I am sure it will add much complication and no one really interesting in it ;)

Therefore I am going to link to your denial next time someone complains about a caching issue, letting them know that I did explain it to the Joomla core maintainers, I explained the nature of the problem, I explained how to fix it and even volunteered to help with the fix but they were simply oblivious to how their product is used and against it getting fixed.

Thanks for playing.

brianteeman commented 2 years ago

Guys take a chill pill please and @nikosdion please when you have a debate with one person refrain from tarring everyone with the same brush. Your last paragraph is offensive to everyone (including you) who contributes to Joomla and I am sure that its just your frustration playing out so please talk about the topic and not the people (especially all those who have not said a word here)

brianteeman commented 2 years ago

and please re-open the pr. It has several good tests and closing it is just biting off your own nose to spite your face - and that hurts

Fedik commented 2 years ago

Yeah I know that setWorkarounds is broken for modules.

However my comment was with assumption that it "doing diff correctly" or read it as "it was fixed to doing diff correctly".

I will move all my WebAssetManager stuff from the view template into the Controller of my components and the entry point of my modules

I would not recommend doing it. It a templating stuff. It same as you would call addScript(foobar.js) in component controller. Like it was in 1.5, 2.5.

and please re-open the pr.

@brianteeman It is open, no one against it, we just talkind about random stuff here, about improvements , to avoid spam there ;)

nikosdion commented 2 years ago

@brianteeman Is it really offensive?! What am I expected to do when I have a user of my free of charge extensions who is affected by this issue?

Tell them that I am doing something wrong when I am, in fact, using Joomla exactly as the core does? They'd demand a fix which I can't offer because it's not my code that's broken to begin with.

Hack core to fix Joomla's caching? I refuse because core hacks are fundamentally wrong and I am philosophically opposed to them.

The only option I have is to tell my users the truth: yeah, Joomla is broken, I already explained how, I volunteered to fix it, they didn't want to, here's the issue.

I am sorry that you find the objective truth offensive. If y'all don't want to be offended then let's try to actually fix Joomla. Right?

My offer to help is still open. If anyone's got a problem with the code being attributed to me, I don't care about attribution! You can commit the code under your name for all I care. I just want caching to be fixed for EVERYONE using Joomla out there in the real world! My sites can survive without caching, many real world sites can't. That's the beginning, middle and end of my motivation to contribute.

As for the PR, nobody closed it. It's the issue that's closed which is standard operating practice in this project when there is a PR for it. My PR will remain open because I believe it's the best short term / non-b/c breaking solution to the module caching issue. At the very least it does not cause any more issues than it solves.

nikosdion commented 2 years ago

@Fedik

Yeah I know that setWorkarounds is broken for modules.

It is very broken for modules but it's also broken for components.

However my comment was with assumption that it "doing diff correctly" or read it as "it was fixed to doing diff correctly".

I am honestly confused by the wording. To make it clear, what I am saying that it BOTH does diff wrong AND that doing a diff in and of itself is wrong.

Doing a diff is wrong because we can't know if the extension we rendered tried to use an asset, add a registry file or inject head data which was already used / added / injected by an extension BEFORE it.

If two extensions try to add the same thing in the document head only the FIRST one gets cached with that thing. The latter one doesn't and this can lead to problems.

I would not recommend doing it. It a templating stuff. It same as you would call addScript(foobar.js) in component controller. Like it was in 1.5, 2.5.

I disagree and I would say that it depends.

If all my views need a common dependency (e.g. a common CSS file) it does not make sense adding it on every view template, it makes sense adding it in the Dispatcher. The user can override it; if they override it with an empty file it won't even get loaded.

Moreover. I've been there and done that long before Joomla did. Back in 2013 – 2016 I was loading static resources in view templates.

Problem 1: every time I updated something my users would have to go over all their template overrides (they never did and even if they did they complained a lot). This is PARTIALLY solved by the WebAssetManager but only if you create a preset asset for each and every view and layout of your component so you can manage asset loading by changing joomla.asset.json. Of course you cannot override joomla.asset.json just like you can't override the Controller so what exactly did you gain? Nothing. You're either forcing people to review all their view templates even if you only changed asset loading OR you are giving the same inflexible one-size-fits-all asset management you'd be giving them by setting the assets through the Controller.

Problem 2: the only way to figure out whether something should conditionally load was to have the Controller pass model state for the view object to set a property, or set a property directly in the view object, so the view template would control loading of the static resource. This was not a problem with FOF but it is a problem with core MVC, it requires overriding the getModel / getView methods in the Controller, otherwise display() passes a virgin model object to a virgin view object and we can't pass any kind of state from the Controller. Joomla "solves" this by having the model directly access the input data, violating the separation of concerns. So, architecturally, you're screwed anyway!

Technically, this would be solvable by middleware. Conceptually, they would sit at the same level as the Controller. Hence my suggestion to use the Controller since we lack middleware in Joomla 4 and maybe, just maybe, we'll only have them in Joomla 5.

I actually use a mix of loading dependencies in the Dispatcher, Controller, View or view template depending on the context. For example, on Akeeba Ticket System I have a global switch for loading a custom frontend CSS file for when your template isn't based on Bootstrap 5. This file is loaded in the Dispatcher instead of polluting all my views with the same if-block. When I know that a dependency I am loading only works if I pass data to the frontend I do that in the View or Controller, depending on where this data comes from, typically the View (I don't remember having had to do that in the Controller except some very special cases in the backend of Akeeba Backup).

Anyway, this is a moot point if caching isn't fixed; I can't keep on doing anything in the View or view template if it means that I run the risk of my asset not being properly cached because another module or plugin I've shipped with this extension also uses the same asset. I have to do it in the Controller even when it's not architecturally appropriate :(

brianteeman commented 2 years ago

@brianteeman Is it really offensive?! What am I expected to do when I have a user of my free of charge extensions who is affected by this issue?

It is the way that you are blaming everyone else when only one person has even commented

As for the PR, nobody closed it.

sorry I didnt notice this was an issue not a PR - I was responding on my phone

brianteeman commented 2 years ago

I assume a post has been deleted as I dont see it here but I got an email - or maybe its a cache thing.

Anyway to answer it is very simple. Just because one person doesnt like something doesnt mean it wont be accepted. The final say is the release lead.

nikosdion commented 2 years ago

I deleted my comment because what you wrote @brianteeman wrote is nonsense in practice. All maintainers with commit rights have the authority to accept or decline a contribution without any further checks or an appeal process. 99% of my interactions with the Joomla project is through the maintainers, not the release lead. I never had a release lead reverse a decision made by a maintainer — even though 6 to 36 months later it was evident that I was right all along, without even getting a simple “sorry, we dropped the ball”.

Moreover, when it comes to architecture, which release lead should I talk to? The current minor version's, the next minor version's or the next major version's? Okay, let's try all of them and see who responds.

Pinging @bembelimen as the release lead of Joomla 4.1, @roland-d as the release lead of Joomla 4.2 and @HLeithner and @nibra as the release leads of Joomla 5. Sorry if I forgot anyone, feel free to invite them into this discussion.

Joomla's Cache is pretty much broken right now.

Each extension's (component or module) cache document has a head section. This is calculated in Cache::setWorkarounds from the diff of the HtmlDocument's state before the extension is rendered and the state after the extension is rendered.

In theory, it should give us the WebAssetManager assets, WebAssetManager registry files and legacy scripts and styles added by the extension just rendered to the document.

In practice, this does not actually happen. If an extension before me has added the same assets, registry files, or legacy scripts / styles the diff between the two states will NOT catch these common items which were already present in the document. This may happen, for example, if two or more modules use the same layout which loads a dependency; or if a module is using a component's assets on the same page the component is also rendered before it.

The problem is that if one of the cached assets is unpublished or used in a different page context it may lead to missing assets, missing registry files, or missing legacy scripts/styles. The result can vary from the site looking / acting wrong to throwing an exception from the WebAssetManager making it unusable.

Do you agree that blindly adding the ENTIRETY of whatever is in the Document's head after an extension is rendered is wrong because it is most definitely far more than the extension itself has added, therefore it would lead to a problem making it a non-solution?

Do you agree that the only way we can reliably identify what has been added by an extension being rendered is keep a log of the actions taken against the Document and the WebAssetManager, setting checkpoints before and after rendering each extension which would allow us to cache what the extension just rendered has done to the Document and replay it when we restore the cache document?

If the answer to the latter case is affirmative, would you like me to contribute that? If so, in which version of Joomla?

If it's other than 4.1, do you agree that my interim solution in #37123 to fix the module caching is better than #35776 so at the very least we can solve the issue I reported here which is affecting plenty of users? Moreover, which branch should I work on as I don't see a 4.2-dev or 5.0-dev right now?

brianteeman commented 2 years ago

Do you agree that

I wouldnt have a clue and I wouldnt pretend to. But I followed your test instructions, validated the bug and validated that the bug went away. I cant do more than that

I don't see a 4.2-dev

I know its there because I have submitted to it image

nikosdion commented 2 years ago

@brianteeman You are not a release lead. You have no authority to speak on behalf of the Joomla project. You made it very clear. I am ignoring everything you and Fedir have to say. I have pinged the release leads, I will wait for their official and final decision.

Denitz commented 2 years ago

Let me join the party, I am really happy that my October's code finally has got some attention :)

Regarding my https://github.com/joomla/joomla-cms/pull/35776:

@nikosdion

Nope, the other PR will not work. They’ve missed the fact that the assets key is an array of arrays and that the INTERNAL arrays is where the problem happens (you can’t array_diff them because web asset items can’t be converted to string).

Do you mean $headnow['assetManager']['assets']? Yes, it's array of types, and each type is array of items. My code checks them. A web Asset has a unique name, it's enough and should be enough to distinguish between items. If you have multiple scripts with the same name but different paths, it's invalid logic. A same script asset name should always define the same script path. Please advise more details, I am not clear:

foreach ($headnow['assetManager']['assets'] as $type => $typeAssets)
{
    foreach ($typeAssets as $assetName => $assetItem)
    {
        if (!isset($options['headerbefore']['assetManager']['assets'][$type][$assetName]))
        {
            $cached['head']['assetManager']['assets'][$type][$assetName] = $assetItem;
        }
    }
}

Moreover they fudge the registry files which is unnecessary, these are already handled correctly (and is part of the problem).

Please advise more details. Without caching registryFiles, we will get "Unsatisfied dependency" error.

And an idea for module caching:

  1. Before rendering the module doomed to be cached after render, we will store current document assets (OriginalDocument) and empty assets, now we have a "clear" document without any assets (FakeDocument).
  2. Module is rendered and populates FakeDocument with own assets.
  3. Module is cached. No diff of assets, module cache will contain ALL assets added ONLY by this module.
  4. FakeDocument assets are merged into OriginalDocument .
  5. Next module etc...

As a result, we have a current document will all assets, not duplicated. And we have module caches where each contains assets only if this module.

On rendering the cached modules, we merge the cached assets into current document - no changes here.

Denitz commented 2 years ago

Please check https://github.com/joomla/joomla-cms/pull/37139 with this better approach.