wieni / wmcontroller

Adds support for bundle-specific controllers for Drupal 8 entities.
MIT License
3 stars 3 forks source link

Cache: wm_page menu-active-trail woes #22

Closed frizinak closed 7 years ago

frizinak commented 7 years ago

menu item links as stored as fully resolved paths (i.e node/5) and have no wildcard capabilities.

so both node/5 and node/5?page=3 will have their active trail set correctly.

node/5/1 and node/5/2 are different paths => different menu items => different active trails.

the backend menu-item form is terrible and allows saving abc/5 even if the route entry path = abc/{nid}/{id}/{page} as all but the former param have defaults. While storing this path along with its params it'll proceed to completely ignore those with a default value.

since we moved ?page= into the route params, editors would need to enter paged-route/[id]/0 and lose the active trail on page 2.

I prefer a core patch (yet have no idea how)
@RobinHoutevelts feels hardcoding the path, without altering the routes in wmcontroller_preprocess_pager is the way to go.

Options/Opinions ?

RobinHoutevelts commented 7 years ago

@RobinHoutevelts feels hardcoding the path, without altering the routes in wmcontroller_preprocess_pager is the way to go.

So what I suggest is we let drupal create a paged url for example /taxonomy_term/5?page=1 and simply do str_replace('?page=', '/', $url) in order to get /taxonomy_term/5/1.

A client then visits /taxonomy_term/5/1, which results in a 404 not found. To combat this we need to have an early Middleware that takes this Request, does $request->query->set('page', 1); and changes the path into /taxonomy_term/5 before passing it to the next Middleware.

frizinak commented 7 years ago

it wont trigger a 404, so a KernelEvents::CONTROLLER would suffice.

(I lied that's d7)

Imo the issue is not of how tiny the change is but how 'dirty' it feels, as we're kinda trying to bypass something that is very lacking in core.

spoit commented 7 years ago
frizinak commented 7 years ago

Unrelated: can we change this to /term/14/page-2 instead of just the number; I think it's better for readability and easier to pick up in code.. no?

we can, makes no difference code-wise

Unrelated 2: I still am not following completely why we can't let the CDN cache selected query strings as separate paths. MaxCDN seems to have options for this (EdgeRules / Cache busting) and I have an unpleasant feeling that the query string tiger will come back to bit us.

frontend ?page is the only query param I know of in d7, edge rules cost a ton of moneyz.

Robin's choice looks good to me since it seems to contain our stuff a bit (eg without patching core)

there is an actual 3 yo issue that is yet to be assigned on the drupal issue tracker related to this issue about how bad the current implementation is in core, so we shouldn't be shy of patching. The only thing that would kind of argue against this is that composer-patches doesn't work (or at least out of the box) for libs, so we'd need every site to include the patch in the project composer.json (but since we already do this in wmmodel...)

Oh and the fact that I have no clue how to fix it. (add a route attribute that specifies 'ignorable params' patch core so those are ignored in the entire menu system... cause bug in something seemingly unrelated, but have a more robust and clean fix? Iunno)

But since it's an 'urgent' bug that needs fixing I'm fine with robins solution, just not fully behind it ;)

I'd even argue to rewrite menus (with breadcrumbs etc) but that'll get shot down in no time ;) (core ,menu's are just terrible, especially if you have multiple languages enabled..., but yeh unrelated)

We have to keep it in mind that backend keep extended use of query strings, so we can't break things too hard there (especially for backend pagination eg views, content overviews, ...)

wmcontroller doesn't touch the backend unless you add an admin route to the pager_routes setting, if you then have issues, you deserve em

Clients should still be able to enter /term/14 in menu backend, since the pageis totally optional

yeh if entering /term/14/0 was the solution (which it isn't as term/14/1 would still be broken) we could alter this during form_submit.

spoit commented 7 years ago

Right, I forgot -again- it's only about wmcontroller-controlled paths.

I don't care about patching core and putting it manually into the root composer.json if it's the better solution. Any hope we might submit that patch as a bug fix in core, then? :)

frizinak commented 7 years ago

Proposal:

in public/core/lib/Drupal/Core/Menu/MenuTreeStorage.php:663 (loadByRoute) and :372. (preSave)

if an entry in $route_parameters has a default value and the current value matches that, trim it

I can't see any immediate side effects, but then again I never foresaw this issue either when writing this cache impl. ;)

frizinak commented 7 years ago

well that idea was obviously flawed (as @RobinHoutevelts pointed out) since we need the params for node/5/2 to also be node=5

which would mean stripping default params regardless of their value, so a route like /entity/{entity_type}/{id} wont have a correct active trail if entity_type has a default value. sigh

spoit commented 7 years ago

(Again nagging, sorry). Maybe we should keep the query params as is

If the site you're building is going to be behind a CDN, don't let wmcontroller try to 'fix' this but fix it yourself in controller and other places.

I (still) would much prefer to be able to use normal query params, caching be damned

frizinak commented 7 years ago

nah don't agree, you need wmcontroller even if the site is behind a cdn, who's gonna do the purging?

dako and one of our upcoming projects are gonna be behind a cdn, are we gonna fix this twice or create wm_fix_core_menu_and_pager_param.module while it clearly belongs in wmcontroller?

@RobinHoutevelts you feel like implementing your fix? cause I don't see a way forward in core, btw: you'd also be fixing paged aliases since we wont be modifying routes anymore.

ps: don't str_repl like you suggested but parse the url, the overhead is tiny. too bad outbound alter doesn't allow us to strip query params cause we could make this configurable (url params = <original-path>/{page}/{filter-country} could imply rewriting <original-path>?page=1&filter-country=be => <original-path>/1/be)

frizinak commented 7 years ago

I (still) would much prefer to be able to use normal query params, caching be damned

when would you need this and when wouldn't route params suffice? A complex filter page

?filter1=be&filter2=200-300&filter3=term-4 pages like those shouldn't have max age, the amount of possible combinations is too high, new items will need to appear fast, so either purge a generic tag that purges the thousands of combinations or have a low maxage, either way you'll be killing your site if that page isn't fast in and of itself. so no max-age, 1 filter query, cached rendering of individual teasers is the way to go (like canvas.be/videos when that was still a thing)

ghost commented 7 years ago

First: I'm stupid and I am just now walking into the room...

Could we just use url parameters like normal not try to duct tape ?page=3 to /3/ as wm_page. Then cache url parameters like normal/varnish/drupal/etc... Then put a fallback safety or whatever to ensure that yes no bot can blow up our cache table.

Ex,

"IP address can't create more X cache entries in Y minutes" "No more than Z cache entries period" Other nice secuirty measure?

Why?

Any site any controller/page that has a form/filterer/etc needs to put vars on the url and if they too in order to use cache have to use the /x/y/z/o scheme;

1) The menu active trail has to be considered every where A-Z for every scenario / page under the root.

2) Breadcrumbs also become a problem and have to be customized.

3) Taxonomy Terms in the menu start to have issues.

ghost commented 7 years ago

Another reason why we can't encode filter params into url segments is because that various pages and controllers are working on top of each other. Client wants a specific url structure, etc...

So:

/books-and-authors    -> booksAndAuthorsController (filters)
/books-and-authors/genre_name  -> genresController

As example... if the top route has to put "science-fiction" as a filter. you could then be jumping to another route.

ex,

/books-and-authors/science-fiction   -> ??????
vs.
/books-and-authors/science-fiction   -> ??????

Who is the controller now? Whereas,

/books-and-authors?filter=science-fiction  -> booksAndAuthorsController
/books-and-authors/science-fiction -> genresController

needs to/should be possible.

frizinak commented 7 years ago

Then put a fallback safety or whatever to ensure that yes no bot can blow up our cache table.

how if not with url params, are you gonna add the query-string validation. Or should we just use drupals builtin param requirements?

the filter argument is invalid imo, see my previous comment, complex query string pages have no business sitting in our cache table.

the active trail issue, which is the topic of this issue is because the menu system is terribly inflexible (look at the @todo comments in MenuTreeStorage) and NOT a side effect of altering routes.

not sure I understand your second comment. Where does a path-param define the conroller that is being used?

/books-and-author/{genre} doesn't that solve it?

ghost commented 7 years ago

So the list of translators that can translate in to "Catalan" should not be cached?

https://flandersliterature.wieni.work/translators/18/0

or

Trips that go to Europe

https://www.joker.be/nl/search?continent=EU&country=

As a quick measure I would suggest maybe never cache any page that has query params. Because that is actually the problem.

The base url:

https://flandersliterature.wieni.work/translators

is being cache and if we then do

https://flandersliterature.wieni.work/translators?page=4
or
https://flandersliterature.wieni.work/translators?language=18

We still are seeing the page 0. So for today maybe say "hey got url params?" NO CACHE! At least then routes, menu, breadcrumbs, etc... comes back in order? We get cache hits for the majority of things and url parameters can behave regular behavior again.

Please feel free to smack me if I am completely wrong here.

frizinak commented 7 years ago

I'm not seeing the point of this, we fixed the page 0 issue with wm_page, agreed? this works like it's supposed to right? caching per page...

We then stumbled on the lacking drupal core menu active trail stuff.

Again modifying a route by adding parameters shouldn't break core like it does, saying we should drop functionality, along with reducing cdn cost is not an option imo.

ps: translators/18/0 or translators/europe/belgium is a page that should get cached, using query params makes no sense, but this is entirely not the scope of this issue.

ghost commented 7 years ago

The problem is not that whether or not the url is good looking.

The problem is that:

http://blahblahblah.com/something

is cached as it should be

http://blahblahblah.com/something?holysmokes=boobs

not only is not being cache (debatable fair enough, i could not care less actually) but the real problem is is that wmcontroller is stripping off ?holysmokes=boobs and then reverting back to http://blahblahblah.com/something and sending that back. That is the thing that is upsetting people page=2 is the same as page=4 same as page=0, filter=XYZ shows the same as no filter=XYZ

RobinHoutevelts commented 7 years ago

I'm not seeing the point of this, we fixed the page 0 issue with wm_page, agreed? this works like it's supposed to right? caching per page...

Agreed, let me try and implement what I said before and see if this fixes both issues. ( paging and menu link trail )

The real problem is is that wmcontroller is stripping off ?holysmokes=boobs and then reverting back to http://blahblahblah.com/something and sending that back.

Wmcontroller is doing none of that.

ghost commented 7 years ago

then why is

http://example.whatever/books

showing the same as

http://example.whatever/books?mystupidfilter=cats

What is disregarding ?mystupidfilter=cats and returning the cache based version of the url?

frizinak commented 7 years ago

don't edit your comments plz :D our responses now make less sense.

yes you use wmcontroller.cache, you lose query params, as stated in the readme. This is not the issue at hand, it's only ?page= rewriting that triggers a 'bug' MenuTreeStorage

you have a genuine need for query params? max-age=0, performance should be optimized on a different level (dynamic page cache, redis, whatever).

I don't care about prettyness of the url either but that is not the point, we're ignoring params to avoid cache explosions since param validation is easier (and already there), and more importantly, make it compatible with cdns without having to pay for cache explosions (along with the edge rules feature cost)

"IP address can't create more X cache entries in Y minutes"

will do f-all for complex filter pages

"No more than Z cache entries period"

which keys are you gonna drop? slowest pages first? scraping the site will create the same load as a site without any cache whatsoever.

That's a whole bunch of compromises to just fix menu trail no?

ghost commented 7 years ago

monday at 1 pm flanders goes online. please have a working solution. If you need my help I am on the whatsapp and jump in sunday.

frizinak commented 7 years ago

Disable .store and .tags and add cache tags manually

ghost commented 7 years ago

I think i have a solution if it works I'll share.

ghost commented 7 years ago

just to be sure the line: $route->addRequirements([self::ROUTE_PARAM => '\d+']);

could it be that this line is the one that force the wm_page into the menuTreeStorage?

ghost commented 7 years ago

ok this makes no sense, on my local at home I rebuilt the flanders site from a raw db. The alter routes of wmcontroller finally works correct and instead of getting ?wm_page=x I get /1 /2 /3. Even more impressive is that other optional parameters ALSO get considered so:

/books-and-authors/include/14/2
/books-and-authors/include/14/3
/books-and-authors/include/14/4
...

Because this works correctly things make sense ^now^.

To fix the menu active ^NOW^ it becomes SUPER EASY.

        $menu = 'main';
        // Get the current route name.
        $current_route = $this->currentRouteMatch->getRouteName();
        // Get the defaults.
        $default_parameters = $this->currentRouteMatch->getRouteObject()->getDefaults();
        // Remove keys that start with '_'.
        // _title _controller
        foreach ($default_parameters as $k => $v) {
            if (strpos($k, "_") === 0) {
                unset($default_parameters[$k]);
            }
        }
        $links = $this->linkManager->loadLinksByRoute($current_route, $default_parameters, $menu);
        $link = reset($links);
        $parentLinkId = $link ? $link->getPluginId() : '';

        // Set the active trail to the default scenario of this route.
        // Which in theory should be the base no parameters version.
        // That the non technical user probably put into the menu.

        $parameters->setActiveTrail([$parentLinkId => $parentLinkId, '' => '']);
        $parameters->addExpandedParents([$parentLinkId => $parentLinkId]);
frizinak commented 7 years ago

instead of getting ?wm_page=x I get /1 /2 /3.

not sure I understand this correctly but your pager was linking to ?wm_page=1 ? The route alter didn't kick in then.

To fix the menu active ^NOW^ it becomes SUPER EASY.

Well we'll have to see if there is a hook where we can alter Url::buildQuery (if I remember correctly). an unset of just wm_page would suffice. (see this comment)

ghost commented 7 years ago

Yes the alter wasn't kicking in. This was the confusing part and the source of all frustration.

Now for even more confusing...

When I bring in the DB of staging. The alter route goes away. If I import with the mysql CLI the import works, when I import with Sequel Pro I get a UTF-8 Error. So ...

When I delete the db, re-create it as normal UTF-8 and not UTF-8 mb4.... Then iconv the .sql file to just UTF-8 and then import file, the alter route works! There is something in the db. I believe that Robbin also had the issue when he brought the project over to his machine.

So after all this I will admit I still have some doubts but I am onboard the "No more query params" train.

frizinak commented 7 years ago

ok so your issues are pretty unrelated, the utf8 error is a sequelpro thing, just import using mysql cli or set import encoding to whatever mysqldump defaults to.

you ran drush cr (just making sure)? can't imagine this failing depending on the database you're importing.

Ontopic: your manual active trail fix might work, so i'll see if I can find an alter.

frizinak commented 7 years ago

Still relevant, but closing as we got off track quite a bit.

Create a new issue if we encounter this again.