Adobe-Consulting-Services / acs-aem-commons

http://adobe-consulting-services.github.io/acs-aem-commons/
Apache License 2.0

Redirects Manager performance discussion thread #2554

Open davidjgonzalez opened 3 years ago

davidjgonzalez commented 3 years ago

This thread is a temporary place to discuss the performance of the new Redirect Manager. This is a splinter thread from:

https://github.com/Adobe-Consulting-Services/acs-aem-commons/pull/2549#issuecomment-800003158

davidjgonzalez commented 3 years ago

@YegorKozlov

Thanks for the info in https://github.com/Adobe-Consulting-Services/acs-aem-commons/pull/2549#issuecomment-800003158

There's been a bit of interest in this feature for (IMO) very large redirect lists (50k+) that were previously managed by Redirect Map Manager.

I created a random-redirects file with ~50k entries just to see what happened at this scale, and the management webpage died with too many calls (sling:includes for the rows IIRC), which made everything very sluggish.

For this actual use-case of 50k redirects, they're scattered over 250 domains (microsites) running on a single AEM Sites. I think this worked well in Redirect Map Manager because that lets you make discrete management pages PER domain.

Not sure if we could evolve this to do something like that -- organize by domain or something -- which I believe is available to AEM CS via the X-Forwarded-Host request header. We'd probably want to think about it a bit more before making any changes - just food for thought.

Another idea might be to see if we could optionally hook up your Redirect Filter to Redirect Map Manager somehow for these more complex use-cases -- i think yours is probably the simpler for single-site AEM impls (and the inline search is sweet :))

test.xlsx

YegorKozlov commented 3 years ago

@davidjgonzalez In my projects I had around 1K redirects and thought that was large :)

What if we paginate redirects? This will solve the too many sling:includes issue. Inline search can be ajaxified (or work on the current page only). Managing 50K+ redirects will still be a pain, but at least the page won't break. It's a low-hanging fix.
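As a rough illustration of the pagination idea, here is a minimal sketch that slices a rule list into fixed-size pages so that only one page of rows is rendered per request. The class `RedirectPager`, its method names, and the page size of 100 are hypothetical and are not taken from ACS AEM Commons.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of ACS AEM Commons: slices a rule list into pages. */
class RedirectPager {

    /** Returns the rules on the given 1-based page, or an empty list if the page is out of range. */
    static <T> List<T> page(List<T> rules, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long from = (long) (pageNumber - 1) * pageSize;
        if (from >= rules.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, rules.size());
        // Copy the slice so callers cannot modify the backing list through the view.
        return new ArrayList<>(rules.subList((int) from, to));
    }

    /** Number of pages needed to show all rules (rounds up). */
    static int pageCount(int totalRules, int pageSize) {
        return (totalRules + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        List<String> rules = new ArrayList<>();
        for (int i = 1; i <= 50_000; i++) {
            rules.add("/content/site/page-" + i);
        }
        // With 100 rows per page, 50k rules need 500 pages instead of 50k includes on one page.
        System.out.println(pageCount(rules.size(), 100));
        System.out.println(page(rules, 500, 100).size());
    }
}
```

With this shape the number of rows rendered per request is bounded by the page size, regardless of how many redirects exist in total.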

A better idea is to manage redirects per context, caconfig-style. Users will be able to define multiple lists and the filter will select one depending on the context. I'm liking this idea as I'm typing it :)

davidjgonzalez commented 3 years ago

no one expects ~the spanish inquisition~ 50k redirects :P

Yeh, pagination of redirects could def work - I guess search/filter would have to be on the backend (don't recall if that's client-side filtering right now - I love the filtering though)

I like that CAConfig, applying them to site-trees -- thinking it through.

Binding redirects to a "domain" would be replaced by binding them to a logical Web site tree in AEM, which should? have a 1:1 with a domain/sub-domain (I think?)

I like the CAConfig idea, but I struggle with not having a consistent framework/UI for building out and managing configurations UNDER /conf. IIRC there really isn't much provided by AEM to support this, and then 90% of the feature's code ends up building out the CAConfig authoring experience. I know wcm.io has some stuff for this, but ACS Commons tries not to depend on 3rd party dependencies (one of the original project tenets).

That said, I like what you're thinking.

YegorKozlov commented 3 years ago

The PR is coming soon, I already have a working prototype.

I hate to introduce incompatible changes, but it looks like I have to: the default caconfig home is /conf/global which means I will need to move existing redirects from /conf/acs-commons to /conf/global. Users will be warned in the UI.

The UI will be split into two parts:

  1. manage contextual roots (new feature). It will be a list of known redirect configurations and a dialog to create a new one, e.g. create a /conf/my-site/settings/redirects node. I wanted to re-use the OOB Configuration Browser (http://localhost:4502/libs/granite/configurations/content/view.html/conf) but it is not extendable and marked with granite:InternalArea.
  2. edit redirects. We can leverage the existing component. The path will be explicitly passed in the url, e.g. http://localhost:4502/apps/acs-commons/content/redirect-manager.html/conf/global/settings/redirects. The user experience will not change. You will be able to edit/import/export the rules and it will apply to the conf root passed in the url.

This will allow creating different configurations for each site. For example, we-retail would have its own table of redirects, and any other site will fall back to /conf/global:

/content/we-retail
    + jcr:content
      - cq:conf = "/conf/we-retail"

/conf/we-retail
    + settings
      + redirects
        + rule-1
            - source = "/content/we-retail/page1"
            - target = "/content/we-retail/page2"
            - statusCode = 302

If a user does not care about caconfig they can put all the redirects in /conf/global/settings/redirects. This node will be created by a repo-init script on install and it's a good start.

To resolve a caconfig the code will need to go up the content tree until a resource with the cq:conf property is found. I think org.apache.sling.caconfig.resource.ConfigurationResourceResolver can do that. Once the redirect configuration is resolved, it will be cached in memory and requests will be matched against a smaller table, specific to the context.
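The walk up the content tree can be sketched in plain Java. This is only a simulation under stated assumptions: the class `RedirectConfigLocator` and its in-memory map are hypothetical stand-ins for the repository, and the sketch does not call the Sling caconfig API or implement the in-memory caching mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plain-Java simulation; a real filter would read cq:conf from Sling resources. */
class RedirectConfigLocator {
    static final String GLOBAL_CONF = "/conf/global";
    static final String REDIRECTS_SUFFIX = "/settings/redirects";

    // Stand-in for the repository: content path -> value of its jcr:content/cq:conf property.
    private final Map<String, String> cqConfByPath = new HashMap<>();

    void setCqConf(String contentPath, String confRoot) {
        cqConfByPath.put(contentPath, confRoot);
    }

    /** Walks up from the request path until a cq:conf is found, else falls back to /conf/global. */
    String resolve(String requestPath) {
        String path = requestPath;
        while (path != null && !path.isEmpty()) {
            String confRoot = cqConfByPath.get(path);
            if (confRoot != null) {
                return confRoot + REDIRECTS_SUFFIX;
            }
            int slash = path.lastIndexOf('/');
            // Stop once the parent would be the repository root.
            path = slash > 0 ? path.substring(0, slash) : null;
        }
        return GLOBAL_CONF + REDIRECTS_SUFFIX;
    }

    public static void main(String[] args) {
        RedirectConfigLocator locator = new RedirectConfigLocator();
        locator.setCqConf("/content/we-retail", "/conf/we-retail");
        System.out.println(locator.resolve("/content/we-retail/us/en"));
        System.out.println(locator.resolve("/content/other-site/page"));
    }
}
```

Pages under /content/we-retail resolve to /conf/we-retail/settings/redirects, and everything else resolves to /conf/global/settings/redirects, matching the fallback described above.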

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

tech-arch-forum commented 4 months ago

Eventually, what was the outcome? Will adopting this have any performance hit on publishers?

YegorKozlov commented 4 months ago

@tech-arch-forum Redirect Manager can handle large (10K+) collections of redirects just fine. Evaluation of redirects is fast, and in most cases it's a lookup in a memcache, which costs O(1). Authoring is also optimized to handle large collections: redirects are paginated and the tool provides search support.

davidjgonzalez commented 4 months ago

+1. The key with the runtime performance here is that the cost to process redirects scales O(N) with the number of pattern-based redirects; non-pattern redirects are O(1).
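To make the O(1) versus O(N) distinction concrete, here is a minimal sketch of a two-tier lookup: exact sources live in a hash map, while pattern rules are tried one by one. The class `RedirectTable` is hypothetical, and checking exact matches before patterns is an assumption of this sketch, not a statement about how the Redirect Manager filter orders its rules.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical two-tier redirect table: hash lookup for exact paths, linear scan for regexes. */
class RedirectTable {
    private final Map<String, String> exact = new HashMap<>();
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> patternTargets = new ArrayList<>();

    void addExact(String source, String target) {
        exact.put(source, target);
    }

    void addPattern(String regex, String target) {
        patterns.add(Pattern.compile(regex)); // compile once, at load time
        patternTargets.add(target);
    }

    /** Returns the redirect target for the path, or null if no rule applies. */
    String match(String path) {
        // O(1) on average: independent of how many exact rules exist.
        String target = exact.get(path);
        if (target != null) {
            return target;
        }
        // O(N) in the number of pattern rules: each regex is tried in turn.
        for (int i = 0; i < patterns.size(); i++) {
            Matcher m = patterns.get(i).matcher(path);
            if (m.matches()) {
                // Substitute captured groups ($1, $2, ...) into the target.
                return m.replaceFirst(patternTargets.get(i));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RedirectTable table = new RedirectTable();
        table.addExact("/content/we-retail/page1", "/content/we-retail/page2");
        table.addPattern("/content/we-retail/old/(.*)", "/content/we-retail/new/$1");
        System.out.println(table.match("/content/we-retail/page1"));
        System.out.println(table.match("/content/we-retail/old/a/b"));
    }
}
```

Under this layout, adding tens of thousands of exact redirects leaves lookup cost flat, while every added pattern rule adds one more regex evaluation to the worst case.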

The Authoring UI was reworked a few versions back to support many (tested 50k and was fine) redirects as well (via pagination).