dotCMS / core

Headless/Hybrid Content Management System for Enterprises
http://dotcms.com

Add list of url parameters to ignore when building page cache #23928

Open wezell opened 1 year ago

wezell commented 1 year ago

User Story

Currently, our page cache takes all URL parameters into account when building the page cache key. As a result, common SEO and tracking parameters, which get passed all the time, produce a new cache entry for every distinct value and force churn in the page cache. These parameters should be ignored when building the key. Examples of such params are:

utm_medium
utm_source
utm_campaign
utm_content
utm_term
gclid
fbclid
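A minimal sketch of the idea, not dotCMS's actual key-building code: the class name, the `cacheKey` method, and the hard-coded ignore list are all hypothetical. It drops the parameters listed above before building the key and, as an extra assumption, sorts the remaining parameters so that their order does not produce distinct keys. Parameter names are compared exactly and are not URL-decoded.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

public class PageCacheKeySketch {

    // Hypothetical ignore list, taken from the examples in this issue.
    static final Set<String> IGNORED_PARAMS = Set.of(
            "utm_medium", "utm_source", "utm_campaign", "utm_content",
            "utm_term", "gclid", "fbclid");

    // Builds a key from the path plus every query parameter that is not ignored.
    static String cacheKey(String path, String rawQuery) {
        if (rawQuery == null || rawQuery.isEmpty()) {
            return path;
        }
        String kept = Arrays.stream(rawQuery.split("&"))
                .filter(pair -> !pair.isEmpty())
                // The parameter name is everything before the first '='.
                .filter(pair -> !IGNORED_PARAMS.contains(pair.split("=", 2)[0]))
                .sorted()
                .collect(Collectors.joining("&"));
        return kept.isEmpty() ? path : path + "?" + kept;
    }

    public static void main(String[] args) {
        System.out.println(cacheKey("/news", "utm_source=mail&gclid=abc"));    // /news
        System.out.println(cacheKey("/news", "testingABC=345435&utm_term=x")); // /news?testingABC=345435
    }
}
```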

Acceptance Criteria

  1. Create a new page and set its page cache TTL high, e.g. one hour (3600 seconds).
  2. Visit the page in an anonymous browser to load the cache.
  3. From the back end, edit the page.
  4. Visit the page on the front end. You should see the page as it was before your latest edit. Try adding the parameters above. You should still see the original version of the page.
  5. Add another random parameter, e.g. ?testingABC=345435. You should get the new/edited version of the page.
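The expected behaviour in the steps above can be simulated with a toy in-memory cache. This is only an illustration under stated assumptions: `visit`, `key`, `liveContent`, and the map-based cache are hypothetical stand-ins, and TTL expiry is not modelled because the acceptance test assumes the entry is still fresh.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PageCacheAcceptanceSketch {

    // Hypothetical ignore list, taken from the examples in this issue.
    static final Set<String> IGNORED = Set.of(
            "utm_medium", "utm_source", "utm_campaign", "utm_content",
            "utm_term", "gclid", "fbclid");

    static final Map<String, String> cache = new HashMap<>();
    static String liveContent = "v1"; // what a fresh render would return

    // Cache key: the path plus the query parameters that are not ignored.
    static String key(String path, String rawQuery) {
        StringBuilder kept = new StringBuilder();
        if (rawQuery != null) {
            for (String pair : rawQuery.split("&")) {
                String name = pair.split("=", 2)[0];
                if (!pair.isEmpty() && !IGNORED.contains(name)) {
                    kept.append(kept.length() == 0 ? "?" : "&").append(pair);
                }
            }
        }
        return path + kept;
    }

    // Serves the cached copy when the key is present, otherwise renders and stores.
    static String visit(String path, String rawQuery) {
        return cache.computeIfAbsent(key(path, rawQuery), k -> liveContent);
    }

    public static void main(String[] args) {
        visit("/page", null);                                            // step 2: load the cache
        liveContent = "v2";                                              // step 3: edit the page
        System.out.println(visit("/page", null));                        // step 4: v1
        System.out.println(visit("/page", "utm_source=mail&fbclid=1"));  // step 4: v1
        System.out.println(visit("/page", "testingABC=345435"));         // step 5: v2
    }
}
```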

Proposed Objective

Application Performance

Proposed Priority

Priority 2 - Important

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

No response

Assumptions & Initiation Needs

No response

Sub-Tasks & Estimates

No response

yolabingo commented 10 months ago

This issue is significantly amplified by the fact that many mail clients and mail servers pre-fetch the URLs in email messages at the time the message is received. Since these URL parameters are used in marketing emails, and email marketing campaigns send out many messages at once, this results in a flood of cache-busting Page requests from many unique IPs received in a short time span.

Cloud support incidents: https://dotcms.zendesk.com/agent/tickets/114223 https://dotcms.zendesk.com/agent/tickets/114621

yolabingo commented 8 months ago

Another incident, March 2024: https://dotcms.zendesk.com/agent/tickets/115386

wezell commented 8 months ago

I guess it would be important to make this configurable on a site-by-site basis.

wezell commented 8 months ago

It would also be good to be able to ignore parameters by pattern. Patterns like utm_* and *clid would take care of a lot.
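One possible way to support such patterns, sketched under assumptions: treat `*` as "any run of characters, including none" and translate each pattern into an anchored regular expression. The class and method names are hypothetical, and where the pattern list would come from (for example a per-site setting) is left open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class IgnoredParamPatternsSketch {

    // Converts a simple glob into a regex; every literal piece is quoted,
    // so only '*' has special meaning.
    static Pattern globToPattern(String glob) {
        String regex = Arrays.stream(glob.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        return Pattern.compile(regex);
    }

    // matches() requires the whole parameter name to match the pattern.
    static boolean isIgnored(String paramName, List<Pattern> patterns) {
        return patterns.stream().anyMatch(p -> p.matcher(paramName).matches());
    }

    public static void main(String[] args) {
        List<Pattern> patterns = List.of(globToPattern("utm_*"), globToPattern("*clid"));
        System.out.println(isIgnored("utm_source", patterns)); // true
        System.out.println(isIgnored("fbclid", patterns));     // true
        System.out.println(isIgnored("testingABC", patterns)); // false
    }
}
```

Matching here is case-sensitive, which is a choice made for this sketch rather than anything stated in the issue.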