NCIOCPL / cgov-digital-platform

The Cancer.gov Digital Communications Platform
GNU General Public License v2.0

UTM query params stripped from URL when redirected #2800

Open welshja opened 4 years ago

welshja commented 4 years ago

Issue description

The expectation for redirects is that any query parameters are carried over with the redirect. This largely seems to be the case, but I'm noticing that UTM parameters are stripped off with a redirect.

Of the three URLs below, the first two carry their query parameter across the redirect; the last one does not.

  1. https://www.cancer.gov/contact/emergency-preparedness/coronavirus?cid=test
  2. https://www.cancer.gov/contact/emergency-preparedness/coronavirus?chicken=test
  3. https://www.cancer.gov/contact/emergency-preparedness/coronavirus?utm_source=test

Please investigate and update so that UTM codes are passed through with the redirect. The UTM parameters are utm_source, utm_medium, utm_campaign, utm_content and utm_term.

ESTIMATE

Steps to reproduce the issue

  1. Go to https://www.cancer.gov/contact/emergency-preparedness/coronavirus?utm_source=test
  2. Redirect occurs
  3. Inspect the resulting URL, which no longer has the utm parameter attached.

What's the expected result?

Redirected URL should maintain the query parameters through the redirect process.

What's the actual result?

User is redirected to https://www.cancer.gov/about-cancer/coronavirus/coronavirus-cancer-patient-information with no query parameter attached.
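
The comparison between expected and actual results can be checked mechanically. The following is a minimal sketch, not part of the platform: `dropped_params` is a hypothetical helper name, and it only compares two URL strings rather than issuing a real request.

```python
from urllib.parse import urlsplit, parse_qsl


def dropped_params(request_url: str, location: str) -> list[str]:
    """Return the query parameter names present in request_url but missing from location."""
    sent = dict(parse_qsl(urlsplit(request_url).query, keep_blank_values=True))
    kept = dict(parse_qsl(urlsplit(location).query, keep_blank_values=True))
    return [name for name in sent if name not in kept]


# The failing case from this issue: the redirect target has no query string at all.
lost = dropped_params(
    "https://www.cancer.gov/contact/emergency-preparedness/coronavirus?utm_source=test",
    "https://www.cancer.gov/about-cancer/coronavirus/coronavirus-cancer-patient-information",
)
```

An empty list would mean every parameter survived the redirect; for the case in this issue, `lost` names the parameter that was stripped.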

blairlearn commented 11 months ago

This is a "feature" of the Acquia hosting platform.

From: https://acquia.my.site.com/s/article/360022954014-Handling-redirects-while-keeping-web-analytics-UTM-query-parameters

"[The Acquia platform,] as a means to maximize performance via the Varnish caching layer, removes web analytics parameters from URLs (for example, Google Analytics' utm_source and others)."

See also https://docs.acquia.com/cloud-platform/performance/varnish/querystrings/

The suggested solution is to use the drupal/acquia_analytics_redirects module.

This issue does NOT occur on a local www.devbox build.

andyvanavery31 commented 10 months ago

Team to determine if this can be done in Akamai.

navsunka commented 9 months ago
blairlearn commented 9 months ago

This needs to be done in Akamai, not in code. Letting the utm_whatever parameter go through to code will cause each separate value to result in a new page generation, which is not what we want.
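
To illustrate why passing the parameters through is a problem, here is a rough sketch of a cache key with and without the utm_* parameters. This is a simplified model with a hypothetical `cache_key` function, not the actual key logic used by Varnish or Akamai.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def cache_key(url: str, strip_utm: bool) -> str:
    """Build a simplified cache key from the path and (optionally filtered) query string."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if strip_utm:
        # Drop analytics parameters so they do not fragment the cache.
        params = [(k, v) for k, v in params if not k.startswith("utm_")]
    query = urlencode(sorted(params))
    return parts.path + ("?" + query if query else "")


urls = [
    "/contact/emergency-preparedness/coronavirus?utm_source=" + source
    for source in ("email", "twitter", "test")
]
keys_passed_through = {cache_key(u, strip_utm=False) for u in urls}
keys_stripped = {cache_key(u, strip_utm=True) for u in urls}
```

In this model, the three campaign links produce three distinct keys (three page generations) when the parameter is passed through, and a single key when it is stripped.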

blairlearn commented 9 months ago

In general, this will need to be done with a conditional rule that looks for utm_whatever parameters (either utm_* or an enumerated list of parameters we care about.)

Property Manager Idea 1

Create a variable containing the values of all the specific parameter names we care about. e.g. utm_source=test&utm_medium=&utm_campaign=123&utm_content=456&utm_term=. Then, if the response is a redirect, rewrite the Location header, with logic to handle the cases where it does or does not contain query parameters.
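
The logic of this idea can be sketched outside of Property Manager. The following is only an illustration in Python, not Akamai configuration; `capture_utm` and `rewrite_location` are hypothetical names, and the parameter list is the enumerated one from the issue description.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

UTM_PARAMS = ("utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term")


def capture_utm(request_query: str) -> str:
    """Build the 'variable': only the enumerated UTM parameters, in request order."""
    pairs = parse_qsl(request_query, keep_blank_values=True)
    return urlencode([(k, v) for k, v in pairs if k in UTM_PARAMS])


def rewrite_location(location: str, utm_query: str) -> str:
    """Append the captured parameters, whether or not Location already has a query."""
    if not utm_query:
        return location
    parts = urlsplit(location)
    # Join with '&' when the target already has a query string; otherwise start one.
    query = parts.query + "&" + utm_query if parts.query else utm_query
    return urlunsplit(parts._replace(query=query))


captured = capture_utm("utm_source=test&chicken=1&utm_campaign=123")
rewritten = rewrite_location("/news-events/cancer-currents-blog?topic=biology", captured)
```

The two branches in `rewrite_location` correspond to the "does or does not contain query parameters" cases mentioned above.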

Cons:

Property Manager Idea 2

If the response status code indicates a redirect, use the Modify Outgoing Response Header behavior to change the Location header.

EdgeWorker

Investigate the creation of an EdgeWorker to extract the utm_* parameters from the request and append them to the Location header.

Cons:

Bonus Workaround Approach: Edge Redirector

This would be annoying, but if there's a redirection which needs analytics, it may be possible to use Edge Redirector which doesn't have this issue. (Note: Edge Redirector has a limit of 5,000 rules per policy, which limits the ability to simply move all existing redirects en masse.)

blairlearn commented 4 months ago

Idea # 5 - Capture query parameters

This is a refinement of "Property Manager Idea 2" (above).

Specifically, the redirection module will no longer be allowed to redirect to targets which have query parameters. Redirection targets with query parameters can be implemented in Edge Redirector, which has the ability to include query parameters.

This would presently affect 28 redirection rules for the Cancer Currents and temas-y-relatos blogs (e.g. /news-events/cancer-currents-blog/biology redirects to /news-events/cancer-currents-blog?topic=biology).

This leads to an "Analytics Query Params" Akamai rule along the lines of

  1. if Response Status Code is one of 301, 302, 307
  2. Use the "Set Variable" behavior to extract the value of the Location response header.
  3. Use the Modify Outgoing Response Header behavior to replace the entire value of the Location header with the entire extracted value, a ? character, and the built-in {{AK_QUERY}} variable.
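
The rule above can be modeled as a small function to check its behavior. This is a sketch in Python rather than Akamai configuration; `analytics_query_params_rule` is a hypothetical name, `ak_query` stands in for {{AK_QUERY}}, and skipping the rewrite when the request has no query string is an added assumption (the steps above do not mention it) to avoid a trailing ? character.

```python
REDIRECT_STATUSES = {301, 302, 307}


def analytics_query_params_rule(status: int, location: str, ak_query: str) -> str:
    """Return the Location value the rule would send for the given response."""
    # Step 1: only act on redirect responses.
    if status not in REDIRECT_STATUSES:
        return location
    # Assumption: leave Location alone when the request carried no query string.
    if not ak_query:
        return location
    # Steps 2 and 3: extracted Location value, a '?', then the full request query.
    # This relies on redirect targets never containing their own query parameters.
    return location + "?" + ak_query


new_location = analytics_query_params_rule(
    301,
    "https://www.cancer.gov/about-cancer/coronavirus/coronavirus-cancer-patient-information",
    "utm_source=test",
)
```

Because the whole request query is appended, this version also carries non-UTM parameters such as cid, which matches the existing expectation that all query parameters survive a redirect.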