explainers-by-googlers / reduce-accept-language

This repository hosts the explainer for reducing passive fingerprinting in the Accept-Language header.
Creative Commons Attribution 4.0 International

Clarify how this interacts with redirects? #3

Closed birtles closed 2 years ago

birtles commented 2 years ago

Hi,

I posted this same question over in Discourse but didn't receive a response so I'm submitting it here too.

Perhaps the answer is obvious but I was wondering if this mechanism is expected to apply equally to redirects?

For example, I have a site whose root page redirects based on the first matching language in the Accept-Language header.

e.g. https://mysite.org/ redirects to either https://mysite.org/en/ or https://mysite.org/ja/

This allows linking to or bookmarking a specific language-version site since the user may choose to navigate to an alternate language based on their preferences.
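A minimal sketch of the kind of redirect described above, assuming the site serves only `en` and `ja` and falls back to `en` when nothing matches. The function names, the q-value handling, and the matching on the primary subtag are illustrative assumptions, not the actual code behind mysite.org.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags of an Accept-Language header, highest q first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q parameter has weight 1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # Sort by descending weight, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_site_language(header: str, supported=("en", "ja"), default="en") -> str:
    """Pick the first requested language whose primary subtag the site serves."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default  # assumed fallback when nothing matches


def redirect_target(header: str) -> str:
    """Build the URL the root page would redirect to (hypothetical helper)."""
    return f"https://mysite.org/{pick_site_language(header)}/"
```

With this sketch, `redirect_target("ja,en;q=0.8")` gives the `/ja/` URL, while a header listing only unsupported languages falls back to `/en/`.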

Is the proposal here that the 301 redirect response itself includes the Content-Language / Vary / Variants headers?

Tanych commented 2 years ago

@birtles Hi, that's a really good question. I will get back to you with more details once I have them.

birtles commented 2 years ago

Thank you!

Tanych commented 2 years ago

@birtles Sorry for the late response. For HTTP redirect responses, the browser itself won't modify the headers or add Content-Language / Vary / Variants headers. Basically, there are two options: sites either include the Variants headers in the redirect response or they don't.

Option 1, the redirect response includes the headers: https://mysite.org/ redirects to https://mysite.org/en/

GET / HTTP/1.1
Host: mysite.org
Accept-Language: en

HTTP/1.1 301 Moved Permanently
Location: https://mysite.org/en
Content-Language: en
Vary: Accept-Language
Variants: Accept-Language=en

The browser then sends the request to mysite.org/en with the user's primary Accept-Language.

GET /en HTTP/1.1
Host: mysite.org
Accept-Language: en

HTTP/1.1 200 OK
Content-Language: en
Vary: Accept-Language
Variants: Accept-Language=en

Option 2, the redirect response omits the headers: https://mysite.org/ redirects to https://mysite.org/multi-lang/

GET / HTTP/1.1
Host: mysite.org
Accept-Language: en

HTTP/1.1 301 Moved Permanently
Location: https://mysite.org/multi-lang

The browser sends the follow-up request with the user's primary language (en), since it does not yet know which languages mysite.org/multi-lang supports.

GET /multi-lang HTTP/1.1
Host: mysite.org
Accept-Language: en

HTTP/1.1 200 OK
Content-Language: fr
Vary: Accept-Language
Variants: Accept-Language=(fr de)

The browser resends the request with another of the user's languages that the site offers (de), which costs an additional round trip.

GET /multi-lang HTTP/1.1
Host: mysite.org
Accept-Language: de
Accept-Language: de

HTTP/1.1 200 OK
Content-Language: de
Vary: Accept-Language
Variants: Accept-Language=(fr de)

For now, we prefer that sites include the Content-Language / Vary / Variants headers in the redirect response, since the added latency of an extra round trip is unacceptable for most web applications.
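The browser-side decision in the two examples above could be sketched roughly as follows. This is an illustration only, not how any browser implements it: the function names are hypothetical, the Variants value is parsed only in the simplified form written in this thread, and languages are compared by exact (case-insensitive) match.

```python
import re


def parse_variants_languages(variants: str) -> list[str]:
    """Extract the languages from a Variants value as written in this thread,
    e.g. 'Accept-Language=(fr de)' or 'Accept-Language=en'."""
    match = re.search(
        r"accept-language\s*=\s*(\([^)]*\)|[^\s,;]+)", variants, re.IGNORECASE
    )
    if not match:
        return []
    return match.group(1).strip("()").lower().split()


def language_to_retry(user_languages, content_language, variants):
    """Return the language to resend the request with, or None when the
    response is already acceptable or the site offers nothing better."""
    preferred = [tag.lower() for tag in user_languages]
    if content_language and content_language.lower() in preferred:
        return None  # the response already matches one of the user's languages
    available = parse_variants_languages(variants or "")
    for tag in preferred:  # walk the user's list in priority order
        if tag in available:
            return tag
    return None  # no overlap: keep the response as received
```

For the second example, `language_to_retry(["en", "de"], "fr", "Accept-Language=(fr de)")` yields `de`, which is the extra request that the headers on the redirect are meant to avoid.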

birtles commented 2 years ago

Hi @Tanych,

Thanks for your reply. Yes, I agree that the browser should handle Content-Language / Vary / Variants being specified on the redirect.

To clarify that I have understood your explanation properly, please let me provide my own example.

In this example, the site provides redirects to either /en or /ja based on the first matching Accept-Language, defaulting to /en if none match.

The user has configured their browser with the following preferred languages:

Initial request and response:

GET / HTTP/1.1
Host: mysite.org
Accept-Language: zh-CN

HTTP/1.1 301 Moved Permanently
Location: /en
Content-Language: en
Vary: Accept-Language
Variants: Accept-Language=(en ja)

The browser sees that the Content-Language doesn't match any of the user's accepted languages, but that the Variants header does offer one of them (ja), so it re-sends the request as:

GET / HTTP/1.1
Host: mysite.org
Accept-Language: ja

And this time it receives the response:

HTTP/1.1 301 Moved Permanently
Location: /ja
Content-Language: ja
Vary: Accept-Language
Variants: Accept-Language=(en ja)

The browser then requests the redirected resource:

GET /ja HTTP/1.1
Host: mysite.org
Accept-Language: ja

HTTP/1.1 200 OK
# No need for `Content-Language`, `Vary`, `Variants` here since this is
# a monolingual resource although the returned page will ideally
# include the following content:
#
# <link rel="alternate" href="https://mysite.org/en/" hreflang="en" />
# <link rel="alternate" href="https://mysite.org/ja/" hreflang="ja" />

Does that sound about right?
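The exchange above can be replayed as a small simulation. It is only a sketch under assumptions: `root_response` stands in for the site's redirect handler, `negotiate` stands in for the browser, and the user's language list is taken to be zh-CN followed by ja, as inferred from the requests shown.

```python
SUPPORTED = ("en", "ja")  # languages the hypothetical site serves


def root_response(accept_language: str) -> dict[str, str]:
    """Headers of the 301 response for '/' given a single-language request."""
    chosen = accept_language if accept_language in SUPPORTED else "en"
    return {
        "Location": f"/{chosen}",
        "Content-Language": chosen,
        "Vary": "Accept-Language",
        "Variants": "Accept-Language=(" + " ".join(SUPPORTED) + ")",
    }


def negotiate(user_languages: list[str]) -> list[tuple[str, str]]:
    """Return (Accept-Language sent, Location received) for each request to '/'."""
    sent = user_languages[0]  # only the primary language is sent at first
    response = root_response(sent)
    exchanges = [(sent, response["Location"])]
    if response["Content-Language"] not in user_languages:
        # Read the offered languages from the Variants value, e.g. '(en ja)'.
        offered = response["Variants"].split("=", 1)[1].strip("()").split()
        retry = next((tag for tag in user_languages if tag in offered), None)
        if retry is not None:
            response = root_response(retry)
            exchanges.append((retry, response["Location"]))
    return exchanges


if __name__ == "__main__":
    for accept_language, location in negotiate(["zh-CN", "ja"]):
        print(f"Accept-Language: {accept_language} -> Location: {location}")
```

Under these assumptions the simulation makes two requests to `/`, first with zh-CN and then with ja, before the browser follows the redirect to `/ja`.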

Tanych commented 2 years ago

Hi @birtles, yes, this sounds right to me. Sites need to change their behavior to handle HTTP redirects this way, and the browser will use the Variants header to resend the request with a new Accept-Language header so that it is redirected to the right page. I will let you know if there are additional comments regarding this.

Tanych commented 2 years ago

Marking as closed since we have confirmed the answer to the redirects question.