deltachat / deltachat-pages

Delta Chat Website
https://delta.chat/
GNU General Public License v3.0

language redirect errors #618

Closed r10s closed 1 year ago

r10s commented 1 year ago

i think there is somehow one en too many in the redirects. in the past we had recursive redirects such as en/en/en/en/en; that has since been stopped with a hack, but https://delta.chat/se/ redirecting to https://delta.chat/en/en/se/ is still wrong.

the logic seems to live at https://github.com/deltachat/sysadmin/blob/beb0307a873529e9e739f951ce8df672f1a088fe/delta.chat/delta.chat#L81 - on testing, one should be aware of browser caches

iirc, there were other issues filed before but i cannot find them somehow

ralphtheninja commented 1 year ago

Ah yes, I forgot to create the issue I promised y'all. We were trying to install Delta Chat on iOS on two different devices. Both had problems getting to the website because nginx returned 403 errors. This is really a shame, since getting to a webpage shouldn't be hindered by the server itself ;)

Anyway. My suggestion would be to remove all that weird redirect complexity (or fix it) and have all requests land on https://delta.chat/en by default and instead add a language picker to the page so the person can pick their language of choice themselves. The server can assume less and just return what's asked for, like a server should and we can go about our day.

gerryfrancis commented 1 year ago

iirc, there were other issues filed before but i cannot find them somehow

This one is still open: https://github.com/deltachat/deltachat-pages/issues/541

Also maybe check that RewriteCond and RewriteRule (if they exist) are in the right order in the .htaccess file. (It is just a guess.)

r10s commented 1 year ago

Also maybe check that RewriteCond and RewriteRule (if they exist) are in the right order in the .htaccess file. (It is just a guess.)

i think this applies mostly to apache, whereas delta.chat is using nginx

the logic is linked above, and i think, that should be simplified. quite some of the lines are hacks to stop endless redirects - if we fix the underlying issue, that can go away. the concatenation of en..request_uri looks suspicious to me, as the old language path is not stripped. i think, the whole language-logic should operate with the last path element only, and/or only when there is no path set at all (so, only when delta.chat is opened)
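A minimal sketch of the "strip the old language path before prepending" idea, written in Python rather than the nginx Lua config so it can be run in isolation. The `SUPPORTED` set and the function name `with_language_prefix` are assumptions for illustration and are not taken from the actual config; query strings are not handled.

```python
# Hypothetical subset of language directories; the real list lives in the nginx config.
SUPPORTED = {"en", "de", "zh_CN"}


def with_language_prefix(request_uri: str, lang: str) -> str:
    """Return request_uri under /<lang>/, dropping language segments already present."""
    trailing_slash = request_uri.endswith("/")
    segments = [s for s in request_uri.split("/") if s]
    # Strip leading language segments so they cannot pile up as en/en/en/...
    while segments and segments[0] in SUPPORTED:
        segments.pop(0)
    path = "/".join([lang, *segments])
    # Keep the caller's trailing slash; a bare language root always ends in "/".
    return "/" + path + ("/" if trailing_slash or not segments else "")


print(with_language_prefix("/en/en/blog", "en"))  # /en/blog
print(with_language_prefix("/blog/", "de"))       # /de/blog/
```

Applying the function to its own output returns the same path, which is the property the endless-redirect hacks try to enforce after the fact.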

however, this needs to be tried out by someone with access to a server, it does not make too much sense to judge without having access to the code and being able to try things out :)

gerryfrancis commented 1 year ago

i think, the whole language-logic should operate with the last path element only, and/or only when there is no path set at all (so, only when delta.chat is opened)

@r10s The last path element can also be the name of a site, e.g. https://delta.chat/de/blog .

missytake commented 1 year ago

Technically, delta.chat/se/ returning a 404 error is correct, as we don't support this language afaict? Yes, the /en/en in the URI looks weird, but who pays attention to that?

One problem with the lua copy-paste code which handles the language redirects is that it is quite unclear which inputs it needs to handle. The content of the Accept-Language header can be a lot of things. And changing it can make things fail which worked in the past.
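To make "a lot of things" concrete, here is a hedged Python sketch of how an Accept-Language value could be turned into an ordered preference list. It is not the parsing done by the Lua code; `parse_accept_language` is a hypothetical helper, and it only understands the common `tag;q=weight` form.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return language tags from an Accept-Language value, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.partition(";")
        tag = tag.strip()
        if not tag:
            continue  # tolerate empty items such as a trailing comma
        quality = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # malformed weight: skip the item
        if quality <= 0:
            continue  # q=0 marks a language as not acceptable
        # Sort by descending weight, then by original position.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("sv-SE,sv;q=0.8,en-US;q=0.5,en;q=0.3"))
# ['sv-SE', 'sv', 'en-US', 'en']
```

Inputs such as an empty header, a lone `*`, or a tag followed by a stray semicolon all pass through without raising, which is the kind of case a redirect rule has to survive.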

I think we maybe need some test script which defines which headers are supposed to produce which result, then identify which specific case does not work, and then we can start fixing the bug. Until we have something like this I'm a bit scared of breaking it further - right now, at least it works for most people afaict.

I do agree it's bad and needs attention.

ralphtheninja commented 1 year ago

Until we have something like this I'm a bit scared of breaking it further - right now, at least it works for most people afaict.

Most people? Pretty much everyone in Sweden and Finland have the same problem if they are on iOS. What are you afraid of breaking? The bug? :smile:

ralphtheninja commented 1 year ago

Step one. Set up an additional site on another domain, then we can screw around and test things without breaking the current site.

gerryfrancis commented 1 year ago

In my browser (Firefox), German is the default language.

I enter https://delta.chat/blog in the address bar, and I am redirected to https://delta.chat/de/blog , which is correct. I enter https://delta.chat/blog/ in the address bar, and I get a redirection error ( https://delta.chat/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/blog/ ).
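The two observations can be modelled offline. The Python sketch below follows a redirect rule until it settles and raises when it does not, which is roughly what the browser reports as a redirection error. `naive_rule` is a hypothetical stand-in that merely reproduces the reported symptoms; it is not the site's actual config, and `MAX_HOPS` is an assumed limit.

```python
from typing import Callable, Optional

MAX_HOPS = 20  # assumed limit; browsers give up after a bounded number of redirects


def follow(rule: Callable[[str], Optional[str]], path: str) -> list[str]:
    """Apply a redirect rule repeatedly and return the chain of visited paths.

    Raises RuntimeError if the chain does not settle within MAX_HOPS.
    """
    chain = [path]
    for _ in range(MAX_HOPS):
        target = rule(chain[-1])
        if target is None:  # no further redirect: the chain has settled
            return chain
        chain.append(target)
    raise RuntimeError(f"redirect chain did not settle: {chain[-1]}")


def naive_rule(path: str) -> Optional[str]:
    """Hypothetical rule mimicking the report for a German-language browser."""
    if path.endswith("/"):
        return "/de" + path  # prefix added unconditionally: never settles
    if not path.startswith("/de/"):
        return "/de" + path  # prefix added once, then the path is left alone
    return None


print(follow(naive_rule, "/blog"))  # ['/blog', '/de/blog']
```

Calling `follow(naive_rule, "/blog/")` raises, and the same helper can be pointed at any candidate rule to check that it settles.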

missytake commented 1 year ago

Until we have something like this I'm a bit scared of breaking it further - right now, at least it works for most people afaict.

Most people? Pretty much everyone in Sweden and Finland have the same problem if they are on iOS. What are you afraid of breaking? The bug? :smile:

If you have Swedish as browser language, you should see English. If you have Finnish as browser language, it should redirect you to https://delta.chat/sq/ (that's suomi, no?). If you have Swedish as first language and Finnish as second, but no English, I think it should still show you suomi...

Can you show me specific language settings / an Accept-Language header that can't access https://delta.chat right now? Or where something else unexpected happens?

missytake commented 1 year ago

In my browser (Firefox), German is the default language.

I enter https://delta.chat/blog in the address bar, and I am redirected to https://delta.chat/de/blog , which is correct. I enter https://delta.chat/blog/ in the address bar, and I get a redirection error ( https://delta.chat/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/blog/ ).

Yes, that's why we don't link to /blog/ anywhere, only to /blog... (and I think we even only link to blog because of this behavior, so if you are already in a specific language you stay within it)

r10s commented 1 year ago

https://delta.chat/sq/ (that's suomi, no?)

no, sq is Albanian (Shqip)

missytake commented 1 year ago

Step one. Set up an additional site on another domain, then we can screw around and test things without breaking the current site.

It's fine to try out code quickly in production - but it's necessary to define some examples of what's supposed to work, so we can try out all of these cases after a change. I will start to define some cases and collect them here.
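One way such a list of cases could look, sketched in Python. The cases are illustrative only, `run_cases` is a hypothetical helper, and `stub_resolve` is a stand-in so the harness runs offline; a real check would replace it with an HTTP request against the server.

```python
# Each case: (Accept-Language header, path expected in the redirect target).
# Illustrative only; the authoritative list belongs in the project's test script.
CASES = [
    ("de", "/de/"),
    ("en", "/en/"),
    ("se", "/en/"),  # unsupported language is expected to fall back to English
    ("zh-CN", "/zh_CN/"),
]


def run_cases(resolve, cases=CASES):
    """Return human-readable failures; an empty list means every case passed."""
    failures = []
    for header, expected in cases:
        actual = resolve(header)
        if actual != expected:
            failures.append(
                f"Accept-Language: {header!r}: got {actual}, expected {expected}"
            )
    return failures


def stub_resolve(header: str) -> str:
    """Stand-in for a real HTTP probe so the harness can be run offline."""
    table = {"de": "/de/", "en": "/en/", "zh-CN": "/zh_CN/"}
    return table.get(header, "/en/")


for failure in run_cases(stub_resolve):
    print("ERROR:", failure)
```

Keeping the cases in a plain list means a new report can be added as one line before anyone touches the redirect logic.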

ralphtheninja commented 1 year ago

Can you show me specific language settings / an Accept-Language header that can't access https://delta.chat/ right now? Or where something else unexpected happens?

Unfortunately I can't. I no longer have access to the devices I found the errors on. Three different devices, two running iOS and one iPad. All of them swedish afaik.

You could check the access logs in nginx as well. There should be plenty of 403s there which can give you some more information.
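A hedged Python sketch of that check. It assumes log lines shaped like `[time] "request" status bytes "referer" "user agent"`, the shape of the excerpt posted later in this thread; the regular expression and the name `count_by_agent` are illustrative, not part of any nginx tooling.

```python
import re
from collections import Counter

# Matches lines shaped like: [time] "request" status bytes "referer" "user agent"
LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_by_agent(lines, status="403"):
    """Count log lines with the given status code, grouped by user agent."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and match.group("status") == status:
            counts[match.group("agent")] += 1
    return counts
```

A typical use would be `count_by_agent(open(path))` followed by `.most_common(10)`, where `path` is wherever the server keeps its access log. The user agent alone does not reveal the Accept-Language value, so this narrows down clients rather than headers.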

missytake commented 1 year ago

Not super verbose unfortunately:

[20/Feb/2023:00:10:27 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
[20/Feb/2023:00:46:21 +0000] "GET / HTTP/1.1" 403 143 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
[20/Feb/2023:06:09:26 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"
[20/Feb/2023:07:46:01 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Linux; U; Android 7.1.1; zh-cn; MX6 Build/NMF26O) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/8.7 Mobile Safari/537.36"
[20/Feb/2023:07:50:00 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:07:50:46 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:07:50:52 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:07:53:26 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:08:15:15 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"
[20/Feb/2023:08:43:11 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:08:45:49 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:08:47:00 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Linux; Android 6.0.1; SM-A8000 Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.49 Mobile MQQBrowser/6.2 TBS/043305 Safari/537.36 MicroMessenger/6.5.8.1060 NetType/4G Language/zh_CN"
[20/Feb/2023:08:47:11 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:48:45 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:49:48 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Linux; Android 8.0; SM-G9500 Build/R16NW; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/6.2 TBS/044208 Mobile Safari/537.36 MicroMessenger/6.7.2.1340(0x2607023A) NetType/WIFI Language/zh_CN"
[20/Feb/2023:08:54:29 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Linux; U; Android 8.0.0; zh-cn; MIX Build/OPR1.170623.032) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/61.0.3163.128 Mobile Safari/537.36 XiaoMi/MiuiBrowser/9.5.14"
[20/Feb/2023:08:54:43 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:54:51 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:57:11 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:57:25 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:08:57:27 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:08:58:13 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:09:02:32 +0000] "GET / HTTP/1.1" 403 199 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.27 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/601.1.27"
[20/Feb/2023:09:03:10 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"
[20/Feb/2023:09:06:36 +0000] "GET / HTTP/1.1" 403 143 "-" "Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)"

missytake commented 1 year ago

I can reproduce the error - just having an Accept-Language: se; header gives you a 403 error. Wild.
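For anyone who wants to repeat this without changing browser settings, here is a minimal Python sketch using only the standard library. The helper names `build_probe` and `probe` are made up for illustration; redirects are deliberately not followed so that the first status code and `Location` header stay visible.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError for the 3xx


def build_probe(url: str, accept_language: str) -> urllib.request.Request:
    """Build a GET request that carries the given Accept-Language value."""
    return urllib.request.Request(url, headers={"Accept-Language": accept_language})


def probe(url: str, accept_language: str):
    """Return (status code, Location header or None) of the first response."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_probe(url, accept_language), timeout=10) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as error:
        # Unfollowed 3xx responses as well as 4xx and 5xx arrive here.
        return error.code, error.headers.get("Location")
```

Calling `probe("https://delta.chat/", "se")` performs the same request as described in this comment; what it returns depends on the server's current state.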

ralphtheninja commented 1 year ago

Sweet! Note that you can also define your custom log format in nginx. Could be useful later. Need to restart though.

missytake commented 1 year ago

The test script lives here: https://github.com/deltachat/sysadmin/blob/master/delta.chat/test-redirects.py

I already tried out https://stackoverflow.com/a/25137080, but that just seems to redirect everything to /en/.

r10s commented 1 year ago

may it be that at the end of the config at https://github.com/deltachat/sysadmin/blob/beb0307a873529e9e739f951ce8df672f1a088fe/delta.chat/delta.chat#L115 sth. as the following is missing:

ngx.redirect("https://delta.chat/en/"..ngx.var.request_uri, 301)

i mean the for lang in (ngx.var.http_accept_language .. ","):gmatch("([^,]*),") do does not have a match for eg. se - and then, there is just nothing more done and @lang is the last try in location.
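The fall-through can be illustrated with a small Python model of that loop. This is a sketch under assumptions, not a port of the config: `SUPPORTED` is a hypothetical subset, dropping a `;q=` suffix is assumed, and because `request_uri` already starts with a slash the sketch appends it directly after the language code.

```python
SUPPORTED = {"en", "de"}  # hypothetical subset of the languages the site serves


def redirect_target(accept_language: str, request_uri: str) -> str:
    """Return the redirect target for the first supported language, else English."""
    # Walk the comma-separated items in header order, like the gmatch loop does.
    for item in accept_language.split(","):
        lang = item.split(";")[0].strip()  # assumed: a ";q=..." suffix is ignored
        if lang in SUPPORTED:
            return "https://delta.chat/" + lang + request_uri
    # Final fallback: without it, a header such as "se" yields no redirect at all.
    return "https://delta.chat/en" + request_uri


print(redirect_target("se", "/"))             # https://delta.chat/en/
print(redirect_target("se,de;q=0.8", "/"))    # https://delta.chat/de/
```

Removing the last `return` makes the function return `None` for `"se"`, which mirrors the case where the loop finds no match and nothing else handles the request.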

missytake commented 1 year ago

Ah no - you're correct actually! That fixes the 403.

r10s commented 1 year ago

great, thanks for trying out! then maybe let's just add the line, and push that to master. fixed :)

at some point, we can clean up there or you can try out other things - i think, the code is doing too much, also i do not see why we have defined the accepted-languages twice - but all that requires some deeper understanding of what the different nginx modules such as "map" are doing, how $lang is defined and so on ...

missytake commented 1 year ago

The only thing that doesn't work yet is the redirect to Chinese:

ERROR: redirected to https://delta.chat/en/, not to https://delta.chat/zh_CN/
Headers:
Accept-Language: zh-CN
---
ERROR: redirected to https://delta.chat/en/, not to https://delta.chat/zh_CN/
Headers:
Accept-Language: zh
---

missytake commented 1 year ago

great, thanks for trying out! then maybe let's just add the line, and push that to master. fixed :)

at some point, we can clean up there or you can try out other things - i think, the code is doing too much, also i do not see why we have defined the accepted-languages twice - but all that requires some deeper understanding of what the different nginx modules such as "map" are doing, how $lang is defined and so on ...

thanks for actually fixing it :grin: but yes, I'll push to master

r10s commented 1 year ago

The only thing that doesn't work yet is the redirect to Chinese:

i think, this is because the language set by the browser by the Accept-Language: header (and then somehow forwarded to $lang and @lang) is zh-CN (with minus) while our files are zh_CN (with underscore). i'd accept both variants and forward to https://delta.chat/zh_CN
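A hedged Python sketch of "accept both variants": normalise the tag before looking it up, so hyphen, underscore and letter case no longer matter. The `DIRECTORIES` table and the name `directory_for` are assumptions for illustration; mapping plain `zh` to `zh_CN` follows the expectation in the test output quoted in this thread.

```python
# Hypothetical mapping from lower-cased, hyphenated tags to directory names on the site.
DIRECTORIES = {"en": "en", "de": "de", "zh": "zh_CN", "zh-cn": "zh_CN"}


def directory_for(tag: str):
    """Map an Accept-Language tag to a site directory, or None if unsupported."""
    key = tag.strip().replace("_", "-").lower()
    if key in DIRECTORIES:
        return DIRECTORIES[key]
    # Fall back to the primary subtag, e.g. "de-AT" -> "de".
    # Note that any other zh-* tag also lands on zh_CN this way.
    primary = key.split("-")[0]
    return DIRECTORIES.get(primary)


print(directory_for("zh-CN"))  # zh_CN
print(directory_for("zh_CN"))  # zh_CN
print(directory_for("se"))     # None
```

The primary-subtag fallback is a design choice: it is convenient while only one Chinese translation exists, and it would need an explicit entry per variant once there are several.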

fun fact: language codes are full of tiny pitfalls: did you know that zh_CN is not only zh-CN sometimes but also zh-rCN on android [outside browsers] and zh-Hans on ios [outside browsers] ? :) EDIT: to clarify, in http-Accept-Language headers, only zh-CN is used, also on android and ios browsers

missytake commented 1 year ago

I found a solution! Everything in the script works now. Let's see if further reports of language redirect issues appear :see_no_evil:

missytake commented 1 year ago

(It's more of a hack than a solution tbh, and we will run into problems as soon as we add Chinese translations other than Simplified Chinese... let's see^^)

ralphtheninja commented 1 year ago

So it's live now? I could have my friends try it out again for testing.

missytake commented 1 year ago

it is :) go for it

r10s commented 1 year ago

thanks for fixing! for the zh-hack, that seems quite okay, i did a minor pr for that at https://github.com/deltachat/sysadmin/pull/4/files - i did not try that out, however, so please double-check :) and maybe run your testing script :)

gerryfrancis commented 1 year ago

I still get a redirection error in Firefox (for Desktop and Android) when I enter https://delta.chat/se in the address bar, so I believe that this issue still exists. (Caches have been cleared before.)

missytake commented 1 year ago

I still get a redirection error in Firefox (for Desktop and Android) when I enter https://delta.chat/se in the address bar, so I believe that this issue still exists. (Caches have been cleared before.)

Well, you get a 404 error^^ because the page doesn't exist. Swedish is not supported yet, we don't have enough translations for Swedish.

gerryfrancis commented 1 year ago

@missytake Understood, but when a language is not available yet, we should redirect to English content. A redirection error in conjunction with a weird URL is extremely bad behavior and would confuse many people.

r10s commented 1 year ago

users will not type arbitrary language urls into the browser's address bar - and if they do, returning a 404 is good practice and expected.
this is also what others are doing, eg. try to replace de at https://signal.org/de/ with an unsupported language such as li.

and in case the user taps a broken link, there is a "Back to home page" on the error page.

the issue was that some languages set by the browser did not work when just https://delta.chat (without language or path in the url) was opened.
and this is fixed by redirecting to english.

gerryfrancis commented 1 year ago

users will not type arbitrary language urls into the browser's address bar - and if they do, returning a 404 is good practice and expected.

@r10s Unfortunately this is not the case on delta.chat . In Firefox, all you get for https://delta.chat/li is a weird URL ( https://delta.chat/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/de/li/ ) and this:

[screenshot]

The Signal website throws a 404 error instead, which is correct.

r10s commented 1 year ago

yip, it is known that there are weird redirects on 404, see https://github.com/deltachat/deltachat-pages/issues/618#issuecomment-1437284129

but that is a minor issue as it does not happen by "normal" usage of the website, you have to enter sth "wrong" manually or tap a broken link to trigger the issue