Open guigeek38 opened 3 years ago
Let me look into this again... I sware it fixes adn comes back
There may be a mirror hardcoded to a protocol, or redirecting rather our tool if so we may have to remove those mirrors from rotation
Jep, known and expected, still, sadly. I just reloaded the uWSGI server and now it should work. But after some hours it will again fail. As discussed earlier in related issues and discord, it seems to be some response caching. Since Armbian ships images with plain HTTP in APT sources, nearly all answers will contain the plain HTTP URL, and after a while the cache (presumably) serves the same plain HTTP URLs as well when the request actually was via HTTPS.
I didn't find time to look into the used Docker container, but since it is a 4 stage request handling (host Nginx => container Nginx => uWSGI => flask), all of them need to be checked for possible caches. The logic in the flask app is definitely correct, which is proven by the fact that after a reload it works as expected for some hours.
ahh okay.. sounds like I need to run this app in a different base container... aka... no nginx.. just uwsgi... or gnuicorn.
that will simplify the first part..and make it user to tune the appserver layer
If I remember right, the used Docker container has Nginx right included: https://github.com/tiangolo/uwsgi-nginx-flask-docker
The other option would be to forward/map ports 80/443 right via Docker instead of via dedicated Nginx proxy on the host. Like -p 80:25090 -p 443:25090
? But indeed an Nginx on the host can be easier configured and monitored, so an alternative would be to expose the uWSGI socket on the host and have the hosts Nginx with:
location / {
include uwsgi_params;
uwsgi_pass unix:///path/to/uwsgi.sock;
}
The container's Nginx would then basically be unused, probably its startup can be prevented when starting the container.
EDIT: Ah sorry your idea was to use different base container. The above could be tests whether skipping the additional Nginx helps with the issue, but indeed a container which doesn't ship the Nginx proxy in the first place would be cleaner π.
# curl https://apt.armbian.com
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="http://armbian.hosthatch.com/apt/">http://armbian.hosthatch.com/apt/</a>. If not click the link.
that was seconds ago. not behind a proxy or anything else that could interfere, it's the webserver at apt.armbian.com
not paying attention to the URL scheme that is presented when redirecting.
i've done it a number of times now today, there is currently seemingly no redirect entry that correctly redirects from https to https.
here are the ones i found in my backlog
Redirection from https to 'http://xogium.performanceservers.nl/apt/dists/bullseye/Release' is forbidden
Redirection from https to 'http://armbian.hosthatch.com/apt/dists/bullseye/Release' is forbidden
i've seen at least 2 more but i can't find them in the backlog, sorry. the [IP: 173.45.128.12 443]
part of the error is always the same. last time i've seen it work OK was about a week ago, didnt check since.
my interim solution is to just use https://armbian.hosthatch.com/apt/
or the other URL directly. for example, put this into your /etc/apt/sources.list.d/armbian.list
:
#deb https://apt.armbian.com bullseye main bullseye-utils bullseye-desktop
deb https://armbian.hosthatch.com/apt bullseye main bullseye-utils bullseye-desktop
N.B. in my armbian.sources.list
file - the URL that points to armbian.com doesn't have the /apt/
subdir, thats added in the redirect, too.
@BloodyNora Please try again now (soon), to have another indication that caching is the issue. I reloaded the uWSGI server and expect it to work now for an hour or less.
@MichaIng can confirm it works as of now. thank you very much!
# cat /etc/apt/sources.list.d/armbian.list
deb https://apt.armbian.com bullseye main bullseye-utils bullseye-desktop
#deb https://armbian.hosthatch.com/apt bullseye main bullseye-utils bullseye-desktop
# apt update
Hit:2 https://security.debian.org bullseye-security InRelease
Hit:3 https://deb.debian.org/debian bullseye InRelease
Hit:4 https://deb.debian.org/debian bullseye-updates InRelease
Hit:5 https://deb.debian.org/debian bullseye-backports InRelease
Get:1 https://xogium.performanceservers.nl/apt bullseye InRelease [18.5 kB]
Get:6 https://xogium.performanceservers.nl/apt bullseye/main armhf Packages [169 kB]
[...]
No need to thank me, it will break itself soon π . But it's good to have this once verified by someone else. I'll have another look into the container tomorrow to see whether I can find any caching instance/setting.
No need to thank me, it will break itself soon sweat_smile. But it's good to have this once verified by someone else. I'll have another look into the container tomorrow to see whether I can find any caching instance/setting.
Hey @MichaIng I swapped out the container runtime to https://hub.docker.com/r/tiangolo/meinheld-gunicorn/
Seems to be behaving... I also noticed some lookups were failing when sites were sending proxy headers. causing a list of IPs to be sent to the redirector.... so i fixed that in nginx as well
since no longer using uwsgi. need to change how the reload function
something like os.kill(os.ppid(), SIGHUP)
Great, I'll test it a few times throughout the day to assure the caching issue is gone.
since no longer using uwsgi. need to change how the reload function
When doing so, maybe add some kind of authentication of access limit π.
EDIT: Hmm, sadly it doesn't work again:
# curl -I 'https://apt.armbian.com/'
HTTP/2 302
server: nginx/1.18.0 (Ubuntu)
date: Sun, 28 Nov 2021 13:50:38 GMT
content-type: text/html; charset=utf-8
content-length: 280
location: http://armbian.systemonachip.net/apt/
While the client request scheme and IP is detected correctly (same as before):
# curl -I 'https://apt.armbian.com/status'
HTTP/2 200
server: nginx/1.18.0 (Ubuntu)
date: Sun, 28 Nov 2021 14:01:16 GMT
content-type: text/html; charset=utf-8
content-length: 2
x-client-ip: <myIPv6>
x-request-scheme: https
# curl -I 'http://apt.armbian.com/status'
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Sun, 28 Nov 2021 14:04:42 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 2
Connection: keep-alive
X-Client-IP: <myIPv6>
X-Request-Scheme: http
If you switched out the container, maybe then the host Nginx is the issue? I know it is not great to experiment with a live production server, but could you try to stop the hosts Nginx and expose the containers ports at 80+443 directly on the host? Of course this would mean that it handles HTTPS by itself, not sure if it can be configured to do so and then starting it with -p 80:80 -p 443:443
?
Or it is flask itself which starts response caching after hitting a certain amount of those, sadly ignoring the response headers? For debugging it could be also beneficial to add the X-Request-Scheme
response header as well to regular redirects, so see whether it is then set wrong as well, while /status
keeps responding with the correct header.
Currently it won't work so that's some defense.
I added a reload command to the systemd unit for the daemon that sends HUP signal and it seems to reload properly.
I should just deprecate function
On Sun, Nov 28, 2021, 8:47 AM MichaIng @.***> wrote:
Great, I'll test it a few times throughout the day to assure the caching issue is gone.
since no longer using uwsgi. need to change how the reload function
When doing so, maybe add some kind of authentication of access limit π.
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-981088421, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQAA6UBEECYQYDXY7NDUOIXETANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Makes sense, as long as you have direct server access, no need for such an API.
Btw it now works here. Did you reload/restart the service/container/server recently?
I opened a PR with a quick debugging code: #23
If it again stops preserving HTTPS, adding this code may help to check whether the Location
header only is cached or whether it makes a difference when manipulating the response explicitly by adding another header, and we can see whether the two get_scheme()
calls always return the same (after a while false) result or whether the one within get_redirect()
is what is cached, or the whole get_redirect()
output somehow. Since I have no idea at what level this may or may not (host webserver, container networking, unicorn, flask, Python itself) this may at least narrow it down a little.
before I merge that.... want to test what i just rolled out
now running single worker process with single thread... (it still handled 500 req/sec no problem)
If the issue re-occurs we'll know to keep troubleshooting flask and python. This will eliminate all variables around multiple instances of the script running
monitoring with this
#!/bin/bash
result=0
while [[ ${result} -eq 0 ]];
do
date
curl -Ls -o /dev/null -w %{url_effective\} https://apt.armbian.com | grep -q https
result=$?
sleep 30s
done
pushover -m 'she failed us'
Yes definitely makes sense to verify that the issue is still present. Also probably it's better to copy&paste the little code change manually instead of merging this into production branch, as it is meant for debugging only, and if wanted, could be implemented a more efficient way, I think.
EDIT: Ah, hehe here now I get http://
redirects again.
Yeah should just make it a branch
On Sun, Nov 28, 2021, 12:17 PM MichaIng @.***> wrote:
Yes definitely makes sense to verify that the issue is still present. Also probably it's better to copy&paste the little code change manually instead of merging this into production branch, as it is meant for debugging only, and if wanted, could be implemented a more efficient way, I think.
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-981120530, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQH6TNF7CUCPITYA7GLUOJP3JANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
I lost my monitoring task :( anyway yeah it's back.. sooo it doesnt' seem to be a threading issue.. so time to look closer at the code.
i've reloaded for now
FYI, it's already gone again.
since this thread is getting longer and longer now and nobody ever reads a full thread, please see here for an interim solution: https://github.com/armbian/dl-router/issues/22#issuecomment-980802674
Okay I think I fixed it with #25
Also reverted #23
Sadly not π’:
# curl -I https://apt.armbian.com
HTTP/2 302
server: nginx/1.18.0 (Ubuntu)
date: Sun, 05 Dec 2021 14:13:32 GMT
content-type: text/html; charset=utf-8
content-length: 258
location: http://armbian.12z.eu/apt/
With the header being set, did you see it failing as well? I was thinking whether adding the header may have an effect on whichever cache is still kicking in here. At least yesterday, after you deployed the PR with the header, it worked. I just didn't have a chance to test again some hours later.
Blerg yeah I saw this morning
On Sun, Dec 5, 2021, 9:14 AM MichaIng @.***> wrote:
Sadly not π’:
curl -I https://apt.armbian.com
HTTP/2 302
server: nginx/1.18.0 (Ubuntu)
date: Sun, 05 Dec 2021 14:13:32 GMT
content-type: text/html; charset=utf-8
content-length: 258
location: http://armbian.12z.eu/apt/
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-986237571, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQD3DDOHW56A5CL3K5DUPNXSLANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
I tried to find info about Flask caching, but while there is a dedicated Flask cache module and of course response headers for browser caching can be set, this all does not apply here. Could you re-add the scheme response header, just to see whether it keeps matching the redirect scheme or not? And if there is any caching layer setting up different cached instances based on it, it may even help. But actually I'm pretty out of ideas at which layer this caching is done, and whether "caching" is even the right word to describe it π.
Shall we try to get some external help, on Super User, Stack Overflow, Flask repository or so?
Yeah I'll re-add. But earlier when the bug occurred.. the header showed the correct scheme despite returning the wrong result.
On Sun, Dec 5, 2021, 12:52 PM MichaIng @.***> wrote:
I tried to find info about Flask caching, but while there is a dedicated Flask cache module and of course response headers for browser caching can be set, this all does not apply here. Could you re-add the scheme response header, just to see whether it keeps matching the redirect scheme or not? And if there is any caching layer setting up different caching instances based on it, it may even help. But actually I'm pretty out of ideas at which layer this caching is done, and whether "caching" is even the right word to describe it π.
Shall we try to get some external help, on Super User https://superuser.com/, Stack Overflow https://stackoverflow.com/, Flask repository https://github.com/pallets/flask/issues or so?
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-986272644, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQBQ5LZYLKIFIT7Z75TUPORGZANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
But earlier when the bug occurred.. the header showed the correct scheme despite returning the wrong result.
Was this before or after you moved the get_scheme()
call into the main route? Reasonable step when resp.headers['X-Request-Scheme'] = get_scheme()
lead to a different scheme shown in the header than what get_redirect()
obviously got internally. Would be interesting to see whether the same variable derived from the same single get_scheme()
call applied once as header and then passed to get_redirect()
still can lead to non-matching schemes in the response. That would be very strange. Only left would then be to verify that mirror_url = mirror_class.next(region)
keeps getting mirrors without scheme or whether for magical reasons it suddenly returns mirrors with schemes, so that the client request scheme is simply not used π.
This was when I had you're debugging from #23 and before I added my changes
On Sun, Dec 5, 2021, 3:16 PM MichaIng @.***> wrote:
But earlier when the bug occurred.. the header showed the correct scheme despite returning the wrong result.
Was this before or after you moved the get_scheme() call into the main route? Reasonable step when resp.headers['X-Request-Scheme'] = get_scheme() lead to a different scheme shown in the header header than what get_redirect() obviously got internally. Would be interesting to see whether the same variable derived from the same single get_redirect() call applied once as header and then passed to get_redirect() still can lead to different schemes in the response. That would be very strange. Only left would then be to verify that mirror_url = mirror_class.next(region) keeps getting mirrors without scheme or whether if for magical reasons suddenly contains schemes, so that the client request scheme is simply not used π.
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-986293093, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQBJTG3E3OXDFJXDRVDUPPCCTANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Makes sense. Let's see what effect both changes have.
https://github.com/armbian/dl-router/blob/master/app/main.py#L100
i feel like we're hitting an edge case here. im wondering if get_redirect isn't returning a scheme at all, and flask or the server is nice enough to add http to make it a valid redirect
@MichaIng okay running on this now https://github.com/armbian/dl-router/tree/pass_scheme
A mismatch indeed/still:
# curl -I https://apt.armbian.com
HTTP/2 302
server: nginx/1.18.0 (Ubuntu)
date: Mon, 06 Dec 2021 11:58:40 GMT
content-type: text/html; charset=utf-8
content-length: 264
location: http://mirror.armbian.de/apt/
x-request-scheme: https
So we have two get_scheme
right after another but either giving different results or more likely get_redirect
doesn't really respect it after a while. Last idea to show this more explicity would be:
scheme = get_scheme()
resp = redirect(get_redirect(path, get_ip(), scheme), 302)
resp.headers['X-Request-Scheme'] = scheme
return resp
I do not believe that mirror_url.find('://', 3, 8)
or mirror_url.count('://', 2, 8)
are the issue or do make any difference, since both works for some hours reliably and the logic is simple and clear. But I have no other explanation either π€.
Yeah I was just grasping at straws. Looks like it made it 8 hours before failing lol
On Mon, Dec 6, 2021, 7:08 AM MichaIng @.***> wrote:
A mismatch indeed:
curl -I https://apt.armbian.com
HTTP/2 302
server: nginx/1.18.0 (Ubuntu)
date: Mon, 06 Dec 2021 11:58:40 GMT
content-type: text/html; charset=utf-8
content-length: 264
location: http://mirror.armbian.de/apt/
x-request-scheme: https
So we have two get_scheme right after another but either giving different results or more likely get_redirect doesn't really respect it. Last idea to show this more explicit would be:
scheme = get_scheme() resp = redirect(get_redirect(path, get_ip(), scheme), 302) resp.headers['X-Request-Scheme'] = scheme return resp
I do not believe that mirror_url.find('://', 3, 8) or mirror_url.count('://', 2, 8) were the issue or do make any difference, since both works for some hours reliably and the logic is simple and clear. But I have no other explanation either π€.
β You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/armbian/dl-router/issues/22#issuecomment-986715037, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEVNQF5BZU7BT2QK7AKK3DUPSRR3ANCNFSM5H64TCVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
gonna be busy for rest of week.. i've made a bandaid script that polls every 30 minutes, and if it returns the wrong result it reloads.... and also sends me a notification π¬
After scratching surface with unit tests and switching to urlsplit per advise of a friend, I don't think get_redirect() is directly the culprit
My apt source list on Armbian uses the HTTPS version of apt.armbian.com. However, I am sometimes redirected to a mirror using HTTP, which APT does not allow, and then I get the error:
I saw this PR which should correct the problem, however it doesn't seem to work: https://github.com/armbian/dl-router/pull/14
To check I simply make requests with Curl on https://apt.armbian.com and get redirects either in https or in http :
Is there something that I misunderstood, or is there a problem? Thank you for your help !