Open bwrsandman opened 7 months ago
Additionally on the Truenas Docker I get this if I set a predefined server
root@flexo-ix-chart-fccd6b8b5-pjg8f:/# FLEXO_MIRROR_SELECTION_METHOD=predefined FLEXO_MIRRORS_PREDEFINED="['https://mirrors.kernel.org/archlinux']" FLEXO_CONNECT_TIMEOUT=50000 flexo
[2024-01-20T19:25:38.491Z DEBUG flexo] The following settings were fetched from the TOML file or environment variables: MirrorConfig {
cache_directory: "/var/cache/flexo/pkg",
mirrorlist_fallback_file: "/var/cache/flexo/state/mirrorlist",
mirrorlist_latency_test_results_file: Some(
"/var/cache/flexo/state/latency_test_results.json",
),
refresh_latency_tests_after: None,
port: 47878,
listen_ip_address: Some(
"0.0.0.0",
),
mirror_selection_method: Predefined,
mirrors_predefined: [
"https://mirrors.kernel.org/archlinux",
],
custom_repo: None,
low_speed_limit: Some(
128000,
),
low_speed_limit_formatted: None,
low_speed_time_secs: Some(
3,
),
connect_timeout: Some(
50000,
),
max_speed_limit: None,
num_versions_retain: Some(
3,
),
mirrors_auto: None,
}
[2024-01-20T19:25:41.552Z DEBUG flexo] Attempt to establish new connection
[2024-01-20T19:25:41.552Z DEBUG flexo] Establish a new connection to https://mirrors.kernel.org/archlinux
[2024-01-20T19:25:41.552Z DEBUG flexo::mirror_flexo] Fetch package from remote mirror: https://mirrors.kernel.org/archlinux/extra/os/x86_64/extra.db.
[2024-01-20T19:25:41.552Z DEBUG flexo::mirror_flexo] Set low_speed_time to 3 seconds.
[2024-01-20T19:25:41.552Z DEBUG flexo::mirror_flexo] No speed limit was set.
[2024-01-20T19:25:41.552Z INFO flexo::mirror_flexo] File is still empty: extra.db-4c74feba-b487-4226-b248-8725366de239
[2024-01-20T19:25:41.552Z DEBUG flexo::mirror_flexo] Start download from https://mirrors.kernel.org/archlinux
[2024-01-20T19:25:41.552Z DEBUG flexo::mirror_flexo] Attempt to create file: "/tmp/flexo/uncacheable/multilib/os/x86_64/multilib.db-2707c6b3-8e99-4777-94ff-ee0b89bc1182"
[2024-01-20T19:25:41.553Z DEBUG flexo::mirror_flexo] Fetch package from remote mirror: https://mirrors.kernel.org/archlinux/multilib/os/x86_64/multilib.db.
[2024-01-20T19:25:41.553Z DEBUG flexo::mirror_flexo] Set low_speed_time to 3 seconds.
[2024-01-20T19:25:41.553Z DEBUG flexo::mirror_flexo] No speed limit was set.
[2024-01-20T19:25:41.553Z INFO flexo::mirror_flexo] File is still empty: multilib.db-2707c6b3-8e99-4777-94ff-ee0b89bc1182
[2024-01-20T19:25:41.553Z DEBUG flexo::mirror_flexo] Start download from https://mirrors.kernel.org/archlinux
[2024-01-20T19:25:41.553Z DEBUG flexo] Establish a new connection to https://mirrors.kernel.org/archlinux
[2024-01-20T19:25:41.553Z DEBUG flexo::mirror_flexo] Attempt to create file: "/tmp/flexo/uncacheable/community/os/x86_64/community.db-6997fb06-a335-4b11-880a-c92732e3019d"
[2024-01-20T19:25:41.554Z DEBUG flexo::mirror_flexo] Fetch package from remote mirror: https://mirrors.kernel.org/archlinux/community/os/x86_64/community.db.
[2024-01-20T19:25:41.555Z DEBUG flexo::mirror_flexo] Set low_speed_time to 3 seconds.
[2024-01-20T19:25:41.555Z DEBUG flexo::mirror_flexo] No speed limit was set.
[2024-01-20T19:25:41.555Z INFO flexo::mirror_flexo] File is still empty: community.db-6997fb06-a335-4b11-880a-c92732e3019d
[2024-01-20T19:25:41.555Z DEBUG flexo::mirror_flexo] Start download from https://mirrors.kernel.org/archlinux
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Received complete header from remote mirror
[2024-01-20T19:25:46.967Z INFO flexo::mirror_flexo] HTTP response code is 301
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Received complete header from remote mirror
[2024-01-20T19:25:46.967Z INFO flexo::mirror_flexo] HTTP response code is 301
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Received complete header from remote mirror
[2024-01-20T19:25:46.967Z INFO flexo::mirror_flexo] HTTP response code is 301
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Server sent a redirect: Waiting for next header.
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Server sent a redirect: Waiting for next header.
[2024-01-20T19:25:46.967Z DEBUG flexo::mirror_flexo] Server sent a redirect: Waiting for next header.
[2024-01-20T19:25:46.968Z DEBUG flexo::mirror_flexo] Received complete header from remote mirror
[2024-01-20T19:25:46.968Z INFO flexo::mirror_flexo] HTTP response code is 301
[2024-01-20T19:25:46.968Z DEBUG flexo::mirror_flexo] Server sent a redirect: Waiting for next header.
[2024-01-20T19:25:48.551Z ERROR flexo] Timeout: Unable to obtain content length.
[2024-01-20T19:25:48.551Z DEBUG flexo::http_headers] Sending header to client: "HTTP/1.1 500 Internal Server Error\r\nServer: flexo\r\nDate: Sat, 20 Jan 2024 19:25:48 GMT\r\nFlexo-Payload-Origin: NoPayload\r\nContent-Length: 0\r\n\r\n"
[2024-01-20T19:25:48.551Z INFO flexo] Request served [NO PAYLOAD]: "core/os/x86_64/core.db"
[2024-01-20T19:25:48.551Z DEBUG flexo] Reading header from client.
[2024-01-20T19:25:48.552Z ERROR flexo] Timeout: Unable to obtain content length.
[2024-01-20T19:25:48.552Z ERROR flexo] Timeout: Unable to obtain content length.
[2024-01-20T19:25:48.552Z DEBUG flexo::http_headers] Sending header to client: "HTTP/1.1 500 Internal Server Error\r\nServer: flexo\r\nDate: Sat, 20 Jan 2024 19:25:48 GMT\r\nFlexo-Payload-Origin: NoPayload\r\nContent-Length: 0\r\n\r\n"
[2024-01-20T19:25:48.552Z DEBUG flexo::http_headers] Sending header to client: "HTTP/1.1 500 Internal Server Error\r\nServer: flexo\r\nDate: Sat, 20 Jan 2024 19:25:48 GMT\r\nFlexo-Payload-Origin: NoPayload\r\nContent-Length: 0\r\n\r\n"
My first guess would be that the mirror selection does not work well for you and flexo is therefore choosing mirrors with a too high latency.
You don't have anything unusual configured regarding networking, like a VPN or so? Also, on which continent are you located? In the past, some users from Australia had issues because of the latencies from Australia to mirrors located in Europe and elsewhere.
Could you please try the following:
/var/cache/flexo/state/latency_test_results.json
, if it exists.
Use the default TOML config file and leave all default settings, with the only exceptions:allowed_countries
should be configured as described here.listen_ip_address
, that's also fine, that won't have an influence on the issue you've described.max_score
.If that still doesn't work, please post the output of the following command, would be interesting to see how fast your connection is:
curl -w @- -o /dev/null -s "https://mirror.moson.org/arch/core/os/x86_64/core.db" <<'EOF'
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
----------\n
time_total: %{time_total}\n
EOF
Trying this on my laptop and it works file.
Trying this on my server which is a TrueNAS Scale x86_64 which uses kubernetes.
When I run in the container, unset all the FLEXO
env variables, set the allowed_countries
and use the default TOML, I still get time outs with flexo but using curl downloads fairly quickly even with some which timeout with flexo.
Flexo also exits because it can't find any non timed out mirror. Increasing max score to 10 and timeout to 30 seconds doesn't change anything.
Here's the output with moson.org
time_namelookup: 5.059352
time_connect: 5.172486
time_appconnect: 5.300360
time_pretransfer: 5.300383
time_redirect: 0.000000
time_starttransfer: 5.505811
----------
time_total: 5.707949
Here's the output with moson.org
time_namelookup: 5.059352 time_connect: 5.172486 time_appconnect: 5.300360 time_pretransfer: 5.300383 time_redirect: 0.000000 time_starttransfer: 5.505811 ---------- time_total: 5.707949
So it takes more than 5 seconds until name lookup has completed, that is very unusual. This is how it looks for me:
time_namelookup: 0.001474
time_connect: 0.009997
time_appconnect: 0.020877
time_pretransfer: 0.020898
time_redirect: 0.000000
time_starttransfer: 0.036836
----------
time_total: 0.055268
It's possible that flexo does not behave quite as expected under less-than-ideal networking conditions, but my impression is that some networking strangeness in your Kubernetes setup seems to be at least a contributing factor.
Have you verified that the settings are also applied correctly for Flexo? I'm not an expert on Kubernetes, but with plain docker, here's one example to set the allowed_countries
setting via environment vars:
docker container run -it --rm -e RUST_LOG=debug -e FLEXO_MIRRORS_AUTO_ALLOWED_COUNTRIES=DE,AT,CZ nroi/flexo
During startup, you should then see something like the following to show which settings are applied (this requires a log level of DEBUG or comparable):
[2024-02-05T17:24:18.005Z DEBUG flexo] The following settings were fetched from the TOML file or environment variables: MirrorConfig {
cache_directory: "/var/cache/flexo/pkg",
mirrorlist_fallback_file: "/var/cache/flexo/state/mirrorlist",
mirrorlist_latency_test_results_file: Some(
"/var/cache/flexo/state/latency_test_results.json",
),
refresh_latency_tests_after: None,
port: 7878,
listen_ip_address: Some(
"0.0.0.0",
),
mirror_selection_method: Auto,
mirrors_predefined: [],
custom_repo: None,
low_speed_limit: Some(
128000,
),
low_speed_limit_formatted: None,
low_speed_time_secs: Some(
3,
),
connect_timeout: None,
max_speed_limit: None,
num_versions_retain: Some(
3,
),
mirrors_auto: Some(
MirrorsAutoConfig {
mirrors_status_json_endpoint: "https://archlinux.org/mirrors/status/json/",
mirrors_status_json_endpoint_fallbacks: [],
mirrors_blacklist: [],
https_required: true,
ipv4: true,
ipv6: false,
max_score: 2.5,
num_mirrors: 8,
mirrors_random_or_sort: Sort,
timeout: 350,
allowed_countries: [
"DE",
"AT",
"CZ",
],
},
),
}
What you say makes sense, the time takes less long on other devices.
On the truenas itself (outside of the container) I can bring the time_namelookup
down to less than a millisecond.
I can also get very good times in other containers.
Is there something special done to the DNS in the nroi/flexo
container?
You can see the settings i ran in my original post.
Using the dig
command I was able to find out that the first of my nameservers was an old pihole.
It seems like 5 seconds is the re-try time until it selected my second nameserver which responds in miliseconds.
Changing my DNS settings fixed the issue and flexo is working in the container.
Don't know what we can learn from this to improve flexo since these issues only came up from a bad DNS config.
Changing my DNS settings fixed the issue and flexo is working in the container.
Glad to hear that!
Don't know what we can learn from this to improve flexo since these issues only came up from a bad DNS config.
I might try to check if I can reproduce this issue somehow, but as long as it works fine for users with no unusual network settings, I won't invest too much time into it. There are other areas at flexo that have higher priority.
Describe the bug When running the flexo docker, no mirrors are reachable and they time out. Increasing the time out duration does nothing. Running on two different machines (Arch Linux x86_64, TrueNAS Scale cobia x86_64) gives the same result: Locally, the mirrors are reachable, but in the docker they time out.
Starting up an arch linux docker and installing flexo from the AUR results in timeouts as well.
Running curl inside of the docker works
Installation method Docker
Distribution linux/amd64
Version digest e9f3ba02b33a
Log Running
docker run -it --rm -e RUST_LOG=trace nroi/flexo:latest