Closed NTohan closed 1 month ago
Hello, thanks again for the feedback. The Docker Compose setup is meant for demo and evaluation purposes only and should not be used in production environments. When going to production, I highly suggest using Kubernetes with our Helm chart: https://github.com/mendersoftware/mender-helm.
If you still want to use the docker compose setup, https://github.com/mendersoftware/mender-server/pull/110 introduces a (self-signed) demo certificate. You can use that as a basis, issue your own public certificate, and mount it (for example, by replacing compose/certs/mender.crt and compose/certs/mender.key with your certificate and key, respectively).
Another option, if you want to use compose behind a reverse proxy that handles TLS termination, is to disable the websecure entrypoint and the redirection as follows:
diff --git a/docker-compose.yml b/docker-compose.yml
index b363769..e597262 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -330,13 +330,13 @@ services:
- --api.insecure=true
- --accesslog=true
- --entrypoints.web.address=:80
- - --entrypoints.web.http.redirections.entryPoint.scheme=https
- - --entrypoints.web.http.redirections.entryPoint.to=websecure
- - --entrypoints.websecure.address=:443
- - --entrypoints.websecure.transport.respondingTimeouts.idleTimeout=7200
- - --entrypoints.websecure.transport.respondingTimeouts.readTimeout=7200
- - --entrypoints.websecure.transport.respondingTimeouts.writeTimeout=7200
- - --entrypoints.websecure.http.tls=true
This way, the mender-server is exposed in plain HTTP (no TLS) on port 80.
Thank you for your quick response and suggestions. I came across the helm suggestion earlier too, but simply due to limited resources on my server, I went with docker compose. Nevertheless, if you are suggesting that docker compose is/will not be maintained actively like helm, I would seriously consider upgrading my server.
Unfortunately, your suggestion of exposing plain HTTP did not work for me; I keep getting 404 page not found with http://192.168.x.x:80
diff --git a/docker-compose.yml b/docker-compose.yml
index b363769..c9fdd05 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -330,13 +330,13 @@ services:
- --api.insecure=true
- --accesslog=true
- --entrypoints.web.address=:80
- - --entrypoints.web.http.redirections.entryPoint.scheme=https
- - --entrypoints.web.http.redirections.entryPoint.to=websecure
- - --entrypoints.websecure.address=:443
- - --entrypoints.websecure.transport.respondingTimeouts.idleTimeout=7200
- - --entrypoints.websecure.transport.respondingTimeouts.readTimeout=7200
- - --entrypoints.websecure.transport.respondingTimeouts.writeTimeout=7200
- - --entrypoints.websecure.http.tls=true
+ #- --entrypoints.web.http.redirections.entryPoint.scheme=https
+ #- --entrypoints.web.http.redirections.entryPoint.to=websecure
+ #- --entrypoints.websecure.address=:443
+ #- --entrypoints.websecure.transport.respondingTimeouts.idleTimeout=7200
+ #- --entrypoints.websecure.transport.respondingTimeouts.readTimeout=7200
+ #- --entrypoints.websecure.transport.respondingTimeouts.writeTimeout=7200
+ #- --entrypoints.websecure.http.tls=true
- --providers.file.directory=/etc/traefik/config
- --providers.docker=true
- --providers.docker.exposedByDefault=false
Also, I will try generating certificates for an extra layer of security, but behind a reverse proxy it is more or less redundant.
Sorry, I was too quick and did not actually test my suggestion. I identified the problem and addressed it in the PR (https://github.com/mendersoftware/mender-server/pull/110/commits/cf0b9b2175e9cf99c320e623b897f0c6a57137ca). If you now remove the websecure entrypoint from Traefik (as I described in my previous comment), it will serve plain HTTP on port 80.
I came across the helm suggestion earlier too, but simply due to limited resources on my server, I went with docker compose. Nevertheless, if you are suggesting that docker compose is/will not be maintained actively like helm, I would seriously consider upgrading my server.
As long as it's not for a production environment. For small deployments you should be fine using this docker compose, but you might run into trouble if you need to scale the number of replicas, or load balance the application across machines. We will continue to use and improve the docker compose environment as we are also using it internally when doing system integration testing.
Also, I will try generating certificates for an extra layer of security, but behind a reverse proxy it is more or less redundant.
As long as your ingress (proxy) is using TLS and the docker composition is not exposed outside your secure LAN you should be ok, but be aware that port 80 will be exposed to your local network.
Thank you for your fix. I am happy to confirm that with your changes and the websecure entrypoint removed, I am able to connect to port 80. Is it possible to configure the default port from 80 to something else?
The only changes I have noticed are that the inventory information under a registered device is now empty and that the column options under table configuration have been reduced to very few. Could this be related to the recent changes? It is not a blocker for me, though.
Yes, you are right about the scaling and load balancing features K8s has to offer. I will definitely consider it for scaling.
Regarding TLS, it is enabled by default on the reverse proxy in use.
Thank you for your fix. I am happy to confirm that with your changes and the websecure entrypoint removed, I am able to connect to port 80. Is it possible to configure the default port from 80 to something else?
Unfortunately, the deployments are constantly failing with docker compose at https://github.com/mendersoftware/mender-server/commit/cf0b9b2175e9cf99c320e623b897f0c6a57137ca with the websecure entrypoint removed:
2024-10-19 20:58:30.714 +0000 UTC warning: Host not found (non-authoritative), try again later: GET https://s3.mender.local/mender/1392ab9b-95f7-43f4-b3f6-a12ed1f94ad2?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241019%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241019T205829Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject&X-Amz-Signature=9d1aa76146d3063870369e64ba1c99580d292a8a5c2c5684ce0d34512a992655:
Please find the complete logs in the file: deployment-log-80fcbbce-5e16-4c3d-a77b-558031d77b3e-hello-world-container-update-2024-10-19T21_08_32.096Z.log
Additional information:
mender-client is installed on the target device using:
wget -O- https://get.mender.io | sudo bash -s -- --demo --force-mender-client4 -- --quiet --device-type "genericx86-64" --demo --server-url https://devices.domain.xxx --server-cert=""
mender-client is configured with the following parameters:
$ cat /etc/mender/mender.conf
{
"HttpsClient": {},
"Security": {},
"Connectivity": {},
"DeviceTypeFile": "/var/lib/mender/device_type",
"UpdateControlMapExpirationTimeSeconds": 90,
"UpdateControlMapBootExpirationTimeSeconds": 45,
"UpdatePollIntervalSeconds": 5,
"InventoryPollIntervalSeconds": 5,
"RetryPollIntervalSeconds": 30,
"Servers": [
{
"ServerURL": "https://devices.domain.xxx"
}
]
}
$ cat /var/lib/mender/device_type
device_type=genericx86-64
Also, I have tried re-installing the mender-client on the target device and re-creating the deployment, but unfortunately the deployments are still failing.
Can you please check whether this is an issue in the under-development docker-compose setup, or whether I have misconfigured the mender-server on my side?
Thank you for your efforts.
Sorry, I forgot about the s3 bucket configuration. At this point, I think these customizations deserve a docker compose override file to make them easier to explain. I can see two options for making this work.
Option 1: replace the bundled s3fs storage with an external S3 bucket:
# docker-compose.http.yml
# docker compose -f docker-compose.yml -f docker-compose.http.yml up -d
services:
traefik:
command:
- --api=true
- --api.insecure=true
- --accesslog=true
- --entrypoints.web.address=:80
- --providers.file.directory=/etc/traefik/config
- --providers.docker=true
- --providers.docker.exposedByDefault=false
deployments:
environment:
DEPLOYMENTS_PRESIGN_URL_HOSTNAME: "<your gateway domain name>"
DEPLOYMENTS_PRESIGN_SECRET: "<Generate a random base64 secret, for example: head -c 16 /dev/urandom | base64 -w 0 >"
DEPLOYMENTS_STORAGE_BUCKET: "<BUCKET_NAME>"
DEPLOYMENTS_AWS_URI: "<https://BUCKET_NAME.AWS_REGION.amazonaws.com>"
DEPLOYMENTS_AWS_EXTERNAL_URI: "<https://BUCKET_NAME.AWS_REGION.amazonaws.com>"
DEPLOYMENTS_AWS_AUTH_KEY: "${AWS_ACCESS_KEY_ID}"
DEPLOYMENTS_AWS_AUTH_SECRET: "${AWS_SECRET_ACCESS_KEY}"
s3fs:
scale: 0
Where DEPLOYMENTS_AWS_AUTH_KEY and DEPLOYMENTS_AWS_AUTH_SECRET are set to your access key ID and secret access key for the s3 bucket.
Option 2: create a rule in the reverse proxy that forwards to s3.mender.local (with hostname rewrite):
# docker compose -f docker-compose.yml -f docker-compose.http.yml up -d
services:
traefik:
command:
- --api=true
- --api.insecure=true
- --accesslog=true
- --entrypoints.web.address=:80
- --providers.file.directory=/etc/traefik/config
- --providers.docker=true
- --providers.docker.exposedByDefault=false
deployments:
environment:
DEPLOYMENTS_PRESIGN_URL_HOSTNAME: "<your gateway domain name>"
DEPLOYMENTS_PRESIGN_SECRET: "<Generate a random base64 secret, for example: head -c 16 /dev/urandom | base64 -w 0 >"
DEPLOYMENTS_AWS_PROXY_URI: "https://<your domain>/mender" # Setup a rule that maps `/mender` to `s3.mender.local` with host header rewrite.
[!IMPORTANT] If you go with option 2, make sure you are using my latest version of the PR (https://github.com/mendersoftware/mender-server/pull/110/commits/bde5c31a7ca8e153683a207b599f6ddbb5cefd5f) and set the MENDER_SECRET_ACCESS_KEY environment variable to a secret value, as the s3 storage would otherwise be exposed with an insecure access key.
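Both secrets mentioned above (DEPLOYMENTS_PRESIGN_SECRET and MENDER_SECRET_ACCESS_KEY) can be generated in the same way. The following is a minimal sketch, assuming a shell with head, base64 and tr available. The helper name gen_secret is hypothetical and not part of the Mender tooling; tr -d '\n' stands in for the GNU-only base64 -w 0 flag, and whether every consumer accepts base64 characters in the access key is an assumption.

```shell
# gen_secret [BYTES]: print BYTES random bytes (default 16) as a single
# base64-encoded line. Hypothetical helper, not part of the Mender tooling.
gen_secret() {
  head -c "${1:-16}" /dev/urandom | base64 | tr -d '\n'
}

# Example usage (uncomment to use before running docker compose):
# export MENDER_SECRET_ACCESS_KEY="$(gen_secret 32)"
# echo "DEPLOYMENTS_PRESIGN_SECRET: \"$(gen_secret 16)\""
```

Exporting the variable in the same shell that runs docker compose keeps the value out of the override file itself.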
Is it possible to configure the default port from 80 to something else?
Yes, simply change the port number for the web entrypoint in the traefik service, for example:
diff --git a/docker-compose.yml b/docker-compose.yml
index b363769..c9fdd05 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -330,13 +330,13 @@ services:
- --api.insecure=true
- --accesslog=true
- - --entrypoints.web.address=:80
+ - --entrypoints.web.address=:8080
Thank you for your patch and the summary of possible options. I like your approach with the docker compose override file.
I have tried option 2 and am sorry to report that the deployment is not working for me.
Here are the steps:
Step 1: Your latest changes from https://github.com/mendersoftware/mender-server/commit/bde5c31a7ca8e153683a207b599f6ddbb5cefd5f
$ git log
commit cf0b9b2175e9cf99c320e623b897f0c6a57137ca (HEAD)
Author: Alf-Rune Siqveland <alf.rune@northern.tech>
Date: Fri Oct 18 15:07:22 2024 +0200
chore(docker): Remove hard-coded entrypoint in routes and define default
Signed-off-by: Alf-Rune Siqveland <alf.rune@northern.tech>
Step 2: Set up a rule that maps /mender to s3.mender.local
Step 3: Add a new docker compose override file docker-compose.http.yml
$ git diff
$ export MENDER_SECRET_ACCESS_KEY=generated_random_key
$ cat docker-compose.http.yml
services:
traefik:
command:
- --api=true
- --api.insecure=true
- --accesslog=true
- --entrypoints.web.address=:80
- --providers.file.directory=/etc/traefik/config
- --providers.docker=true
- --providers.docker.exposedByDefault=false
deployments:
environment:
DEPLOYMENTS_PRESIGN_URL_HOSTNAME: "devices.domain.xxx"
DEPLOYMENTS_PRESIGN_SECRET: "generated_random_key" #"<Generate a random base64 secret, for example: head -c 16 /dev/urandom | base64 -w 0 >"
DEPLOYMENTS_AWS_PROXY_URI: "https://domain.xxx/mender" # Setup a rule that maps `/mender` to `s3.mender.local` with host header rewrite.
$ docker compose -f docker-compose.yml -f docker-compose.http.yml up --build
Step 4: Try a demo deployment
Unfortunately, I keep getting this error on mender-server:
Couldn't load deployments. Cannot read properties of undefined (reading 'length') Retrying in 9 seconds...
Please find the logs from _mender-deployments-1_logs (1).txt
Also, how do I make sure that s3.mender.local is set up properly on my server? Is it possible to bypass the local domain s3.mender.local and replace it with something like http://<local_ip>:<port> within docker compose? This might help to test whether the reverse proxy can reach http://<local_ip>:<port> but has issues resolving the suggested s3.mender.local.
Yes, simply change the port number for the web entrypoint in the traefik service, for example:
Thank you for the suggestion. I will try adapting the exposure port after deployments are functional.
I merged the PR to main with one notable change: the domain name changed from mender.local to docker.mender.io, to avoid potential conflicts with the mDNS top-level domain (.local).
Please find the logs from _mender-deployments-1_logs (1).txt
The only thing that sticks out from the logs here is that it seems like you're trying to recreate a deployment that is already in progress. The error observed seems to come from an unexpected response that is not handled in the frontend. I'm not sure exactly what's causing this, but I will look into it.
To test if your setup works, you could try uploading an artifact and then download it again.
Also, how do I make sure that s3.mender.local is set up properly on my server?
It seems like the problem is that s3.mender.local (now docker.mender.io on main) should map to localhost, so you should add a hosts entry mapping this hostname back to your localhost. That is, on Linux you need to edit /etc/hosts: echo "127.0.0.1 s3.mender.local" | sudo tee -a /etc/hosts. On Windows, I believe you need to append the entry to C:\Windows\System32\drivers\etc\hosts. Alternatively, you could set up a local DNS server on your LAN and add an A record mapping the domain back to your private IP.
Is it possible to bypass the local domain s3.mender.local and replace it with something like http://<local_ip>:<port> within docker compose? This might help to test whether the reverse proxy can reach http://<local_ip>:<port> but has issues resolving the suggested s3.mender.local.
This is possible, but I would not recommend it, as containers get their IPs reassigned every time you restart the docker compose environment. But for the sake of completeness, you can forward the requests to the IP of the s3fs container, which you can get by running:
docker inspect mender-s3fs-1 --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'
The S3 API is exposed on port 8333 on this IP address.
And one more thing, just to double check: I hope you've replaced the value of MENDER_SECRET_ACCESS_KEY before running docker compose up -d. It is important that this is secret, as it would otherwise give public access to the blob storage.
It seems like the problem is that s3.mender.local (now docker.mender.io on main) should map to localhost, so you should add a hosts entry mapping this hostname back to your localhost.
Okay, thank you for pointing out the changes merged to main in the meantime. I am now pointing to cd5f6108bb51345d12e67816f13ee1b4507c986c, have mapped docker.mender.io and s3.docker.mender.io to 127.0.0.1 as mentioned in README.md, and have adapted my proxy rule https://domain.xxx/mender to http://docker.mender.io.
$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 <hostname>
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
127.0.0.1 docker.mender.io s3.docker.mender.io
[!IMPORTANT] The reverse proxy rule that maps the default port 80 of mender-server is set up with a subdomain, devices.domain.xxx; therefore, DEPLOYMENTS_PRESIGN_URL_HOSTNAME is also set to devices.domain.xxx. The rule for the deployments variable DEPLOYMENTS_AWS_PROXY_URI, by contrast, is set up with a path, domain.xxx/mender.
DEPLOYMENTS_PRESIGN_URL_HOSTNAME: "devices.domain.xxx"
DEPLOYMENTS_PRESIGN_SECRET: "generated_random_key" #"<Generate a random base64 secret, for example: head -c 16 /dev/urandom | base64 -w 0 >"
DEPLOYMENTS_AWS_PROXY_URI: "https://domain.xxx/mender" # Setup a rule that maps `/mender` to `s3.mender.local` with host header rewrite.
The important note above also appears in my last comment. I hope that is not the issue, because I am not able to upload and download artifacts as you suggested for testing.
I have also observed that mender-deployments-1 keeps exiting and I needed to start the container manually. Therefore, I have changed the restart policy to unless-stopped. Nevertheless, there seem to be permission issues: main: failed to setup storage client: s3: failed to check bucket preconditions: s3: insufficient permissions for accessing bucket 'mender'
time="2024-10-22T09:56:01Z" level=info msg="Deployments Service starting up" caller="main.cmdServer@main.go:159"
time="2024-10-22T09:56:01Z" level=info msg="automigrate is ON, will apply migrations" caller="mongo.Migrate@migrations.go:50"
time="2024-10-22T09:56:01Z" level=info msg="migrating deployment_service" caller="mongo.MigrateSingle@migrations.go:72"
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.1 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.2 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.3 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.4 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.5 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.6 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.7 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.9 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.10 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.11 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.13 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.14 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.15 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.16 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="migration to version 1.2.17 skipped" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:128" db=deployment_service
time="2024-10-22T09:56:01Z" level=info msg="DB migrated to version 1.2.17" caller="migrate.(*SimpleMigrator).Apply@migrator_simple.go:143" db=deployment_service
main: failed to setup storage client: s3: failed to check bucket preconditions: s3: insufficient permissions for accessing bucket 'mender'
Any idea what could be the reason for the insufficient permissions?
Also, I have observed errors like:
traefik-1 | 2024-10-22T09:56:00Z ERR error="service \"deployments\" error: unable to find the IP address for the container \"/mender-deployments-1\": the server is ignored" container=deployments-mender-b362e1357ad64f71ae79a8430316c357d0ef352eace0d8b75f5ecd221e0b8020 providerName=docker
traefik-1 | 2024-10-22T09:56:00Z ERR error="service \"deployments\" error: unable to find the IP address for the container \"/mender-deployments-1\": the server is ignored" container=deployments-mender-b362e1357ad64f71ae79a8430316c357d0ef352eace0d8b75f5ecd221e0b8020 providerName=docker
And one more thing, just to double check: I hope you've replaced the value of MENDER_SECRET_ACCESS_KEY before running docker compose up -d. It is important that this is secret, as it would otherwise give public access to the blob storage.
Yes, I have created a new key for my setup but still thank you for mentioning it as I also see it worth mentioning to avoid public access to artifacts. 👍
Okay, the issue with permission seems to be a race condition and first removing then relaunching all containers seems to fix it.
_mender-s3fs-1_logs.txt _mender-deployments-1_logs (5).txt
However, regarding your suggested test, I am not able to download the artifacts from mender-server either. When I click on DOWNLOAD ARTIFACT, I am redirected to https://s3.docker.mender.io/mender/d1bae418-85ae-41bf-b7fd-a27bf866d43e?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-<...> which, according to my proxy rule, should be replaced with https://devices.domain.xxx/mender.
Also, manually replacing https://s3.docker.mender.io/mender/... with https://devices.domain.xxx/mender/... leads to {"error": {"status_code": 404,"message": "Not Found"}}
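The manual check described here (swapping the origin of a presigned URL for the proxy address) can be scripted for repeated tests. This is a minimal sketch only: rewrite_origin is a hypothetical helper, and since the URLs in this thread list host under X-Amz-SignedHeaders, the rewritten request presumably only validates if the reverse proxy rewrites the Host header as described earlier.

```shell
# rewrite_origin URL NEW_ORIGIN
# Replace the scheme and host of URL with NEW_ORIGIN, keeping the path and
# query string untouched. Hypothetical helper for manual testing.
rewrite_origin() {
  local url="$1" new_origin="$2"
  printf '%s\n' "$url" | sed -E "s#^https?://[^/]+#${new_origin}#"
}

# Example usage (network access required, so left commented out):
# curl -sS -o artifact.mender \
#   "$(rewrite_origin "$PRESIGNED_URL" 'https://devices.domain.xxx')"
```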
@alfrunes Just to be sure, can you please double check whether DEPLOYMENTS_AWS_PROXY_URI is handled properly internally? Thank you very much in advance.
I made a typo with one of the environment variables in the override. It should be DEPLOYMENTS_STORAGE_PROXY_URI and not DEPLOYMENTS_AWS_PROXY_URI.
Sorry about the inconvenience.
Thank you for the correct environment variable name. With the correct variable, DEPLOYMENTS_STORAGE_PROXY_URI, the re-routing to my proxy rule is fixed.
https://devices.domain.xxx/mender/b0a73540-e58a-4d28-87b2-823ee810f7f1?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241022T205643Z&X-Amz-Expires=900&X-Amz-Signature=c41c5dc48c65625b527a8bf7ef7608148995393a82125306a4274b78b4c8e73f&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B+filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject
Nevertheless, I strongly believe that there is still something wrong with the handling of the proxy URI within the services. At least, that is what I observe in the mender-gui-1 logs. Below are two attempts to manually download two different artifacts; the host name is set correctly: host: "devices.domain.xxx".
2024/10/22 20:46:59 [error] 8#8: *5 open() "/var/www/mender-gui/dist/mender/a119f7e4-f665-49d1-96f2-92c6038e7521" failed (2: No such file or directory), client: 172.18.0.2, server: , request: "GET /mender/a119f7e4-f665-49d1-96f2-92c6038e7521?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241022T204620Z&X-Amz-Expires=900&X-Amz-Signature=d14cdfa1bc00665f9e372b28044b96f827eff18359b2f96da29da1ce97fd97c2&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B+filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject HTTP/1.1", host: "devices.domain.xxx"
172.18.0.2 - - [22/Oct/2024:20:46:59 +0000] "GET /mender/a119f7e4-f665-49d1-96f2-92c6038e7521?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241022T204620Z&X-Amz-Expires=900&X-Amz-Signature=d14cdfa1bc00665f9e372b28044b96f827eff18359b2f96da29da1ce97fd97c2&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B+filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
172.18.0.2 - - [22/Oct/2024:20:46:59 +0000] "GET /404.json HTTP/1.1" 404 54 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
.....
127.0.0.1 - - [22/Oct/2024:20:49:42 +0000] "GET /ui/ HTTP/1.1" 200 869 "-" "Wget"
2024/10/22 20:49:48 [error] 8#8: *39 open() "/var/www/mender-gui/dist/mender/10b11048-72c6-42ac-ad08-90e896ec8638" failed (2: No such file or directory), client: 172.18.0.2, server: , request: "GET /mender/10b11048-72c6-42ac-ad08-90e896ec8638?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241022T204933Z&X-Amz-Expires=900&X-Amz-Signature=c5ab07bd960987e3b077b316dbcbc4a4de633c9d4673228f68e1b66e2947a006&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B+filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject HTTP/1.1", host: "devices.domain.xxx"
172.18.0.2 - - [22/Oct/2024:20:49:48 +0000] "GET /mender/10b11048-72c6-42ac-ad08-90e896ec8638?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mender%2F20241022%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241022T204933Z&X-Amz-Expires=900&X-Amz-Signature=c5ab07bd960987e3b077b316dbcbc4a4de633c9d4673228f68e1b66e2947a006&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B+filename%3D%22hello-world-container-update.mender%22&response-content-type=application%2Fvnd.mender-artifact&x-id=GetObject HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
172.18.0.2 - - [22/Oct/2024:20:49:48 +0000] "GET /404.json HTTP/1.1" 404 54 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
Here is the sample artifact under test : hello-world-container-update.mender.zip.
@alfrunes Can you please check with provided artifact if uploading and downloading behind external proxy like my setup is handled properly? Thank you very much in advance. I am happy to provide you with more logs if needed, please let me know.
I really appreciate your support with all the topics. 👍
It seems like the Traefik routing rule to SeaweedFS (s3fs service) does not fit your reverse proxy setup. The reason you see these requests ending up in the gui service is that it is the fallback rule (if no other routes apply). It turns out that the routing to the s3 backend is done using the hostname of requests (all requests to an s3.* subdomain). I created a PR that updates this rule to use a path prefix instead, which you could try: #124.
Thank you for your fix. I ran some quick tests with the latest changes from https://github.com/mendersoftware/mender-server/pull/124 but unfortunately I am running into the same issue. Please find the logs in _mender-gui-1_logs (2).txt
$ git diff
diff --git a/docker-compose.yml b/docker-compose.yml
index 16ae390..2eb6242 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -365,7 +365,7 @@ services:
labels:
traefik.enable: "true"
traefik.http.routers.s3fs.priority: "99999"
- traefik.http.routers.s3fs.rule: HostRegexp(`s3\..*`)
+ traefik.http.routers.s3fs.rule: PathPrefix(`/mender`)
traefik.http.services.s3fs.loadBalancer.server.port: "8333"
command: [server -s3 -s3.config /etc/seaweedfs/s3.conf]
healthcheck:
Ok, this time I think it's a different issue. Your requests are routed correctly (no s3 requests falling back to the gui service). However, I discovered a different issue with the way SeaweedFS is set up in the docker compose setup. In short, the artifact data is not persisted across container recreations, only the file index. This is why you could see the artifacts in the UI, but trying to download them would time out after a couple of retries. I extended the PR to address this issue as well. If you want to try it, you have to resolve the data inconsistency. The easiest way to do this is to destroy the docker composition and bring everything up again (from https://github.com/mendersoftware/mender-server/pull/124/commits/2b2383fcfc5e9228ed3c7d16bcd3338b173959ee):
[!WARNING] This will destroy all the data for your running instance. If you need an alternative, see below.
docker compose down -v --remove-orphans
Alternatively, you can destroy only the artifacts storage (which is the only corrupt part at this point):
# Destroy SeaweedFS with corrupt volume
docker compose down -v s3fs
# Remove releases/artifacts from the database
docker compose exec mongo mongosh --eval 'deployments = db.getSiblingDB("deployment_service"); deployments.images.deleteMany({}); deployments.releases.deleteMany({})'
# Bring up the composition again
docker compose up -d
Thank you for the new patch. Unfortunately, there are still some pending functionalities that need attention.
This will destroy all the data for your running instance. If you need an alternative, see below.
I even tried your changes with destroying all the data and re-creating new users on a test setup.
Please find the attached logs, _mender-gui-1_logs (4).txt, with your changes applied on top of the main branch, commit cd5f6108bb51345d12e67816f13ee1b4507c986c.
$ git diff
diff --git a/docker-compose.yml b/docker-compose.yml
index 16ae390..d6e6791 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -365,9 +365,16 @@ services:
labels:
traefik.enable: "true"
traefik.http.routers.s3fs.priority: "99999"
- traefik.http.routers.s3fs.rule: HostRegexp(`s3\..*`)
+ traefik.http.routers.s3fs.rule: PathPrefix(`/mender`)
traefik.http.services.s3fs.loadBalancer.server.port: "8333"
- command: [server -s3 -s3.config /etc/seaweedfs/s3.conf]
+ command:
+ - server
+ - -dir=/data
+ - -master.electionTimeout=1s
+ - -master.heartbeatInterval=250ms
+ - -master.raftHashicorp=true
+ - -s3
+ - -s3.config=/etc/seaweedfs/s3.conf
healthcheck:
test:
- CMD
@@ -375,6 +382,8 @@ services:
- "-z"
- "127.0.0.1"
- "8333"
+ start_period: 1m
+ start_interval: 1s
retries: 10
client:
Sorry about the mess. I thought I fixed it (again), but upon further investigation I found that the SeaweedFS deployment sometimes would not start (it appeared to be stuck in a deadlock). I did some more extensive refactoring of the SeaweedFS (S3) deployment, which seems to make it a lot more reliable. Could you try my new PR #136? 🤞
I cannot see anything suspicious in the gui service logs. However, I think the issues you're experiencing are not tied to this service but rather to one of the backend services. If my next PR doesn't fix it, it would be helpful to see the full logs (docker compose logs) including all services. Just make sure you redact any sensitive information (like your email, IP addresses etc.) before you upload.
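Redacting by hand is easy to get wrong on long logs. As a rough sketch, a sed filter can mask IPv4 addresses and email-like strings before the file is uploaded. The redact_logs name is hypothetical, and the patterns are deliberately over-eager: Go caller fields such as main.cmdServer@main.go are masked as well, and domain names or tokens are not covered, so the result still deserves a manual look.

```shell
# redact_logs: read log text on stdin, mask email-like strings and IPv4
# addresses, and write the result to stdout. Hypothetical helper.
redact_logs() {
  sed -E \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/<email>/g' \
    -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<ip>/g'
}

# Example usage (requires the running composition, so left commented out):
# docker compose logs | redact_logs > mender-logs-redacted.txt
```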
Negative, your new container for SeaweedFS seems to hang.
Please find the logs refac-seaweedfs.log
I am pointing to these changes:
$ git log
commit 5c078153058706360766b4e98256957cc019a43a (HEAD -> refac-seaweedfs, origin/refac-seaweedfs)
Cheers 👍
Hmm.. This time I'm not able to reproduce the issue, but the log dump makes it clear that this is an issue related to SeaweedFS. Did you try deleting all volumes and starting fresh?
I changed the raft implementation to lower the startup time (and it generally seems more mature); that could, however, mess up the old raft algorithm's state.
If the problem persists, could you increase the log verbosity for the s3-master service:
diff --git a/compose/docker-compose.seaweedfs.yml b/compose/docker-compose.seaweedfs.yml
index 28ded674..1a119134 100644
--- a/compose/docker-compose.seaweedfs.yml
+++ b/compose/docker-compose.seaweedfs.yml
@@ -26,6 +26,7 @@ services:
s3-master:
image: chrislusf/seaweedfs
command:
+ - -v=5
- master
- -mdir=/data
- -ip=s3-master
I’m pleased to report that the initial results look promising after tearing down the Docker composition and bringing everything back up. I’m now able to download artifacts manually, and the deployment process is working smoothly!
I’ll run a few more tests later, write up a summary, and then close the ticket.
@alfrunes Thank you so much for your dedication and excellent work. 🥇
Excellent! I'm glad to hear things are looking promising. Let me know how it goes.
Cheers!
As I mentioned in my previous comment, the mender-server is successfully running behind a reverse proxy, and the artifacts deployment with my setup is functioning smoothly. Thank you once again for your prompt support in resolving the issue. 👍 🥇
With main branch on hash 7dfc4d8501f58142cb0d45595ea3d0163908efa6 mender-client is not able to connect to the mender-server deployed in the production mode.
mender-client connection is tested on the target device using a local IP with mender-server running on the same network
Using above command leads to no connection and no new device pending for approval on mender-server running in the same network.
Workaround: using Mender Server with a Cloudflare reverse proxy, mender-client is able to connect to mender-server when the https scheme and websecure are disabled.
With the https scheme and websecure.http.tls disabled in the mender-server config, we rely on the security layer of a CDN service (Cloudflare) through a secure Cloudflare tunnel (Cloudflare reverse proxy). This setup allows us to access mender-server using https://devices.domain.xxx pointing to http://192.168.x.x:**443** and install mender-client on the target device using:
Is this a recommended workflow for using mender-server, or would you recommend relying on mender-server's https and websecure due to security concerns?
Thank you in advance for any inputs.
Cheers, N.T.