portainer / portainer

Making Docker and Kubernetes management easy.
https://www.portainer.io
zlib License

Cannot deploy from external environment - http: TLS handshake error - status_code=403 #8164

Closed. ooliver1 closed this issue 4 months ago.

ooliver1 commented 1 year ago

Bug description When trying to deploy a stack from a separate (non-local) agent environment, I receive a number of errors; the UI says: failed to deploy a stack: listing workers for Build: failed to list workers: Unavailable: connection closed before server preface received

Expected behavior The stack deploys just as it does in the local environment.

Portainer Logs External:

2022/12/05 23:05:15 http: TLS handshake error from <local-env-ip>:37718: EOF

Local:

2022/12/05 11:05PM ERR github.com/portainer/portainer/api@v0.0.0-20221011053128-b0938875dc5c/http/client/client.go:94 > unexpected status code | status_code=403

Steps to reproduce the issue:

  1. Go to a non-local environment
  2. Go to stacks
  3. Add a new stack
  4. Click deploy
  5. Observe error after a moment

tamarahenson commented 1 year ago

@ooliver1

Can you provide the following:

  1. Stack code
  2. Is this a Docker Swarm or Docker Standalone environment?
  3. Is the Agent running in the Environment?
  4. Screenshot of the toast notification
  5. Screenshot of the Web Dev Tools Network tab showing the API request and response message
  6. Are you using git as referenced in the related requests?

Related: https://github.com/portainer/portainer/issues/7240 and https://github.com/portainer/portainer/issues/7254

Thanks!

ooliver1 commented 1 year ago

Thank you @tamarahenson for the reply! #7240 seems to be unrelated; #7254 has the same error, but its underlying logs include connection refused.

As I have mentioned before, this works perfectly on the local environment, but I would like it to work on the separate host.

  1. https://github.com/eludris/thang
  2. Standalone
  3. Yes: 94e81479b960 portainer/agent:2.16.2 "./agent" 56 minutes ago Up 56 minutes 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp portainer_agent
  4. (screenshot attached)
  5. (screenshot attached)
  6. Yes

tamarahenson commented 1 year ago

@ooliver1

I am able to reproduce the issue. Local worked without error; the Docker Standalone Agent environment, however, was not having it.

My message is a bit different though:

"failed to deploy a stack: time=\"2022-12-07T03:52:41Z\" level=warning msg=\"The \\\"DISCORD_CHANNEL_ID\\\" variable is not set. Defaulting to a blank string.\"\ntime=\"2022-12-07T03:52:41Z\" level=warning msg=\"The \\\"DISCORD_CHANNEL_ID\\\" variable is not set. Defaulting to a blank string.\"\ntime=\"2022-12-07T03:52:41Z\" level=warning msg=\"The \\\"DISCORD_TOKEN\\\" variable is not set. Defaulting to a blank string.\"\nkeydb Pulling \n40dd5be53814 Pulling fs layer \n4212448b221b Pulling fs layer \n84600…==========================>] 392B/392B\n126554b67430 Extracting [==================================================>] 392B/392B\n126554b67430 Pull complete \n85fcaa96f7c9 Extracting [==================================================>] 127B/127B\n85fcaa96f7c9 Extracting [==================================================>] 127B/127B\n85fcaa96f7c9 Pull complete \nkeydb Pulled \nlisting workers for Build: failed to list workers: Unavailable: connection closed before server preface received\n"
(screenshot attached)

I need to investigate further. I will update you as I learn more.

Thanks!

tamarahenson commented 1 year ago

@ooliver1

It looks like Docker Standalone Local ignores unset variables, while Docker Standalone Agent requires them, as seen here: https://github.com/Eludris/thang/blob/main/.env.example

Can you manually add these variables with your settings:

# REDIS_URL=redis://localhost:6379 # This is set to the service in docker compose.

# DISCORD_TOKEN= # Your Discord bot token
# DISCORD_CHANNEL_ID= # The id to the channel you want to send the messages to.
                    # This is temporary as eludris only has one channel.
# DISCORD_WEBHOOK_NAME="Eludris Bridge"  # The name of the webhook used to send messages.

# ELUDRIS_REST_URL=https://api.eludris.gay  # The HTTP uri for the eludris rest api.
# ELUDRIS_GATEWAY_URL=wss://ws.eludris.gay  # The websocket uri for the eludris gateway

via the method shown in the screenshot below, and see if it deploys without error?

(screenshot attached)

Thanks!

ooliver1 commented 1 year ago

I have been doing that already, and it is seemingly being saved to stack.env, which is fine since I replaced env_file with substitution. As I said, it works on local but not elsewhere. The logs about the TCP failure and the 403 look a bit more low-level than that.
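
For reference, a minimal sketch of that change (variable names are from the repo's .env.example; everything else is illustrative):

# Before: values read from a file on the host
services:
  discord:
    env_file:
      - .env

# After: values substituted at deploy time; per the above, Portainer
# seemingly saves UI-entered variables to stack.env and substitutes
# the ${...} references from there
services:
  discord:
    environment:
      DISCORD_TOKEN: ${DISCORD_TOKEN}
      DISCORD_CHANNEL_ID: ${DISCORD_CHANNEL_ID}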

tamarahenson commented 1 year ago

@ooliver1

For Docker Standalone Agent: can you send a screenshot of the variables entered and the error message on that screen? I cannot test further because I do not have the Discord variables/information.

For Docker Standalone Local: How are you entering the variables? My build worked without variables.

Thanks!

ooliver1 commented 1 year ago

I have sent the screenshot of the error above. I cannot give you the variables, since one is a token and the other is useless without that token's authorisation. For local I am doing the exact same thing: entering them via the UI.

I'm not sure it has anything to do with my variables, since they are identical in both environments, and the error is very ambiguous, as are the logs.

tamarahenson commented 1 year ago

@ooliver1

I wanted to follow up on this request. I figured out what the issue is: it is related to using build: . in the docker-compose.yml when deploying to an Agent environment.
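
For reference, a minimal reproduction sketch (the service name is illustrative):

# docker-compose.yml at the root of the git repo
services:
  app:
    build: .   # relative build context; deploys fine on the Local
               # environment, fails through the Agent with the errors above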

I am forwarding this to Product for review and am logging an internal request. I will update you as I learn more.

Thanks!

TsunayoshiSawada commented 1 year ago

Facing the same issue. We shifted half our business to Portainer and added a few servers as external Docker nodes, only to find I can't deploy any of my 90+ git repos to any remote server. This became a nightmare after Portainer was pitched as the perfect solution for all our deployments.

Bumping this request and subscribing to updates on this one.

tamarahenson commented 1 year ago

Update:

This issue is currently being reviewed by Product. I do not yet have a timeline or release.

Thanks!

paulschmeida commented 1 year ago

I can confirm that this is related to the build directive. It also fails if a stack containing a build directive is created directly in Portainer, and it fails on all environments connected via an agent.

ooliver1 commented 1 year ago

Just an update, this still occurs in 2.18.1

codesalatdev commented 1 year ago

I am in the same boat as @TsunayoshiSawada. Portainer seemed like a good solution for all projects until 30 minutes ago. Since Portainer claims it can deploy stacks to agents, I didn't expect this to NOT work.

nosas commented 1 year ago

I'm encountering the same issue in the Community Edition: using relative build paths in my docker-compose.yml results in the TLS error.

As of March 2023, Portainer supports relative builds in the Business Edition [1,2]. It is, hopefully, only a matter of time until it trickles down to the Community Edition.

I misread. The sources in my comment are for relative volume, not build, paths. Thanks for pointing it out.

The issue of relative build paths is still present.

ooliver1 commented 1 year ago

I use the Business Edition via the free nodes program, and this was still an issue for me in April; the two editions should have similar fixes. Relative volumes do work, but this issue is about relative build paths.

SeeJayEmm commented 1 year ago

Is there a work-around for this issue?

jamescarppe commented 1 year ago

Internal Ref: EE-4758

owojcikiewicz commented 1 year ago

Hi, is there an ETA on this fix? Or, as @SeeJayEmm mentioned, some workaround for the time being?

fatherofinvention commented 1 year ago

I just bumped up against this one. Hopefully somebody out there knows of a workaround?

owojcikiewicz commented 1 year ago

> I just bumped up against this one. Hopefully somebody out there knows of a workaround?

I'm working on a GitHub Action that will deploy Portainer stacks using the API, as opposed to the Git integration, which clearly doesn't work as expected.

It's still work in progress but I'll drop it here once it's done :)
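
In the meantime, a rough sketch of an API-based deploy with curl (route and fields as documented for the Portainer 2.x API; the URL, API key, stack content, and endpointId are placeholders, and newer releases moved this route, so check your version's API reference):

# Create a standalone (type=2) stack from an inline compose string.
curl -X POST \
  -H "X-API-Key: ${PORTAINER_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"name": "mystack",
       "stackFileContent": "services:\n  web:\n    image: nginx:alpine\n",
       "env": [{"name": "DISCORD_TOKEN", "value": "placeholder"}]}' \
  "https://portainer.example.com/api/stacks?type=2&method=string&endpointId=1"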

jamescarppe commented 1 year ago

Those that are running into issues when deploying stacks that contain build directives with relative paths, can you confirm whether the same issue occurs when you are deploying with the Relative path volumes option enabled? Note this is a BE-only feature. I've been doing some testing around this and want to confirm what I'm seeing.

mathiaskpedersen commented 1 year ago

> Those that are running into issues when deploying stacks that contain build directives with relative paths, can you confirm whether the same issue occurs when you are deploying with the Relative path volumes option enabled? Note this is a BE-only feature. I've been doing some testing around this and want to confirm what I'm seeing.

Hello, we're encountering this as well. We've tried with and without Relative path volumes.

Portainer version 2.19.2.

This error shows up when deploying without Relative path volumes enabled: (screenshot attached)

and this one appears with Relative path volumes enabled: (screenshot attached)

nosas commented 10 months ago

The issue is that Portainer does not support deploying a stack with relative build paths. For instance, OP's docker-compose.yml file contains the following relative build path:

services:
  discord:
    <<: *service
    build:
      context: .
      dockerfile: discord/Dockerfile

... truncated

Per Portainer's docs, building images while deploying from a git repo is not yet supported [1].

There are two workarounds:

  1. Build the images individually[2] and reference the images in the docker-compose.yml file.
  2. SSH into the portainer agent, pull the repo, and execute docker compose manually.

The second workaround raises the following warning: "This stack was created outside of Portainer. Control over this stack is limited". You can still stop/start/restart the containers and view logs, but you lose access to many Portainer features.
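
A rough sketch of both workarounds in shell form (the image tag and host are illustrative; the repo URL and Dockerfile path are from this thread):

# Workaround 1: build and tag the image yourself, then reference it
# via image: in docker-compose.yml instead of a build: directive.
git clone https://github.com/eludris/thang && cd thang
docker build -t thang-discord:latest -f discord/Dockerfile .

# Workaround 2: bypass the git integration and run compose on the host
# running the agent (the stack then shows as "limited" in Portainer).
ssh user@agent-host
git clone https://github.com/eludris/thang && cd thang
docker compose up -d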

Hopefully we can see improvements made to stack deployments via git in the near future.

1: https://portal.portainer.io/knowledge/can-i-build-an-image-while-deploying-a-stack/application-from-git
2: https://docs.portainer.io/user/docker/images/build

github-actions[bot] commented 8 months ago

This issue has been marked as stale as it has not had recent activity, it will be closed if no further activity occurs in the next 7 days. If you believe that it has been incorrectly labelled as stale, leave a comment and the label will be removed.

phiberoptick commented 8 months ago

Definitely not a stale issue. I literally ran into this last night and have spent the entire night fighting with Portainer BE. It wouldn't deploy from a git repo without selecting the relative paths option. Specifically, I was trying to deploy stable-diffusion-webui-docker via Portainer instead of the shell. Without that option selected, it wouldn't deploy, giving errors about the context or about not being able to find the build directory where the Dockerfile was.

After selecting it, it seemed to deploy and lulled me into a false sense of security. The stack even worked to an extent.

And when I went to investigate, I learned that I could no longer issue commands to the agent or to the server that was controlling the agent. They were giving 408s, connection closed by peer, timeouts, etc.

I was unable to stop the service containers running inside the stack, detach the git repo, stop the stack, or delete it via Portainer. I wasn't able to modify or manipulate images or running containers; it would always time out, across reboots as well as restarts of Docker, on both the server and the agent machine. I was also unable to remove the offending agent's Environment: same result as any other attempt to make changes.

Other than this, Portainer was quite responsive navigating around and doing anything that didn't require more than clicking through the interface.

I ended up having to remove all of the containers, images, and volumes related to that stack from the shell. At that point Portainer showed they were gone, but it still would not let me detach the git repo, stop the stack (which it said was still running), or delete it. I confirmed this behavior on the server as well as the agent: unable to commit any changes, pull any images, basically anything except browsing through the interface. I tried restarting Docker and the servers again, with the same result.

I was eventually able to regain control by purging the Portainer agent. To be thorough, I restarted the Docker services on both ends again as well.

And I was STILL unable to use the server or remove the Environment. Once Portainer finally updated to show that the Environment was unreachable, I was at last able to remove it from the server.

Afterwards, I deployed a new agent, and both server and agent have seemed fine since. I was also able to deploy the stack, but I cloned the repo and built the images first, then used those instead of building during deployment.
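
For anyone who ends up in the same state, a sketch of the shell-side cleanup described above (assumes compose labeled the resources with the stack name; mystack, the agent version tag, and mounts are placeholders based on Portainer's documented agent deploy command):

# Remove what compose created for the stuck stack, by project label
# (pulled images lack the label and may need removing by name).
docker ps -aq --filter "label=com.docker.compose.project=mystack" | xargs -r docker rm -f
docker images -q --filter "label=com.docker.compose.project=mystack" | xargs -r docker rmi -f
docker volume ls -q --filter "label=com.docker.compose.project=mystack" | xargs -r docker volume rm

# Purge and redeploy the agent.
docker rm -f portainer_agent
docker run -d -p 9001:9001 --name portainer_agent --restart=always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/docker/volumes:/var/lib/docker/volumes \
  portainer/agent:2.19.4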

I know it's not much info, but I've been subscribed to this issue since last night when I was beating my head against it. When I saw the bot comment this morning the wound was still a little tender, and I knew others were still having issues as well, so rather than just bump, I wanted to try to give some info. Or at least a decent story...

ooliver1 commented 8 months ago

Yeah, the issue I personally have is not with relative volumes or anything like that, but with the fact that the error is completely unrelated: it returns a 403 with no details.

github-actions[bot] commented 6 months ago

This issue has been marked as stale as it has not had recent activity, it will be closed if no further activity occurs in the next 7 days. If you believe that it has been incorrectly labelled as stale, leave a comment and the label will be removed.

SeeJayEmm commented 6 months ago

Still not stale.

github-actions[bot] commented 4 months ago

This issue has been marked as stale as it has not had recent activity, it will be closed if no further activity occurs in the next 7 days. If you believe that it has been incorrectly labelled as stale, leave a comment and the label will be removed.

github-actions[bot] commented 4 months ago

Since no further activity has appeared on this issue, it will be closed. If you believe that it has been incorrectly closed, leave a comment mentioning portainer/support and one of our staff will review the issue. Note: if it is an old bug report, make sure that it is reproducible in the latest version of Portainer, as it may have already been fixed.