woodpecker-ci / woodpecker

Woodpecker is a simple, yet powerful CI/CD engine with great extensibility.
https://woodpecker-ci.org
Apache License 2.0

Pipeline fails to start with "could not read Username for ..." #4241

Closed by haath 3 weeks ago

haath commented 3 weeks ago

Component

server, agent

Describe the bug

Possibly related to #1905 since I see some similar logs.

This is a fresh install, and the pipelines fail at the clone stage with:

fatal: could not read Username for https://[forgejo_domain] No such device or address
exit status 128

OAuth works fine. I can log out and back in with no issues. Only the agent running the workflow seems to be having the issue.

Steps to reproduce

The setup spans two hosts:

Host A

This host runs Forgejo, the Woodpecker server, and the Caddy proxy for TLS.

services:
  woodpecker:
    image: woodpeckerci/woodpecker-server:latest
    container_name: woodpecker
    ports:
      - 3001:8000
      - 9000:9000
    restart: unless-stopped
    networks:
      - forgejo
    extra_hosts:
      - [forgejo_domain]:172.17.0.1
    environment:
      # generic
      - WOODPECKER_OPEN=true
      - WOODPECKER_HOST=https://[woodpecker_domain]
      - WOODPECKER_ADMIN=...
      - WOODPECKER_AUTHENTICATE_PUBLIC_REPOS=true
      # forgejo
      - WOODPECKER_FORGEJO=true
      - WOODPECKER_FORGEJO_URL=https://[forgejo_domain]
      - WOODPECKER_FORGEJO_CLIENT=...
      - WOODPECKER_FORGEJO_SECRET=...
      # agent
      - WOODPECKER_MAX_WORKFLOWS=1
      - WOODPECKER_AGENT_SECRET=...
    volumes:
      - /mnt/ssd/woodpecker:/var/lib/woodpecker

  forgejo:
    image: codeberg.org/forgejo/forgejo:9.0
    container_name: forgejo
    environment:
      - USER_UID=1000
      - USER_GID=1000
      - FORGEJO__database__DB_TYPE=postgres
      - FORGEJO__database__HOST=postgres-forgejo:5432
      - FORGEJO__database__NAME=forgejo
      - FORGEJO__database__USER=forgejo
      - FORGEJO__database__PASSWD=forgejo
    restart: unless-stopped
    networks:
      - forgejo
    extra_hosts:
      - [woodpecker_domain]:172.17.0.1
    volumes:
      - /mnt/ssd/forgejo:/data
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "3000:3000"
      - "222:22"
    depends_on:
      - forgejo_db

Host B

This host is on the same LAN and runs the Woodpecker agent.

Note that 192.168.0.15 is the address of host A.

docker run -d \
        --restart unless-stopped \
        --name $NAME \
        -v /var/run/docker.sock:/var/run/docker.sock \
        --privileged \
        -v /srv/woodpecker-agent:/etc/woodpecker \
        --add-host [forgejo_domain]:192.168.0.15 \
        --add-host [woodpecker_domain]:192.168.0.15 \
        -e WOODPECKER_SERVER=192.168.0.15:9000 \
        -e WOODPECKER_AGENT_SECRET=... \
        woodpeckerci/woodpecker-agent:latest agent

Expected behavior

No response

System Info

`{"source":"https://github.com/woodpecker-ci/woodpecker","version":"2.7.1"}`

Both hosts are running Debian 12 on amd64 CPUs.

Additional context

The relevant logs when the pipeline starts:

Server logs

{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"sql: no rows in result set","time":"2024-10-22T07:40:19Z","message":"queue.Done: cannot ack workflow"}
{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"stream: not found","time":"2024-10-22T07:40:19Z","message":"done: cannot close log stream for step 38"}
{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"stream: not found","time":"2024-10-22T07:40:19Z","message":"done: cannot close log stream for step 39"}
{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"stream: not found","time":"2024-10-22T07:40:19Z","message":"done: cannot close log stream for step 40"}
{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"stream: not found","time":"2024-10-22T07:40:19Z","message":"done: cannot close log stream for step 41"}
{"level":"error","repo_id":"1","pipeline_id":"7","workflow_id":"7","error":"stream: not found","time":"2024-10-22T07:40:19Z","message":"done: cannot close log stream for step 42"}

Agent logs

{"level":"warn","repo":"haath/rust-template","pipeline":"7","workflow_id":"7","error":"rpc error: code = Unknown desc = workflow finished with error uuid=01JASJPRDXKEBT5A0V7PP73QCR: exit code 128","time":"2024-10-22T07:40:19Z","message":"cancel signal received"}

haath commented 3 weeks ago

Never mind, I had a local DNS issue.
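For anyone landing here with the same error, a quick name-resolution check on the agent host may help narrow things down. The thread does not say what the DNS problem actually was, so this is only a minimal sketch: `resolves` and `check_hosts` are hypothetical helper names, `FORGEJO_HOST` stands in for the redacted [forgejo_domain], and it assumes `getent` is available (it ships with Debian's libc-bin). One thing worth keeping in mind is that `--add-host` only writes to the `/etc/hosts` of the container it is passed to. Step containers that the agent creates through the Docker socket are separate containers with their own `/etc/hosts`, so they have to resolve the forge hostname some other way.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the name resolves through the system
# resolver (which sources are consulted depends on nsswitch.conf,
# typically /etc/hosts first and then DNS).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Hypothetical helper: print one status line per hostname argument.
check_hosts() {
    for name in "$@"; do
        if resolves "$name"; then
            echo "ok   $name"
        else
            echo "FAIL $name"
        fi
    done
}

# Usage on Host B (FORGEJO_HOST is a placeholder for the real domain):
#   check_hosts "$FORGEJO_HOST"
#
# To see what a container WITHOUT any --add-host entries resolves, which is
# closer to what a pipeline step container gets, run the same lookup in a
# throwaway container:
#   docker run --rm debian:12 getent hosts "$FORGEJO_HOST"
```

If the lookup succeeds on the host but fails in the throwaway container, the name is probably only known to the host (or to the agent container via `--add-host`), and the clone step will not be able to reach the forge under that name.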