caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

panic runtime error: invalid memory address or nil pointer dereference (v2.8.4) #6441

Closed by oliemansm 4 months ago

oliemansm commented 4 months ago

Caddy v2.8.4 crashes periodically, currently about twice a day, even though it isn't handling much traffic. I have to restart it to get it up and running again. I didn't find any reports of this error for this version, nor any newer version I could try out.

I built the Docker image using this configuration:

FROM caddy:2.8.4-builder AS builder

WORKDIR /app
RUN xcaddy build --output caddy \
    --with github.com/corazawaf/coraza-caddy/v2@latest \
    --with github.com/trea/caddy-loki-logger

FROM caddy:2.8.4

COPY --from=builder /app/caddy /usr/bin/caddy

Stack trace:

goroutine 14 [running]:
github.com/caddyserver/certmagic.(*Cache).maintainAssets.func1()
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:50 +0x8c
panic({0x17b3a40?, 0x2c99ef0?})
    runtime/panic.go:770 +0x132
github.com/caddyserver/caddy/v2.(*filteringCore).Enabled(0x497d1d?, 0x0?)
    <autogenerated>:1 +0x1e
go.uber.org/zap.(*Logger).check(0xc00070ac80, 0x1, {0xc00535d100, 0x6fe})
    go.uber.org/zap@v1.27.0/logger.go:331 +0x6e
go.uber.org/zap.(*Logger).Warn(0x1a6819a?, {0xc00535d100?, 0xc001a21408?}, {0x0, 0x0, 0x0})
    go.uber.org/zap@v1.27.0/logger.go:254 +0x38
github.com/trea/caddy-loki-logger.LokiWriter.Write({0xc00030bb80?, 0xc00070ac80?}, {0xc001714400?, 0x19d, 0xc0005033c0?})
    github.com/trea/caddy-loki-logger@v0.3.0/writer.go:39 +0x1c5
go.uber.org/zap/zapcore.(*ioCore).Write(0xc0044f09c0, {0x0, {0xc19a30de81a33bb8, 0x13a57b49782e, 0x2cf0600}, {0xc0079929d8, 0x15}, {0x1a5cd1f, 0x20}, {0x0, ...}, ...}, ...)
    go.uber.org/zap@v1.27.0/zapcore/core.go:99 +0xb5
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc007562680, {0xc00014e600, 0x3, 0x3})
    go.uber.org/zap@v1.27.0/zapcore/entry.go:253 +0x11c
go.uber.org/zap.(*Logger).Info(0x1a25cc8?, {0x1a5cd1f?, 0x0?}, {0xc00014e600, 0x3, 0x3})
    go.uber.org/zap@v1.27.0/logger.go:247 +0x4e
github.com/caddyserver/certmagic.(*Config).updateARI(_, {_, _}, {{{0xc005560060, 0x2, 0x2}, {0x192a0a0, 0xc002b662d0}, {0x0, 0x0, ...}, ...}, ...}, ...)
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:570 +0x1afb
github.com/caddyserver/certmagic.(*Cache).RenewManagedCertificates(0xc00070a380, {0x1fea2a0, 0xc000450c80})
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:179 +0x125c
github.com/caddyserver/certmagic.(*Cache).maintainAssets(0xc00070a380, 0x0)
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:71 +0x368
created by github.com/caddyserver/certmagic.NewCache in goroutine 1
    github.com/caddyserver/certmagic@v0.21.3/cache.go:127 +0x1f6

francislavoie commented 4 months ago

Thanks for opening an issue! We'll look into this.

It's not immediately clear to us what is going on, so we'll need your help to understand it better.

Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.

I've attached a template below that will help make this easier and faster! It will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍

I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```

### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```

### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```

## 2. Description

### 2a. What happens (briefly explain what is wrong)

### 2b. Why it's a bug (if it's not obvious)

### 2c. Log output

```
paste terminal output or logs here
```

### 2d. Workaround(s)

### 2e. Relevant links

## 3. Tutorial (minimal steps to reproduce the bug)

Helpful tips

  1. Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.

  2. Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.

    • 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
    • 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
    • 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
  3. Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:

    • Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
    • Do not run Caddy inside containers.
    • Run Caddy manually in your terminal; do not use systemd or other init systems.
    • If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
    • Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
    • Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:

```
{ ... }
```

Open terminal and run Caddy:

```
$ caddy ...
```

Make an HTTP request:

```
$ curl ...
```

Notice that the result is ___ but it should be ___.

oliemansm commented 4 months ago

1. Environment

1a. Operating system and version

Alpine v3.20.1 (based on Caddy Docker image)

1b. Caddy version (run caddy version or paste commit SHA)

v2.8.4 h1:q3pe0wpBj1OcHFZ3n/1nl4V4bxBrYoSoab7rL9BMYNk=

1c. Go version (if building Caddy from source; run go version)

go version go1.22.5 linux/amd64 (from Docker image caddy:2.8.4-builder)

2. Description

2a. What happens (briefly explain what is wrong)

Caddy crashes periodically, currently about twice a day, even though it isn't handling much traffic. I have to restart it to get it up and running again. Judging by the stack trace, it happens during certificate maintenance.

2b. Why it's a bug (if it's not obvious)

I'm using a normal, correct configuration, and Caddy handles traffic just fine until, out of nowhere, it panics with the stack trace below. A failure like this shouldn't crash the entire web server; it should keep serving traffic. This seems like a high-priority bug to me.

2c. Log output

panic runtime error: invalid memory address or nil pointer dereference
goroutine 14 [running]:
github.com/caddyserver/certmagic.(*Cache).maintainAssets.func1()
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:50 +0x8c
panic({0x17b3a40?, 0x2c99ef0?})
    runtime/panic.go:770 +0x132
github.com/caddyserver/caddy/v2.(*filteringCore).Enabled(0x497d1d?, 0x0?)
    <autogenerated>:1 +0x1e
go.uber.org/zap.(*Logger).check(0xc00070ac80, 0x1, {0xc00535d100, 0x6fe})
    go.uber.org/zap@v1.27.0/logger.go:331 +0x6e
go.uber.org/zap.(*Logger).Warn(0x1a6819a?, {0xc00535d100?, 0xc001a21408?}, {0x0, 0x0, 0x0})
    go.uber.org/zap@v1.27.0/logger.go:254 +0x38
github.com/trea/caddy-loki-logger.LokiWriter.Write({0xc00030bb80?, 0xc00070ac80?}, {0xc001714400?, 0x19d, 0xc0005033c0?})
    github.com/trea/caddy-loki-logger@v0.3.0/writer.go:39 +0x1c5
go.uber.org/zap/zapcore.(*ioCore).Write(0xc0044f09c0, {0x0, {0xc19a30de81a33bb8, 0x13a57b49782e, 0x2cf0600}, {0xc0079929d8, 0x15}, {0x1a5cd1f, 0x20}, {0x0, ...}, ...}, ...)
    go.uber.org/zap@v1.27.0/zapcore/core.go:99 +0xb5
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc007562680, {0xc00014e600, 0x3, 0x3})
    go.uber.org/zap@v1.27.0/zapcore/entry.go:253 +0x11c
go.uber.org/zap.(*Logger).Info(0x1a25cc8?, {0x1a5cd1f?, 0x0?}, {0xc00014e600, 0x3, 0x3})
    go.uber.org/zap@v1.27.0/logger.go:247 +0x4e
github.com/caddyserver/certmagic.(*Config).updateARI(_, {_, _}, {{{0xc005560060, 0x2, 0x2}, {0x192a0a0, 0xc002b662d0}, {0x0, 0x0, ...}, ...}, ...}, ...)
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:570 +0x1afb
github.com/caddyserver/certmagic.(*Cache).RenewManagedCertificates(0xc00070a380, {0x1fea2a0, 0xc000450c80})
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:179 +0x125c
github.com/caddyserver/certmagic.(*Cache).maintainAssets(0xc00070a380, 0x0)
    github.com/caddyserver/certmagic@v0.21.3/maintain.go:71 +0x368
created by github.com/caddyserver/certmagic.NewCache in goroutine 1
    github.com/caddyserver/certmagic@v0.21.3/cache.go:127 +0x1f6

2d. Workaround(s)

I haven't found any workaround. I need to restart the node to get it back up and serving traffic.

2e. Relevant links

n/a

3. Tutorial (minimal steps to reproduce the bug)

I'll try to limit the example below to the most minimal config that reproduces the bug. The problem is that it doesn't crash right away, only after six hours or so, so verifying a minimal config will take time. Hopefully the stack trace and the other information are already enough for you to find a cause. I don't think it's related to the coraza or loki modules.

The DNS challenge might be related, although I don't see any reference to it in the stack trace. I'm using a custom DNS module to support wildcard domains. The DNS provider is PowerDNS-compatible with some minor differences, hence the custom module, which is based on the existing ones.

FROM caddy:2.8.4-builder AS builder

WORKDIR /app
RUN xcaddy build --output caddy \
    --with github.com/corazawaf/coraza-caddy/v2@latest \
    --with github.com/trea/caddy-loki-logger \
    --with github.com/scopisto/caddy-powerdns@v1.1.4

FROM caddy:2.8.4

COPY --from=builder /app/caddy /usr/bin/caddy

RUN apk update && apk add --no-cache git busybox-extras openrc awall

RUN mkdir -p /etc/caddy/ruleset
RUN wget https://raw.githubusercontent.com/corazawaf/coraza/main/coraza.conf-recommended \
    -O /etc/caddy/ruleset/coraza.conf
RUN git clone https://github.com/coreruleset/coreruleset /etc/caddy/ruleset/coreruleset
RUN cp /etc/caddy/ruleset/coreruleset/crs-setup.conf.example \
    /etc/caddy/ruleset/crs-setup.conf

ADD Caddyfile /etc/caddy/Caddyfile

VOLUME /etc/caddy

#
# Install firewall
#
ADD /awall-policies/* /etc/awall/optional
VOLUME /etc/iptables

VOLUME /etc/awall/optional

WORKDIR /etc/caddy

ENV LOKI_ENVIRONMENT=""
ENV POWERDNS_SERVER_ID="previder"
ENV POWERDNS_SERVER_URL="https://portal.previder.nl/api/v2/drs/domain/dns"
ENV POWERDNS_API_TOKEN=""

COPY dockerrun.sh /usr/local/bin/dockerrun
RUN chmod +x /usr/local/bin/dockerrun
CMD ["dockerrun"]

Caddyfile

# The Caddyfile is an easy way to configure your Caddy web server.
#
# Unless the file starts with a global options block, the first
# uncommented line is always the address of your site.
{
    order coraza_waf first
    order reverse_proxy after file_server

    log default {
        output loki <loki-server-url> {
            label {
                application Caddy
                environment {env.LOKI_ENVIRONMENT}
            }
        }
    }

    servers {
        metrics
    }
}

(common) {
    coraza_waf {
        load_owasp_crs
        directives `
        Include /etc/caddy/ruleset/coraza.conf
        Include /etc/caddy/ruleset/crs-setup.conf
        Include /etc/caddy/ruleset/coreruleset/rules/*.conf
        SecRuleEngine On
        # Remove unnecessary Windows rules
        SecRuleRemoveByTag platform-windows
        `
    }

    log {
        output loki <loki-server-url> {
            label {
                application Caddy
                environment {env.LOKI_ENVIRONMENT}
            }
        }
    }

    file_server /.well-known/security.txt {
        root /var/www
    }

    @blocked {
        path /actuator*
        not remote_ip <ip-addresses>
    }

    error @blocked "Unauthorized" 403
}

(headers) {
    header_down Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
    header_down X-Frame-Options "sameorigin"
    header_down X-Content-Type-Options "nosniff"
    header_down Referrer-Policy "same-origin"
}

# To use your own domain name (with automatic HTTPS), first make
# sure your domain's A/AAAA DNS records are properly pointed to
# this machine's public IP, then replace ":80" below with your
# domain name.
*.acceptatie.signalenportaal.nl, acceptatie.signalenportaal.nl {
    import common

    tls {
        dns powerdns {
            server_url {env.POWERDNS_SERVER_URL} 
            api_token {env.POWERDNS_API_TOKEN} 
            server_id {env.POWERDNS_SERVER_ID}
        }
    }

    reverse_proxy 10.34.6.172:8080 {
        import headers
    }
}

:2018 {
    metrics
}

# Refer to the Caddy docs for more information:
# https://caddyserver.com/docs/caddyfile

francislavoie commented 4 months ago

What if you comment out the loki logging? I'm not convinced this is a problem with Caddy itself.

mholt commented 4 months ago

Yep, that stack trace goes right through the loki log output module: github.com/trea/caddy-loki-logger.LokiWriter.Write

I would suggest opening an issue in @trea's repository for help. 🙂
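For readers wondering how a frame like `(*filteringCore).Enabled` at `<autogenerated>:1` can end in a nil pointer dereference, here is a minimal Go sketch of one way it can happen. It is not Caddy's or the plugin's actual code: `levelEnabler`, `wrapperCore`, `alwaysOn`, and `callEnabled` are hypothetical names chosen for illustration. The assumption is only that a struct embeds an interface, so the compiler generates a forwarding method, and that either the struct pointer or the embedded value is nil at call time.

```go
package main

import "fmt"

// levelEnabler is a hypothetical stand-in for a logging-core interface.
type levelEnabler interface {
	Enabled(level int) bool
}

// alwaysOn is a trivial implementation used for the healthy case.
type alwaysOn struct{}

func (alwaysOn) Enabled(int) bool { return true }

// wrapperCore embeds the interface. The compiler generates a promoted
// (*wrapperCore).Enabled method that forwards to the embedded value;
// such wrappers show up as "<autogenerated>" in stack traces.
type wrapperCore struct {
	levelEnabler
}

// callEnabled calls the promoted method and turns a panic into an error
// so that the failure can be inspected instead of ending the program.
func callEnabled(c *wrapperCore) (ok bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return c.Enabled(0), nil
}

func main() {
	// Case 1: the wrapper pointer itself is nil.
	_, err := callEnabled(nil)
	fmt.Println(err)

	// Case 2: the wrapper exists, but the embedded interface was never set.
	_, err = callEnabled(&wrapperCore{})
	fmt.Println(err)

	// Case 3: a fully initialized wrapper works as expected.
	ok, err := callEnabled(&wrapperCore{levelEnabler: alwaysOn{}})
	fmt.Println(ok, err)
}
```

In both failing cases the recovered value is Go's standard `runtime error: invalid memory address or nil pointer dereference`, the same message as in the report. Which of the two cases (if either) applies to the logger held by `LokiWriter` is something only the plugin's source can confirm.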

trea commented 4 months ago

Thanks for the ping and sorry for the trouble folks!

mholt commented 4 months ago

Oh no worries! Thank you for writing and maintaining a Caddy plugin!