Thanks for opening an issue! We'll look into this.
It's not immediately clear to us what is going on, so we'll need your help to understand it better.
Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.
I've attached a template below that will help make this easier and faster! It will ask for some information you've already provided; that's OK, just fill it out the best you can. :+1:
I've also included some helpful tips below the template. Feel free to let me know if you have any questions!
Thank you again for your report, we look forward to resolving it!
## 1. Environment
### 1a. Operating system and version
```
paste here
```
### 1b. Caddy version (run `caddy version` or paste commit SHA)
```
paste here
```
### 1c. Go version (if building Caddy from source; run `go version`)
```
paste here
```
## 2. Description
### 2a. What happens (briefly explain what is wrong)
### 2b. Why it's a bug (if it's not obvious)
### 2c. Log output
```
paste terminal output or logs here
```
### 2d. Workaround(s)
### 2e. Relevant links
## 3. Tutorial (minimal steps to reproduce the bug)
Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.
Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.
Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please use simple commands such as `curl` where possible.

Example of a tutorial:

1. Create a config file:
   ```
   { ... }
   ```
2. Open terminal and run Caddy:
   ```
   $ caddy ...
   ```
3. Make an HTTP request:
   ```
   $ curl ...
   ```
4. Notice that the result is ___ but it should be ___.
The environment I encountered this issue on:

```
xcaddy build --with github.com/caddy-dns/cloudflare
```
But I've reproduced it on an Ubuntu 20.04 system with caddy v2.4.3 (installed from the dl.cloudsmith.io apt repository) with the following config:

```
hmbp.owo69.me:80 {
	log
	respond "test1"
}

*.hmbp.owo69.me:80 {
	respond "test2"
}
```
I tested with curl on my computer. Notice that only the first block has the `log` directive, but when I requested a hostname on the wildcard, it logged as well:
```
root@hmbp:/etc/caddy# caddy run --config /etc/caddy/Caddyfile
2021/07/30 05:00:17.572 INFO using provided configuration {"config_file": "/etc/caddy/Caddyfile", "config_adapter": ""}
2021/07/30 05:00:17.573 WARN input is not formatted with 'caddy fmt' {"adapter": "caddyfile", "file": "/etc/caddy/Caddyfile", "line": 2}
2021/07/30 05:00:17.574 INFO admin admin endpoint started {"address": "tcp/localhost:2019", "enforce_origin": false, "origins": ["[::1]:2019", "127.0.0.1:2019", "localhost:2019"]}
2021/07/30 05:00:17.574 INFO http server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server {"server_name": "srv0", "http_port": 80}
2021/07/30 05:00:17.575 INFO autosaved config (load with --resume flag) {"file": "/root/.config/caddy/autosave.json"}
2021/07/30 05:00:17.575 INFO serving initial configuration
2021/07/30 05:00:17.575 INFO tls.cache.maintenance started background certificate maintenance {"cache": "0xc0003173b0"}
2021/07/30 05:00:17.575 INFO tls cleaning storage unit {"description": "FileStorage:/root/.local/share/caddy"}
2021/07/30 05:00:17.576 INFO tls finished cleaning storage units
2021/07/30 05:00:32.995 INFO http.log.access handled request {"request": {"remote_addr": "8.45.42.152:54273", "proto": "HTTP/1.1", "method": "GET", "host": "hmbp.owo69.me", "uri": "/", "headers": {"User-Agent": ["curl/7.74.0"], "Accept": ["*/*"]}}, "common_log": "8.45.42.152 - - [29/Jul/2021:22:00:32 -0700] \"GET / HTTP/1.1\" 200 5", "duration": 0.000141957, "size": 5, "status": 200, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}
2021/07/30 05:00:38.579 INFO http.log.access handled request {"request": {"remote_addr": "8.45.42.152:30929", "proto": "HTTP/1.1", "method": "GET", "host": "bruh.hmbp.owo69.me", "uri": "/", "headers": {"User-Agent": ["curl/7.74.0"], "Accept": ["*/*"]}}, "common_log": "8.45.42.152 - - [29/Jul/2021:22:00:38 -0700] \"GET / HTTP/1.1\" 200 5", "duration": 0.000073358, "size": 5, "status": 200, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}
```
I'm seeing the same behavior after having upgraded from 2.4.0 to 2.4.3 (using the `-alpine` Docker image), although for me it seems to be unrelated to wildcard domains: I have some server blocks using the `log` directive and others that don't use it, but it seems as if enabling it for a single block now enables it for all blocks:
```
get.offen.dev {
	log
	reverse_proxy {
		to get:80
		header_down -Server
	}
}

offen.offen.dev, offen.frederikring.com, offen.analyticstxt.org {
	reverse_proxy offen_offen:80
}
```
Both blocks now write access logs.
EDIT: I tried reproducing this locally, but there it behaved as expected. It still happens in my deployment, though I am unsure what's causing it. This is the full Caddyfile I am using:
```
{
	email hioffen@posteo.de
	admin off
}

storage.offen.dev {
	reverse_proxy {
		to minio:9000
		header_down -Server
		header_down -X-Amz-Request-Id
	}
	header X-Forwarded-Proto {scheme}
	header X-Forwarded-Host {host}
	header Host {host}
	log
}

git.offen.dev {
	reverse_proxy {
		to gitea:3000
	}
	log
}

focalboard.offen.dev {
	reverse_proxy {
		to focalboard:8000
	}
	log
}

vault.offen.dev {
	reverse_proxy {
		to vaultwarden:80
	}
	log
}

demo.offen.dev {
	reverse_proxy {
		to minio:9000
		header_down -Server
		header_down -X-Amz-Request-Id
		header_down -X-XSS-Protection
		header_down -X-Content-Security-Policy
	}
	rewrite * /demo/demo.sh
	log
}

get.offen.dev {
	log
	reverse_proxy {
		to get:80
		header_down -Server
	}
}

offen.niefeld.com, deutsch.offen.dev {
	reverse_proxy offen_niefeld:80
}

offen.offen.dev, offen.frederikring.com, offen.analyticstxt.org {
	reverse_proxy offen_offen:80
}

docs.offen.dev, staging.offen.dev, www.offen.dev {
	map {host} {bucket} {
		docs.offen.dev "docs"
		staging.offen.dev "web-staging"
		www.offen.dev "web-production"
	}
	log
	encode gzip
	reverse_proxy {
		to minio:9000
		header_down -Server
		header_down -Content-Security-Policy
		header_down -X-Amz-Request-Id
		@notfound status 404
		handle_response @notfound {
			rewrite * /{bucket}/404.html
			reverse_proxy minio:9000 {
				@200 status 200
				handle_response @200 404
			}
		}
	}
	header {
		Permissions-Policy interest-cohort=()
	}
	@noslash {
		not expression {path}.endsWith("/")
		path_regexp ^[a-z\-/]*$
	}
	@with_query {
		not expression {query}.size() == 0
	}
	handle @noslash {
		redir @with_query {path}/?{query} permanent
		redir {path}/ permanent
	}
	@legacy_deep_dive {
		host www.offen.dev staging.offen.dev
		path /deep-dive/
	}
	handle {
		redir @legacy_deep_dive https://{host}/about/ permanent
		@document expression {path}.endsWith("/")
		route @document {
			rewrite {path}index.html
		}
		rewrite * /{bucket}{path}
	}
}

offen.dev {
	redir https://www.offen.dev{uri} permanent
}
```
Here, all server blocks will log, even the ones that do not have a `log` directive.
Ah, I see the issue.
To understand how it works, it's best to first adapt the Caddyfile to JSON with `caddy adapt --pretty`:
```
{
	"apps": {
		"http": {
			"servers": {
				"srv0": {
					"listen": [
						":80"
					],
					"routes": [
						{
							"match": [
								{
									"host": [
										"hmbp.owo69.me"
									]
								}
							],
							"handle": [
								{
									"handler": "subroute",
									"routes": [
										{
											"handle": [
												{
													"body": "test1",
													"handler": "static_response"
												}
											]
										}
									]
								}
							],
							"terminal": true
						},
						{
							"match": [
								{
									"host": [
										"*.hmbp.owo69.me"
									]
								}
							],
							"handle": [
								{
									"handler": "subroute",
									"routes": [
										{
											"handle": [
												{
													"body": "test2",
													"handler": "static_response"
												}
											]
										}
									]
								}
							],
							"terminal": true
						}
					],
					"logs": {
						"skip_hosts": [
							"*.hmbp.owo69.me"
						]
					}
				}
			}
		}
	}
}
```
See the `skip_hosts` list at the bottom; that's what Caddy uses to filter out logging from the site blocks without `log`.
Problem is, Caddy is just doing exact string comparisons for these.
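Roughly, the skip check behaves like this (an illustrative paraphrase, not Caddy's actual source):

```go
// Illustrative paraphrase, not Caddy's actual code: the request's host
// is compared byte-for-byte against each skip_hosts entry, so
// "bruh.hmbp.owo69.me" never matches the entry "*.hmbp.owo69.me".
func shouldSkipLogging(skipHosts []string, reqHost string) bool {
	for _, skip := range skipHosts {
		if reqHost == skip { // exact string comparison, no wildcard logic
			return true
		}
	}
	return false
}
```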
To fix this, we probably just need to use the same kind of logic the `host` matcher uses, which involves splitting the hosts by their `.` parts, and making `*` parts match any non-empty value.
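A minimal sketch of what that label-wise comparison could look like (a hypothetical `matchesHostPattern` helper, not the actual `host` matcher code):

```go
package main

import (
	"fmt"
	"strings"
)

// matchesHostPattern reports whether reqHost matches pattern, where a
// "*" label in pattern matches exactly one non-empty label in reqHost.
// Hypothetical sketch only; the real host matcher also handles details
// like case-insensitivity.
func matchesHostPattern(pattern, reqHost string) bool {
	patternLabels := strings.Split(pattern, ".")
	hostLabels := strings.Split(reqHost, ".")
	if len(patternLabels) != len(hostLabels) {
		return false
	}
	for i, p := range patternLabels {
		if p == "*" {
			if hostLabels[i] == "" {
				return false // wildcard requires a non-empty label
			}
			continue
		}
		if p != hostLabels[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matchesHostPattern("*.hmbp.owo69.me", "bruh.hmbp.owo69.me")) // true
	fmt.Println(matchesHostPattern("*.hmbp.owo69.me", "hmbp.owo69.me"))      // false (label count differs)
}
```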
Since this is a simple skip list though, we might still run into trouble if there's overlap, e.g. `foo.example.com` has logging enabled, but `*.example.com` doesn't, because if `*.example.com` is in the skip list, then `foo.example.com` will not get logged.
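In terms of the hypothetical helper above, that overlap would look like:

```go
// foo.example.com has `log` enabled, but *.example.com does not; if
// *.example.com ends up in the skip list, a wildcard-aware check
// would suppress foo.example.com's access logs as well.
matchesHostPattern("*.example.com", "foo.example.com") // true -> request skipped
```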
Piggybacking on this: there seems to be another issue with the config I posted above. When `adapt`ing the Caddyfile, `skip_hosts` is missing some hosts that do not have a `log` directive:
"logs": {
"skip_hosts": [
"offen.niefeld.com",
"deutsch.offen.dev",
"offen.dev"
]
}
which is missing the hostnames defined in this block:
```
offen.offen.dev, offen.frederikring.com, offen.analyticstxt.org {
	reverse_proxy offen_offen:80
}
```
Should I move this to a new issue or is it somehow related?
I looked into what's happening at code level in my case (all of this in https://github.com/caddyserver/caddy/blob/c131339c5cda3a541223fde4a714aab55de13b9a/caddyconfig/httpcaddyfile/httptype.go) and found the following:

- a server block that does not have a `log` directive can be visited before any block that does, so `srv.Logs` therefore is `nil` on this iteration and the block's hosts are never collected
- the skip list is only created once a block with a `log` directive is processed
- from then on, each host of a block without a `log` directive gets appended to the skip list

I tried fixing this (which works for my config and passes all tests) by making sure the skipped hosts always get collected, but it's really not pretty and I would guess someone who understands the logic better could come up with something nicer:
```
diff --git a/caddyconfig/httpcaddyfile/httptype.go b/caddyconfig/httpcaddyfile/httptype.go
index 559d7b76..6c51db90 100644
--- a/caddyconfig/httpcaddyfile/httptype.go
+++ b/caddyconfig/httpcaddyfile/httptype.go
@@ -442,6 +442,10 @@ func (st *ServerType) serversFromPairings(
 		srv := &caddyhttp.Server{
 			Listen: p.addresses,
 		}
+		// In certain cases, hosts to skip for logging might appear before
+		// `srv.Log` gets initialized, so these get collected preemptively
+		// and assigned to the proper config if needed.
+		var srvLogConfigSkipHosts []string
 
 		// handle the auto_https global option
 		if autoHTTPS != "on" {
@@ -661,16 +665,16 @@ func (st *ServerType) serversFromPairings(
 					}
 				}
 			}
-			if srv.Logs != nil && len(sblock.pile["custom_log"]) == 0 {
+			if len(sblock.pile["custom_log"]) == 0 {
 				// server has access logs enabled, but this server block does not
 				// enable access logs; therefore, all hosts of this server block
 				// should not be access-logged
-				if len(hosts) == 0 {
+				if len(hosts) == 0 && srv.Logs != nil {
 					// if the server block has a catch-all-hosts key, then we should
 					// not log reqs to any host unless it appears in the map
 					srv.Logs.SkipUnmappedHosts = true
 				}
-				srv.Logs.SkipHosts = append(srv.Logs.SkipHosts, sblockLogHosts...)
+				srvLogConfigSkipHosts = append(srvLogConfigSkipHosts, sblockLogHosts...)
 			}
 		}
@@ -706,6 +710,10 @@ func (st *ServerType) serversFromPairings(
 		}
 
 		srv.Routes = consolidateRoutes(srv.Routes)
 
+		if srv.Logs != nil {
+			srv.Logs.SkipHosts = srvLogConfigSkipHosts
+		}
+
```
I'd be happy to put this in a (possibly nicer) PR if it goes in the right direction.
Thank you very much for the PR, @m90 -- looks like we can close this then. Appreciate your contribution!
Actually no, @mholt: only @m90's case was fixed, not @ledlamp's, which is different.
See my comment, which describes the issue that still remains: https://github.com/caddyserver/caddy/issues/4256#issuecomment-890040651
Thanks for the discussion/clarification. @ledlamp I have created #4606 -- can you please see if that resolves the issue for you? I just haven't had a chance to test it yet.
As I understand it, access logging is disabled until you add the `log` directive to each site you want to log. However, if I add the `log` directive to any site, all my wildcard domains start being logged too, even though they are in separate blocks and I don't want them to be logged. The only exception to this is when I enable `log` (only) in the site block for an IP address (port 80); then it does not cause the wildcard domains to log. So it seems there is a bug.