caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0
57.01k stars 3.99k forks source link

Using wildcards in the beginning and in the middle of paths has unexpected behavior #5029

Open capnspacehook opened 2 years ago

capnspacehook commented 2 years ago

Caddy version: v2.6.0-beta.3 GOOS: linux GOARCH: amd64

Found about the same time as I found #5028. If I have a path matcher that has wildcards in the beginning and in the middle of a path, expected paths do not match.

For example: path */containers/*/json does not match the request path /v1.41/containers/some_hash/json, which it probably should, or an error should probably be returned when parsing the config file.

Example Caddyfile:

http://localhost:9000 {
    log {
        output stdout
        level INFO
    }

    @dockerAPI {
        method GET
        path */containers/*/json
    }

    handle @dockerAPI {
        reverse_proxy unix//var/run/docker.sock
    }
    handle {
        respond 427
    }
}

After reading the documentation on path matchers, it came to my attention what I am trying to do might not be supported:

By request path, meaning the path component of the request's URI. Path matches are exact, but wildcards * may be used:

At the end, for a prefix match (/prefix/) At the beginning, for a suffix match (.suffix) On both sides, for a substring match (/contains/) In the middle, for a globular match (/accounts/*/info)

So I'm confused as to whether this is unsupported and an error should be returned (likely from what I'm reading) or supported and a bug.

francislavoie commented 2 years ago

Works for me :thinking:

:8882 {
    @foo path */containers/*/json
    respond @foo "YEP"
    respond "NOPE"
}
$ curl http://localhost:8882/containers/foo/json
YEP
$ curl http://localhost:8882/container/foo/json
NOPE   

Edit: Whoops, missed the version bit, sorry. Sec.

Edit2: Yeah okay path /*/containers/*/json matches /v1.41/containers/some_hash/json

I think * doesn't cross / boundaries when doing infix matching, so you do need the leading / to match. And I don't think the prefix match behaviour happens if you do have an infix, they end up all being "infix" regardless. I agree that's kinda misleading. WDYT @mholt?

mholt commented 2 years ago

Oh... interesting. Yeah, I don't think wildcards in the path matcher can combine a prefix or suffix match with an infix match. But it probably should be able to!

I'll work on that shortly.

capnspacehook commented 2 years ago

@francislavoie my bad, the example I posted above does work. This Caddyfile works to demonstrate the behavior I described above (based off of your simpler example):

http://localhost:9000 {
    @dockerAPI {
        path_regexp ^/v\d\.\d+/*
        path */containers/*/json
    }

    respond @dockerAPI "YEP"
    respond "NOPE"
}

It might be I'm using path_regexp and path together incorrectly, my thoughts were since those matchers are in a named matcher, they will be ANDed together and match URLs like /v1.41/containers/foo/json.

mholt commented 1 year ago

After looking into this some more, I'm not sure there's a quick fix -- we'd basically have to write our own globular parser, which I'm not interested in doing, especially given the use case(s) presented here, which can be matched as /*/containers/*/json currently; I think that's probably fine?

Crux of the issue is that filepath.Match() doesn't support ** patterns, and doesn't really do prefix/suffix matching that spans path components. Because of this we'd need to mutate the input strings so that it could help fit our needs, but I don't know what that mutation would be to make it work.