tokio-rs / axum

Ergonomic and modular web framework built with Tokio, Tower, and Hyper
19.17k stars 1.06k forks source link

Path parsing doesn't handle `:` nor rfc3986 `sub-delims` #2128

Closed kriswuollett closed 1 year ago

kriswuollett commented 1 year ago

The closest issue I could find is #261.

Bug Report

Version

kris@slate axum % cargo tree | grep axum
axum v0.6.16 (/Users/kris/code/kriswuollett/axum/axum)
├── axum-core v0.3.4 (/Users/kris/code/kriswuollett/axum/axum-core)
│   ├── axum v0.6.16 (/Users/kris/code/kriswuollett/axum/axum) (*)
│   ├── axum-extra v0.7.4 (/Users/kris/code/kriswuollett/axum/axum-extra)
│   │   ├── axum v0.6.16 (/Users/kris/code/kriswuollett/axum/axum) (*)
│   │   ├── axum-core v0.3.4 (/Users/kris/code/kriswuollett/axum/axum-core) (*)
│   │   ├── axum-macros v0.3.7 (proc-macro) (/Users/kris/code/kriswuollett/axum/axum-macros)
│   │   │   ├── axum v0.6.16 (/Users/kris/code/kriswuollett/axum/axum) (*)
│   │   │   ├── axum-extra v0.7.4 (/Users/kris/code/kriswuollett/axum/axum-extra) (*)
│   │   ├── axum v0.6.16 (/Users/kris/code/kriswuollett/axum/axum) (*)
├── axum-macros v0.3.7 (proc-macro) (/Users/kris/code/kriswuollett/axum/axum-macros) (*)
├── axum-macros v0.3.7 (proc-macro) (/Users/kris/code/kriswuollett/axum/axum-macros) (*)
axum-core v0.3.4 (/Users/kris/code/kriswuollett/axum/axum-core) (*)
axum-extra v0.7.4 (/Users/kris/code/kriswuollett/axum/axum-extra) (*)
axum-macros v0.3.7 (proc-macro) (/Users/kris/code/kriswuollett/axum/axum-macros) (*)
kris@slate axum % git log -1
commit 64c566cd1c7ff9458a69de2274ae4833cda82666 (HEAD -> main, origin/main, origin/HEAD)
Author: tuhana <github@tuhana.me>
Date:   Thu Jul 27 17:03:10 2023 +0300

Platform

kris@slate axum % uname -a
Darwin slate.localdomain 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:22 PDT 2023; root:xnu-8796.121.3~7/RELEASE_X86_64 x86_64

Crates

Using git commit of axum repo at 64c566cd1c7ff9458a69de2274ae4833cda82666.

Description

I tried to capture a path segment in a URL that had another path segment with an embedded : in its name. According to RFC 3986:

 segment       = *pchar
 pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Therefore the path segment capture should not process a segment like a:b in /:param/a:b/something, and axum should provide a way to escape : if it is the leading pchar in a segment. It seems like axum isn't percent decoding the path segments as well?

I tried this code:

    #[crate::test]
    #[should_panic]
    async fn captures_dont_include_path_with_inner_colon() {
        let app = Router::new().route(
            "/:instance/namespace_a:method_b",
            get(|Path(param): Path<String>| async move { param }),
        );

        let client = TestClient::new(app);

        let res = client.get("/default/namespace_a:method_b").send().await;
        assert_eq!(res.text().await, "default"); // FAILS: extracted 2 path arguments
    }

    #[crate::test]
    #[should_panic]
    async fn captures_dont_include_path_with_inner_colon_encoded() {
        let app = Router::new().route(
            "/:instance/namespace_a:method_b",
            get(|Path(param): Path<String>| async move { param }),
        );

        let client = TestClient::new(app);

        let res = client.get("/default/namespace_a%3Amethod_b").send().await;
        assert_eq!(res.text().await, "default"); // FAILS: extracted 2 path arguments
    }

    #[crate::test]
    #[should_panic]
    async fn captures_dont_include_path_with_inner_colon_percent_encoded() {
        let app = Router::new().route(
            "/:instance/namespace_a%3Amethod_b",
            get(|Path(param): Path<String>| async move { param }),
        );

        let client = TestClient::new(app);

        let res = client.get("/default/namespace_a%3Amethod_b").send().await;
        assert_eq!(res.text().await, "default");

        let res = client.get("/default/namespace_a:method_b").send().await;
        assert_eq!(res.text().await, "default"); // FAILS: extracted 2 path arguments
    }

Instead, this happened:

thread 'extract::path::tests::captures_dont_include_path_with_inner_colon' panicked at 'assertion failed: `(left == right)`
  left: `"Wrong number of path arguments for `Path`. Expected 1 but got 2. Note that multiple parameters must be extracted with a tuple `Path<(_, _)>` or a struct `Path<YourParams>`"`,
 right: `"default"`', axum/src/extract/path/mod.rs:655:9

Notes

I believe the above should be supported, but it's the first time I'm encountering a URL with a : in a path segment, so not absolutely sure. However, I do not expect to have time to write a PR to fix at the moment if this should be fixed.

If this is supported, then perhaps some details are missing in the docs and/or test cases.

jplatte commented 1 year ago

Duplicate of #1452, blocked on https://github.com/ibraheemdev/matchit/issues/23.

kriswuollett commented 1 year ago

Duplicate of #1452, blocked on ibraheemdev/matchit#23.

Thanks for the link to the matchit issue. So all path parsing responsibility is that library? Even before seeing it I thought I could work around the issue by percent encoding. Is parsing /look!this a problem in matchit too? It doesn't look like to me that ! needs to be percent encoded. See the test cases I added in #2129.

segment       = *pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

...

sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
davidpdrsn commented 1 year ago

Thanks for the link to the matchit issue. So all path parsing responsibility is that library?

Yep. All that is on matchit. Axum can't do anything about that.

kriswuollett commented 1 year ago

Thanks for the link to the matchit issue. So all path parsing responsibility is that library?

Yep. All that is on matchit. Axum can't do anything about that.

I made a PR for matchit, see https://github.com/ibraheemdev/matchit/pull/34. Sub-delim matching seems to work there in the current version except for * there, at least as far as their tests/tree.rs tests go, e.g., /sd!here for a path:

kris@slate matchit % git log -2
commit 47ec320f6a8a51fcb39af66b6d658ce443586dd6 (HEAD -> sub-delims-test, origin/sub-delims-test)
Author: Kristopher Wuollett <kris@appbiotic.com>
Date:   Fri Jul 28 13:24:52 2023 -0500

    Add sub-delim path tests

    Excluded `*` since it is known to not be supported yet, see #23.

commit 617e7450c62dcb1fc28ebd3ae5920e83da3a53e1 (upstream/master, origin/master, origin/HEAD, master)
Author: Ibraheem Ahmed <ibraheem@ibraheem.ca>
Date:   Mon May 22 20:38:18 2023 -0400

    update wildcard docs
kris@slate matchit % cargo test --all-targets --all-features                               
    Finished test [unoptimized + debuginfo] target(s) in 0.21s
     Running unittests src/lib.rs (target/debug/deps/matchit-167ba8d0bd0fe61e)

running 4 tests
test params::tests::ignore_array_default ... ok
test params::tests::no_alloc ... ok
test params::tests::heap_alloc ... ok
test params::tests::stack_alloc ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/tree.rs (target/debug/deps/tree-94a600691b508ccd)

running 27 tests
test catchall_root_conflict ... ok
test backtracking_tsr ... ok
test catchall_static_overlap1 ... ok
test catchall_static_overlap2 ... ok
test catchall_off_by_one ... ok
test catchall_static_overlap3 ... ok
test blog ... ok
test child_conflict ... ok
test catchall_static_overlap ... ok
test double_params ... ok
test duplicates ... ok
test double_overlap_tsr ... ok
test invalid_catchall ... ok
test invalid_catchall2 ... ok
test basic ... ok
test double_overlap ... ok
test issue_12 ... ok
test issue_22 ... ok
test root_tsr ... ok
test more_conflicts ... ok
test root_tsr_static ... ok
test root_tsr_wildcard ... ok
test same_len ... ok
test unnamed_param ... ok
test tsr ... ok
test wildcard_conflict ... ok
test wildcard ... ok

test result: ok. 27 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/bench.rs (target/debug/deps/bench-c799fb547ed938ff)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Testing Compare Routers/matchit
Success
Testing Compare Routers/path-tree
Success
Testing Compare Routers/gonzales
Success
Testing Compare Routers/actix
Success
Testing Compare Routers/regex
Success
Testing Compare Routers/route-recognizer
Success
Testing Compare Routers/routefinder
Success

     Running unittests examples/hyper.rs (target/debug/examples/hyper-543ff5f9354bfbb4)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
davidpdrsn commented 1 year ago

Are you saying it does work in matchit but not axum?

kriswuollett commented 1 year ago

Are you saying it does work in matchit but not axum?

Probably? Just from the tests I through together it looks like something like /this!here works in the matchit test cases I wrote, but didn't in the relevant #2129 test case here. Just wanted to verify while I still interested in the issue. :-)

Not absolutely sure since I threw together those tests fairly quickly without reading much code.

nickray commented 1 year ago

This would be great if fixable, as Google API guidelines recommend paths such as /some/path/to/resource:customMethod for "custom methods" which are not the usual CRUD: https://google.aip.dev/136.