apache / apisix

The Cloud-Native API Gateway
https://apisix.apache.org/blog/
Apache License 2.0
14.46k stars 2.52k forks

help request: 404 when a Chinese word is used as a URI parameter #10555

Open wklken opened 11 months ago

wklken commented 11 months ago

Description

With this router configuration:

apisix:
    router:
        http: 'radixtree_uri_with_parameter'

register a route like /api/test/prod/cn/:word/

Then request it with a percent-encoded Chinese parameter:

$ curl "http://0.0.0.0:6006/api/test/prod/cn/%E4%B8%AD%E6%96%87/"

This returns a 404.

After setting the log level to debug and adding some log statements:

2023/11/28 02:36:19 [info] 1859#1859: *115647 [lua] radixtree_uri_with_parameter.lua:69: match(): route match mode: radixtree_uri_with_parameter, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/%E4%B8%AD%E6%96%87/ HTTP/1.1", host: "0.0.0.0:6006"

# api_ctx.var.uri 
2023/11/28 02:36:19 [info] 1859#1859: *115647 [lua] radixtree_uri_with_parameter.lua:71: match(): /api/test/prod/cn/中文/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/%E4%B8%AD%E6%96%87/ HTTP/1.1", host: "0.0.0.0:6006"

# api_ctx.var.real_request_uri
2023/11/28 02:36:19 [info] 1859#1859: *115647 [lua] radixtree_uri_with_parameter.lua:72: match(): /api/test/prod/cn/%E4%B8%AD%E6%96%87/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/%E4%B8%AD%E6%96%87/ HTTP/1.1", host: "0.0.0.0:6006"

# api_ctx.var.request_uri
2023/11/28 02:36:19 [info] 1859#1859: *115647 [lua] radixtree_uri_with_parameter.lua:73: match(): /api/test/prod/cn/中文/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/%E4%B8%AD%E6%96%87/ HTTP/1.1", host: "0.0.0.0:6006"

# api_ctx.var.uri is what gets passed to the router:
# uri_router:dispatch(api_ctx.var.uri, match_opts, api_ctx, match_opts)
# and it does not match
2023/11/28 02:36:19 [debug] 1859#1859: *115647 [lua] radixtree.lua:497: compare_param(): path_org: /api/test/prod/cn/:word/?
2023/11/28 02:36:19 [debug] 1859#1859: *115647 [lua] radixtree.lua:498: compare_param(): pcre pat: \/api\/test\/prod\/cn\/([\w\-_;:@&=!',\%\$\.\+\*\(\)]+)\/?
2023/11/28 02:36:19 [debug] 1859#1859: *115647 [lua] radixtree.lua:503: compare_param(): req_path: /api/test/prod/cn/中文/

The same request with the raw, unencoded path:

$ curl "http://0.0.0.0:6006/api/test/prod/cn/中文/"
2023/11/28 03:33:48 [info] 1859#1859: *346048 [lua] radixtree_uri_with_parameter.lua:69: match(): route match mode: radixtree_uri_with_parameter, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/中文/ HTTP/1.1", host: "0.0.0.0:6006"
2023/11/28 03:33:48 [info] 1859#1859: *346048 [lua] radixtree_uri_with_parameter.lua:71: match(): /api/test/prod/cn/中文/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/中文/ HTTP/1.1", host: "0.0.0.0:6006"
2023/11/28 03:33:48 [info] 1859#1859: *346048 [lua] radixtree_uri_with_parameter.lua:72: match(): /api/test/prod/cn/中文/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/中文/ HTTP/1.1", host: "0.0.0.0:6006"
2023/11/28 03:33:48 [info] 1859#1859: *346048 [lua] radixtree_uri_with_parameter.lua:73: match(): /api/test/prod/cn/中文/, client: 127.0.0.1, server: _, request: "GET /api/test/prod/cn/中文/ HTTP/1.1", host: "0.0.0.0:6006"
2023/11/28 03:33:48 [debug] 1859#1859: *346048 [lua] radixtree.lua:497: compare_param(): path_org: /api/test/prod/cn/:word/?
2023/11/28 03:33:48 [debug] 1859#1859: *346048 [lua] radixtree.lua:498: compare_param(): pcre pat: \/api\/test\/prod\/cn\/([\w\-_;:@&=!',\%\$\.\+\*\(\)]+)\/?
2023/11/28 03:33:48 [debug] 1859#1859: *346048 [lua] radixtree.lua:503: compare_param(): req_path: /api/test/prod/cn/中文/

And in resty/radixtree.lua:

local function fetch_pat(path)
   ...
            -- See https://www.rfc-editor.org/rfc/rfc1738.txt BNF for specific URL schemes
            res[i] = [=[([\w\-_;:@&=!',\%\$\.\+\*\(\)]+)]=]

So why is a decoded URI passed to uri_router:dispatch when resty/radixtree.lua requires an encoded URI?
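
The mismatch can be reproduced outside the gateway. This is a minimal Python sketch, not the router's actual code path: the pattern is copied from the "pcre pat" debug line above, and it is compiled as a bytes pattern on the assumption that the router's regex treats \w as ASCII-only. The helper name matches and the full-match check are assumptions made for illustration.

```python
import re

# Pattern copied from the "pcre pat" debug line. A bytes pattern keeps \w
# ASCII-only, which is an assumption about how the router evaluates it.
PARAM_PATTERN = re.compile(
    rb"\/api\/test\/prod\/cn\/([\w\-_;:@&=!',\%\$\.\+\*\(\)]+)\/?"
)


def matches(path: str) -> bool:
    """Return True if the whole UTF-8 encoded path matches the pattern."""
    return PARAM_PATTERN.fullmatch(path.encode("utf-8")) is not None


decoded = "/api/test/prod/cn/中文/"                  # value logged for api_ctx.var.uri
encoded = "/api/test/prod/cn/%E4%B8%AD%E6%96%87/"    # value logged for real_request_uri

print(matches(decoded))  # False: the UTF-8 bytes of 中文 are outside the character class
print(matches(encoded))  # True: '%', hex digits and letters are all inside the class
```

Under these assumptions, only the percent-encoded form can satisfy the parameter pattern, which is consistent with the 404 seen when the decoded api_ctx.var.uri is dispatched.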

Environment

wklken commented 11 months ago

If I change the code to encode the URI before passing it to uri_router:dispatch, it works:

apisix/http/route.lua

function _M.match_uri(uri_router, match_opts, api_ctx)
    core.table.clear(match_opts)
    match_opts.method = api_ctx.var.request_method
    match_opts.host = api_ctx.var.host
    match_opts.remote_addr = api_ctx.var.remote_addr
    match_opts.vars = api_ctx.var
    match_opts.matched = core.tablepool.fetch("matched_route_record", 0, 4)

    -- Original call, which passes the decoded URI:
    -- local ok = uri_router:dispatch(api_ctx.var.uri, match_opts, api_ctx, match_opts)
    -- Encode the URI first so it fits the character class in resty/radixtree.lua.
    local ok = uri_router:dispatch(core.utils.uri_safe_encode(api_ctx.var.uri), match_opts, api_ctx, match_opts)
    return ok
end
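
The effect of encoding before dispatch can be sketched in Python. Here urllib.parse.quote is only a stand-in for core.utils.uri_safe_encode and may not escape exactly the same set of characters; the pattern is again the one from the debug log, compiled as an ASCII-only bytes pattern by assumption, and encode_path is a hypothetical helper name.

```python
import re
from urllib.parse import quote, unquote

# Pattern copied from the "pcre pat" debug line (ASCII-only \w is an assumption).
PARAM_PATTERN = re.compile(
    rb"\/api\/test\/prod\/cn\/([\w\-_;:@&=!',\%\$\.\+\*\(\)]+)\/?"
)


def encode_path(path: str) -> str:
    """Percent-encode a decoded path, leaving '/' separators untouched."""
    return quote(path, safe="/")


decoded = "/api/test/prod/cn/中文/"
re_encoded = encode_path(decoded)

match = PARAM_PATTERN.fullmatch(re_encoded.encode("ascii"))
word = match.group(1).decode("ascii")

print(re_encoded)     # /api/test/prod/cn/%E4%B8%AD%E6%96%87/
print(word)           # %E4%B8%AD%E6%96%87
print(unquote(word))  # 中文
```

In this sketch the captured :word value is the percent-encoded text, so code that reads the parameter afterwards would need to decode it. Whether the same holds for the real router after this change would need to be checked against its tests.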

Is this a bug or a feature?

shreemaan-abhishek commented 11 months ago

Can you raise a PR with this fix? This should be a valid fix if no other tests fail.