erlyaws / yaws

Yaws webserver
https://erlyaws.github.io
BSD 3-Clause "New" or "Revised" License
1.29k stars 267 forks source link

Yaws incorrectly handle path "/policies/%2f/HA" #337

Closed leoliu closed 5 years ago

leoliu commented 6 years ago

While browsing https://bugs.erlang.org/browse/ERL-636, I try it on a server with Yaws and it turns out Yaws normalise it wrong as well. Check pathinfo, server_path in Arg, which normalise /policies/%2f/HA to /policies/HA.

Thanks to essen on erlang irc for confirmation.

vinoski commented 5 years ago

I'll need more information before I can understand the problem here. As always, it's best if one provides a test case showing the problem, but if not, at least describing how to reproduce the problem is helpful.

leoliu commented 5 years ago

If I remember correctly I think it is as simple as curl -I YOURSERVER/policies/%2f/HA and on the server print out the #arg{} or specifically the pathinfo, server_path and whatnot. Here is a snippet from a server running Yaws master:

[{clisock,#Port<0.30986>},
 {client_ip_port,{{127,0,0,1},52475}},
 {req,{http_request,'HEAD',{abs_path,"/policies/%2f/HA"},{1,1}}},
 {orig_req,{http_request,'HEAD',{abs_path,"/policies/%2f/HA"},{1,1}}},
 {clidata,undefined},
 {server_path,"/policies/HA"},
 {querydata,undefined},
 {appmoddata,"policies/HA"},
 {docroot_mount,"/"},
 {cont,undefined},
 {pid,<0.23299.1>},
 {appmod_prepath,[]},
 {prepath,[]},
 {pathinfo,"/policies/HA"}]
vinoski commented 5 years ago

Great, thanks. I'll look into it more.

vinoski commented 5 years ago

Getting this particular example to work isn't difficult, but doing so breaks redirection tests, and I fear it could break a lot of existing user code since generally, Yaws decodes incoming URLs and encodes outgoing URLs. It should probably be more choosy about when it does that. Encoding isn't always correct, either, since it skips a couple problematic characters.

A proper fix for all this might require including a new setting somewhere to configure legacy URL encode/decode for users who want to keep the existing behavior. Overall, it requires some more thought, and I'd prefer to get input from @klacke and @capflam before making any decisions.

leoliu commented 5 years ago

I used to think this was non-issue i.e. edge cases that could largely be ignored.

Imagine these url structures:

/Dir A/Dir B/File Name

or

/Category A/Category B/Product Name

The texts between / are largely not under your (developer's) control.

So between standard-compliance and backward-compatibility, I'd opt for the former in this case though I am sympathetic to the latter.

vinoski commented 5 years ago

Thankfully, it's not as pervasive as I thought it was. Two areas — redirection and webdav — were both calling yaws_api:url_encode/1 a lot, the latter unconditionally on all resource names. Fixing them to not do that was easy, and the entire test suite now passes. I'll push a branch for this soon.

leoliu commented 5 years ago

BTW, I just tested the fix-337 branch it is working nicely. Thanks.

vinoski commented 5 years ago

Thanks @leoliu !