AspenWeb / pando.py

Filesystem dispatch + Simplates + Python = a nice web framework.
http://aspen.io/

// gives homepage #170

Open chadwhitacre opened 11 years ago

chadwhitacre commented 11 years ago

After running make doc, I would expect http://localhost:5370// to give me a 404, but instead it gives me the homepage.

sigmavirus24 commented 11 years ago

Some websites do that. I wouldn't say that this is a huge issue.

pjz commented 11 years ago

So, two conflicting takes on this:

1) RFC 2396 says that the URI path separator is a single slash.
2) The POSIX path definition says that "Multiple successive slashes are considered to be the same as one slash".

Since we stand at an intersection of the two, I vote that we go with whatever's implemented, which at the moment seems to be the second option.
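To make the two readings concrete, here is a minimal sketch in Python. The helper names `posix_view` and `uri_segments` are made up for illustration and are not part of Aspen; `posixpath.normpath` is used only as a stand-in for the POSIX rule.

```python
import posixpath


def posix_view(path):
    """Collapse repeated slashes the way a POSIX filesystem would.

    Note: normpath also resolves "." and ".." and drops trailing slashes.
    """
    return posixpath.normpath(path)


def uri_segments(path):
    """Split a URI path on every slash, keeping empty segments."""
    return path.split("/")[1:]  # drop the piece before the leading "/"


print(posix_view("/foo//bar"))    # /foo/bar
print(uri_segments("/foo//bar"))  # ['foo', '', 'bar']
```

One wrinkle: POSIX leaves the meaning of exactly two leading slashes implementation-defined, and `posixpath.normpath` preserves them, so `posixpath.normpath("//")` returns `"//"` rather than `"/"`.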

chadwhitacre commented 11 years ago

Interesting. A twist: someone registered the empty string as their username on Gittip. I would expect this to show me their profile:

https://www.gittip.com//

bruceadams commented 11 years ago

Treating double slashes as a single slash in URLs is very common practice. I've seen it in a bunch of places (mostly from bad code requesting URLs with repeated slashes and never getting fixed because it didn't break anything). It appears to be the default behavior of both Apache and Nginx and I'm pretty sure I've seen this behavior from IIS (I can't think of a quick way I can check that right now).

Playing with URLs in my existing browser tabs, I see that Google, DuckDuckGo and others ignore the repeated slashes, while GitHub gives a 404.

pjz commented 11 years ago

Okay, so the issue I see with this is: what if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Is there an implied empty string after every /, and does it override the fallback paths? Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.) This seems orthogonal to the issue of // vs /, but I think it's related enough that if we answer it, it might give us a clue how to answer the // vs / problem.

lyndsysimon commented 11 years ago

It appears that Flask also treats multiple successive slashes as a single slash, for what it's worth.

pjz commented 11 years ago

I think this is fine as-is. If you really want to differentiate you can make a wildcard sptfile.

chadwhitacre commented 11 years ago

Doesn't seem right to me. http://www.example.com// should be 404.

chadwhitacre commented 11 years ago

I want https://www.gittip.com// to match %username with path['username'] set to ''.
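A rough sketch of what that could look like, assuming a dispatcher that matches segment by segment and keeps empty segments instead of collapsing them. The `match` function and the `/%username/` pattern are hypothetical and are not Aspen's actual dispatcher; they only borrow the `%username` wildcard naming from this thread.

```python
def match(pattern, path):
    """Match a URL path against a pattern, one segment at a time.

    Segments starting with "%" are wildcards (hypothetical convention).
    Empty segments are kept, so "//" and "/" are different paths.
    Returns a dict of captured values, or None when nothing matches.
    """
    pattern_parts = pattern.split("/")[1:]
    path_parts = path.split("/")[1:]
    if len(pattern_parts) != len(path_parts):
        return None
    captured = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith("%"):
            captured[expected[1:]] = actual  # wildcard: accept anything, even ''
        elif expected != actual:
            return None
    return captured


print(match("/%username/", "/alice/"))  # {'username': 'alice'}
print(match("/%username/", "//"))       # {'username': ''}
print(match("/%username/", "/"))        # None
```

Under these assumptions `//` reaches the wildcard with an empty username, while `/` has one segment too few and falls through to whatever handles the homepage.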

chadwhitacre commented 11 years ago

Because some schmoe changed their username on Gittip to the empty string, and I want to be like, "Sure! Go ahead!" :-)

pjz commented 11 years ago

What if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Is there an implied empty string after every /, and does it override the fallback paths? Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.)

pjz commented 11 years ago

If the only way to 'catch' that kind of filename is with a wildcard, I think we shouldn't do it.

chadwhitacre commented 11 years ago

is there an implied empty string after every / ?

No, there's an actual empty string between every //. :-)

bruceadams commented 11 years ago

Treating // as something different than / goes against de facto standards on the web. I'd love to find an RFC that speaks to this. (I have not yet found one.)

chadwhitacre commented 11 years ago

Found this slightly-related reference while working on #195:

The "/" character may be used within HTTP to designate a hierarchical structure.

http://www.ietf.org/rfc/rfc1738.txt

chadwhitacre commented 11 years ago

Also see "HIERARCHICAL FORMS" in http://www.ietf.org/rfc/rfc1630.txt:

      The slash ("/", ASCII 2F hex) character is reserved for the
      delimiting of substrings whose relationship is hierarchical.  This
      enables partial forms of the URI.  Substrings consisting of single
      or double dots ("." or "..") are similarly reserved.

      The significance of the slash between two segments is that the
      segment of the path to the left is more significant than the
      segment of the path to the right.  ("Significance" in this case
      refers solely to closeness to the root of the hierarchical
      structure and makes no value judgement!)

      Note

         The similarity to unix and other disk operating system filename
         conventions should be taken as purely coincidental, and should
         not be taken to indicate that URIs should be interpreted as
         file names.

pjz commented 11 years ago

Sigh. Write me some failing tests on an issue170 branch and I'll see about making the dispatcher work correctly.

pjz commented 9 years ago

...so if autoindex is on, should a request for // give a 404, or should it give autoindex('//') -> autoindex('/')? There's no possible way to make it give anything else without a wildcard simplate (%foo.spt), as you can't make a directory with an empty-string name.

And what if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Logically I think /.spt would override /index.html.spt, since the latter is a 'fallback' and the former is 'more particular'.

Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.)

Also, since Aspen mostly mimics the filesystem, I suspect people will be surprised when http://example.com/foo/bar != http://example.com/foo//bar

pjz commented 9 years ago

@whit537 ping. Design opinions needed.

chadwhitacre commented 8 years ago

Another example: https://gratipay.com/about//stats is currently 404.

chadwhitacre commented 8 years ago

Discussing on https://github.com/AspenWeb/salon/issues/8 (at about 40 minutes?) ... let's redirect // to / in an algorithm function.
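As a minimal sketch of that redirect idea: the function below computes where a request path with repeated slashes would be sent. The name `redirect_target` is hypothetical, and this is not the signature of an actual Aspen algorithm function; wiring it into the request chain and issuing the redirect response are left out.

```python
import re


def redirect_target(path):
    """Return the collapsed path if `path` has repeated slashes, else None.

    Only the path portion of the URL should be passed in; a query string
    may legitimately contain "//" and must be left alone.
    """
    collapsed = re.sub(r"/{2,}", "/", path)
    return collapsed if collapsed != path else None


print(redirect_target("/about//stats"))  # /about/stats
print(redirect_target("//"))             # /
print(redirect_target("/about/stats"))   # None
```

Returning None for an already-clean path lets the caller skip the redirect and carry on with normal dispatch.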