pillarjs / send

Streaming static file server with Range and conditional-GET support
MIT License
796 stars 188 forks source link

Filenames with periods/dots using extensions option #166

Closed js-choi closed 6 years ago

js-choi commented 6 years ago

Currently, the extensions option “is skipped if the requested file already has an extension” (readme).

I’d like to request adding a boolean option such as alwaysCheckExtensions, by default false, such that the extensions is never skipped, even if the requested filename contains periods/dots.

There are many cases where a filename may contain multiple dots, of which only the last would delimit the file extension. For instance, I want to use hyphens to separate most words in my HTML documents’ filenames, but I also want to delimit the different versions of the same HTML document using periods, such as understanding-cats.en-us.html and understanding-cats.en-uk.html. I would want to be able to serve these files as understanding-cats.en-us and understanding-cats.en-uk. People do similar things with differently sized versions of the same image (icon-256px versus icon-1028px) or for original versus minified versions of the same script (jquery versus jquery.min).

I ran into this problem using express.static / serve-static, which depends on and delegates its options to this library.

dougwilson commented 6 years ago

The extensions options is only designed to server extension-less URLs. Your example URLs do have extensions (the lang code), which is why it doesn't work for your use-case.

Based on your use-case description I wouldn't even use the extensions option at all. Remember it would require an extra fs stats ca for every single request if you do that, and stat calls use the Node.js thread pool so you can quickly bottleneck your app with excessive thread pool work.

If all the files have .html extension, just add the extension in the string you pass to send and it will serve the file in a single stat :+1: .

js-choi commented 6 years ago

Thanks for the reply, and thanks for creating this library! Minimizing calls to fs.stat is definitely a sensible goal for performance.

Having said that, I’m a little confused by the explanation. It is the extensions option in general that causes multiple fs.stat calls. As far as I can tell, a hypothetical alwaysTryExtensions option would not cause more fs.stat calls, at least usually.

What I’m essentially asking for is that the && !extname(path) in line 722 be replaced with something like && (this._alwaysTryExtensions || !extname(path)), where this._alwaysTryExtensions is set by opts.alwaysTryExtensions or false by default. I could make a pull request to show what I mean, if it would help.

Using the extensions option with n extensions already involves a maximum of n+1 calls to fs.stat per HTTP request: one call for the raw path and up to n recursive calls for the raw path concatenated with each item of this._extensions. And the average number of fs.stat calls per request depends on how early an actual request would short-circuit the list of extensions on average.

For instance, if this._extensions was ["extension1", "extension2"], then a request for a path without periods /no-periods-1-0 would try /no-periods-1-0, and if that failed then /no-periods-1-0.extension1, and if that failed then /no-periods-1-0.extension2. However, a request for a path with periods /yes-periods-1.0 would only try /yes-periods-1.0.

In this case, the maximum number of fs.stat calls per request for /no-periods-1-0 is 3. And—if the probability that a file exists at the actual path /no-periods-1-0 is 10%, the probability that a file exists at /no-periods-1-0.extension1 is 60%, and the probability that a file exists at /no-periods-1-0 is 30%—then the expected average number of fs.stat calls per /no-periods-1-0 request is 10% · 1 + 60% · 2 + 30% · 3, which is 2.2. This is already the case in the current version of send.

The maximum and average numbers of fs.stat calls per request for /yes-periods-1.0 would be 1. However, if the probabilities that a file exists at /yes-periods-1.0, /yes-periods-1.0.extension1, and /yes-periods-1.0.extension2 are also 10%, 60%, and 30%, then that means that users won’t be able to request and get the file 90% of the time.

Setting alwaysTryExtensions to be true would mean that the maximum and average numbers of fs.stat calls per /yes-periods-1.0 request would be 3 and 2.2, just like it already is for /no-periods-1-0 requests.

Setting alwaysTryExtensions wouldn’t dramatically increase the fs.stat bottleneck. It is already the case that multiple fs.stat calls per request occur when using extensions. With alwaysTryExtensions, the maximum number of fs.stat calls per request would still be n+1 (where n is the number of items in this._extensions).

…Unless I’m misunderstanding something about the source code, of course.

In the end, I want to emulate Nginx’s try_files directive such that I can serve files similarly to try_files $uri $uri.js $uri.css $uri.png $uri.jpg $uri.gif $uri.svg $uri.html. I want to allow my server’s users to have extensionless URLs while not limiting them from using periods in their filenames, like Nginx’s try_files also allows. As far as I can tell, an alwaysTryExtensions option would allow this without increasing calls to fs.stat any more than they would have had to anyway.

Thanks again for the reply, and thank you for creating this wonderful library! Hopefully you’ll be able to clarify further your worry about extra fs.stat calls with alwaysTryExtensions option that do not already occur with extensions.