lukeed / sirv

An optimized middleware & CLI application for serving static files~!
MIT License
1.07k stars 58 forks source link

Options to specify assets paths #44

Closed jakobrosenberg closed 4 years ago

jakobrosenberg commented 5 years ago

If an asset can not be resolved in a --single app, the whole app is requested instead.

Fix: Allow an asset prefix, so sirv treats anything within the asset path as an asset.

lukeed commented 5 years ago

How about if the incoming path doesn't have an extension, then assume it's meant to apply the fallback? Otherwise, if there an extension, assume it's being purposefully explicit and send the 404 handler instead.

jakobrosenberg commented 5 years ago

That's what I do with sites on "now", since I know it won't break anything. It could be different for other people's projects though. Not all assets may have extensions and some paths can have dot notation.

Imagine this path /users/:username and a username with a dot.

By allowing a specified path for assets, you're guaranteed that there won't be any nasty surprises. If an URL starts with ie. /assets, then you know exactly how to treat the request.

lukeed commented 5 years ago

Yeah, nevermind.

That's kinda silly tbh. Most applications have assets that belong in more than one place, so it's inviting more trouble.

Much better off just requesting the thing you want. What sirv always does (and will continue to do) is serve the asset it was asked for and then send a fallback if you had told it to

jakobrosenberg commented 5 years ago

The point of asset prefix(es)/pattern(s) is pretty much just for damage control in case of an unresolvable asset.

Assets may belong in more than one place, but

  1. Offsite assets don't matter
  2. Multiple places can still have the same root. Instead of that being workspace/public it could be workspace/public/assets
  3. in case point 2 isn't possible, just allow for a comma separated list
  4. it doesn't even have to a prefix. A pattern would be just as fine. Now uses regex as I recall.
public
--assets
----css
------utils.css
----media
------header.jpg

The current logic I assume looks like this:

If path can be resolved
  serve file
else
  serve app

The new logic would be near identical

If path can be resolved
  serve file
else if path is not whitelisted
  serve app

There's no downside to allowing patterns to be ignored in case an asset couldn't be resolved. But...

The list goes on and on and on, but you get my point.

Something as common as unresolved assets should not not have the potential to severely harm an app. Unresolved URLs that match an asset pattern should get simple 404s. Even in the case of an SPA with url-rewrite.

All that said. Thanks for the great services you provide. Gratitude doesn't come across well here, but trust me it's there. 😉

lukeed commented 5 years ago

Thank you :)

I'm leaning towards, if doing this, only looking at the --assets value if --single is enabled. When no --assets value is provided, then sirv will continue to apply everything with a fallback in "SPA mode". This is both non-breaking & easier to reason about, since as discussed above, it'd be impossible for sirv to safely assume what should & shouldn't actually exist.

The current logic is this (next):

let data = fn(pathname, extns, dir, isEtag) || isSPA && fn(fallback, extns, dir, isEtag);
if (!data) return next ? next() : isNotFound(req, res);

I'd change that to this, given the above:

let data = fn(pathname, extns, dir, isEtag) || isSPA && !isAsset(pathname) && fn(fallback, extns, dir, isEtag);
if (!data) return next ? next() : isNotFound(req, res);

For programmatic usage, this would mean that opts.assets takes a String, Function, RegExp, or an Array of the above to let you figure out if something should 404 or not.

This is a long preface to ask: Should opts.assets and --assets allow multiple prefixes? On the CLI, should we bother trying to accept RegExp patterns? That usually doesn't end well, imo

jakobrosenberg commented 5 years ago

Good point about the regexes. Even in json they can be problematic iirc. Something like matchit, fast-glob or minimatch would make more sense.

As for multiple patterns. While I personally would avoid it when possible, it makes sense to have and the internal logic is pretty much the same. A pipe or comma could be used as a delimiter. Commas are URL safe, but I doubt either will be used in a pattern.

minimatch example

Extension **/*.* Path /assets/** Combined /assets/**|**/*.*


//ballpark logic for the internals
import minimatch from 'minimatch'
const assets = options.assets.split('|')
...
function isAsset(pathname){
  return !!assets.filter(pattern => minimatch(pathname, pattern)).length
}
lukeed commented 4 years ago

Solved by #53