expressjs / express

Fast, unopinionated, minimalist web framework for node.
https://expressjs.com
MIT License
65.07k stars 15.54k forks source link

Rewriting req.url in router with prefix path results in missing slash (/) #4059

Open fchu opened 4 years ago

fchu commented 4 years ago

Hello, I've been trying to use a npm module (connect-history-api-fallback) and encounter problems that after a lot of debugging I believe boils down to this issue.

When you're in a router that has a mount path, doing a rewrite on the relative root ('') path result in a missing slash, eg with bug.js as below

const express = require('express')
const app = express()
const port = 3000

var router = express.Router();
router.get('/', function (req, res, next) {
    req.url = "/index.html"
    next()
});

app.use('/app', router);

app.use(function (req, res, next) {
    console.log("Rewritten path: " + req.url)
    res.end()
});

app.listen(port, () => console.log(`Example app listening on port ${port}!`))

Running the following commands:

node bug.js
curl localhost:3000/app

results in the rewritten path to be /appindex.html instead of /app/index.html

Environment:

"node": 12.10.0
"express": 4.17.1

This sounds like a bug to me, but maybe I'm doing something wrong? I'm not too expert in expressjs to know for sure (my hunch is the router code doesn't 'remount' the prefix path properly.

Thanks!

dougwilson commented 4 years ago

Hi @fchu thanks for the report. I see what you mean, though will need to investigate further on what (if any) a fix would look like. To explain what is happening, is that your original URL is /app (no trailing /). When the /app mount is encountered, the routing system will always make sure there is some path there (as in, it doesn't let it become an empty string), so it ends up as req.url === '/' inside the router. Of course, this would translate to a potential original URL of /app/, though that is not what it was.

Then req.url is set to /index.html inside of router, as far as the upper router knows, that / at the beginning was the fake / it injected to ensure that req.url was not an empty string, and so is trimming it back away.

A quick work-around would be to change /app to /app/ in a rewrite above the /app mount.

Anyway, we will investigate further to see if there is a good solution for this issue, but just wanted to put down the above thoughts and a work-around that may get you unblocked while we work on it.

fchu commented 4 years ago

Hi @dougwilson thanks for the prompt reply. I already have a workaround in place in the form of a rewrite like you suggested, so this is not pressing, but I'll add my few thoughts as well in case it could be useful.

I think the root conceptual issue is whether a route is made of paths, or strings:

  1. if they are paths, then /app and /app/ are equivalent, and so are '' and /, and any logic for (un/)mounting route prefixes should be insensitive to trailing slashes
  2. if they are strings, then there is nothing special to slashes, and route manipulation should strictly be about concatenation of partial routes.

Now I think when the router is in strict mode, it's opting for 1, which is fine for most people. In that case though, maybe the (un/)mounting of path prefixes should employ logic equivalent to url.resolve instead of manual trimming of slashes and keeping track of that state.

Situation 2 is tricker, as I think one big challenge is that after a mounting point '' and '/' are fundamentally different endpoints, and I'm not sure to whether express can handle the different well throughout.

Right now, the router seem to follow both paradigms, if that makes sense. Hope it helps!

sarthak0906 commented 4 years ago

Is anyone working on this issue? I would like to work on it.

aelmardhi commented 1 year ago

when router adds removed back it add them here

req.url = protohost + removed + req.url.slice(protohost.length)

and the values in this example

req.url = 'index.html' // the leading slash was removed because router thenk it injected
removed = ' /app'
protohost = ''

a way to solve this is instead of concatinating strings use the node:path module to join the three parts. or create custom code to do that , but it should cover all scenarios (slashes and not). the problem of using the module is confimity with the rest of the code

also before deleting added slash we could add check that the first char is a slash