biblibre / urungi

Lightweight Open Source Business Intelligence and reporting tool for PostgreSQL, MySQL, SQL Server, Oracle Database
https://urungi.org
GNU General Public License v3.0
115 stars 42 forks source link

Express / Node.js always responds with HTTP status code 200 #341

Closed a-roussos closed 1 month ago

a-roussos commented 5 months ago

Describe the bug We want to set up fail2ban in order to block unwarranted HTTP requests towards our Urungi 3.2.2 installation. Ideally, such requests would result in a 4xx HTTP response status code which can then be caught by one of fail2ban's filters. Alas, the Express / Node.js backend responsible for serving the Urungi code always returns a 200, making it impossible to distinguish between "good" and "bad" requests.

My research so far tells me that it's standard behaviour for Express / Node.js to return status code 200: javascript - Express: Is it necessary to respond with status 200? - Stack Overflow

I've also found a reference in Express' FAQ which covers the handling of 404 responses: Express FAQ - How do I handle 404 responses? node.js - How to redirect 404 errors to a page in ExpressJS? - Stack Overflow

To Reproduce Just issue the following command on the system where your Node.js process is running: curl -I http://localhost:8080/nonexistentpath ... and in the response headers you will notice an HTTP response status code of 200.

Expected behavior When you visit a non-existent URL under your Urungi installation, you should get a 404 ("Not Found") back.

Server (please complete the following information)

Client (please complete the following information)

Additional context We're observing what we consider to be a significant amount of unexpected web traffic hitting a development Urungi 3.2.2 installation we recently just set up as a Docker container on a Hetzner VPS (and have not yet publicised in any way).

When we look at our web server logs (nginx 1.25.5 acting as an HTTP/2-enabled reverse proxy, also a container) we can tell that, for the most part, the culprits are bots/scanners which attempt to connect to our public-facing IPv4/IPv6 addresses directly and then try to exploit various vulnerabilities (unrelated to Urungi). We can tackle this by adjusting the nginx configuration, making such requests result in a 444 status code ("No Response", specific to nginx) which fail2ban can detect and act upon accordingly.

However, on some occasions we've also witnessed "legitimate" requests pointing to the actual domain we set up and associated with this (brand new) development installation. As we use a non-wildcard Let's Encrypt certificate to secure our site, my theory is that pages such as https://crt.sh/ can be harvested to obtain lists of newly-registered hosts to form the basis for such an "attack".

a-roussos commented 5 months ago

The following code segments might be of interest: https://github.com/biblibre/urungi/blob/567a6d519853df45c087fff2c2deb9d996166e46/server/app.js#L69-L71 https://github.com/biblibre/urungi/blob/567a6d519853df45c087fff2c2deb9d996166e46/server/app.js#L136-L140

And here's how I have attempted to fix this (I'm not an Express / Node.js expert by any means!):

diff --git a/server/app.js b/server/app.js
index 103fef95..2cc94bc0 100644
--- a/server/app.js
+++ b/server/app.js
@@ -136,7 +136,13 @@ for (const routesModule of routesModules) {
 // Catch-all route, it should always be defined last
 app.get('*', function (req, res) {
     res.cookie('XSRF-TOKEN', req.csrfToken(), { sameSite: true });
-    res.render('index', { base: config.get('base') });
+    if ( req.originalUrl === '/' ||
+         req.originalUrl === '/login' ||
+         req.originalUrl === '/home' ) {
+        res.render('index', { base: config.get('base') });
+    } else {
+        res.sendStatus(404);
+    }
 });

 module.exports = app;
jajm commented 5 months ago

The problem here is that the backend does not know about the frontend URLs. So it renders the index page which contains the Angular.js app and let the client's browser do the rest. Doing what you suggest would prevent linking to (or reloading) pages other than the home page, ie. it will return a 404 for URLs like /dashboards/list whereas it is a valid URL

jajm commented 5 months ago

You might want to try the stop-angularjs branch which is in a work-in-progress state but is usable. It is an ongoing (though in pause at the moment) work to remove angular.js and it uses a more "traditional" way of handling URLs (no client-side routing). Unknown URLs will return a 404.

jajm commented 1 month ago

The stop-angularjs branch was merged recently. There is no release yet but the issue is fixed in master