uNetworking / uWebSockets

Simple, secure & standards compliant web server for the most demanding of applications

Ability to replace HTTP route handler (regression? from v16) #1109

Closed: lostrepo closed this issue 3 years ago

lostrepo commented 3 years ago
  1. Linux kernel 5.7.X
  2. uWebSockets.js 18.13.X
  3. Node 12.18.X

In v16-ish versions of uWebSockets.js I could replace an HTTP route handler with a new one at any time. In v18-ish it no longer works. Just wondering if that feature will return in the future. I use it to live-reload gzipped file buffers cached in memory when they are changed/deleted/etc.
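For context, a minimal sketch of that live-reload pattern, assuming fs.watch and zlib.gzipSync from Node's standard library and a hypothetical ./public/index.html file; re-registering the same .get route is the replacement behavior that worked in v16-ish:

const uws = require('uWebSockets.js');
const fs = require('fs');
const zlib = require('zlib');

const app = uws.App();

// Register (or re-register) a route that serves one pre-gzipped buffer from memory.
function registerRoute(route, filePath) {
  const gzipped = zlib.gzipSync(fs.readFileSync(filePath)); // hypothetical file path
  app.get(route, (res, req) => {
    res.writeHeader('Content-Encoding', 'gzip').end(gzipped);
  });
}

registerRoute('/index.html', './public/index.html');

// Re-register with a freshly gzipped buffer whenever the file changes.
fs.watch('./public/index.html', () => registerRoute('/index.html', './public/index.html'));

app.listen(3000, () => {});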

Test script (assumes the uWebSockets.js versions are located in the local node_modules folder). Sorry for the messy code.

/*
    Test ability to replace HTTP route handler function
    ExitCode: 0 - success, 1 - failure
 */

'use strict';
var
cl  =   console.log.bind(console),
uws =   {
// don't forget to place v16 binaries into v16.uWebSockets.js folder or comment out that line
    v16ish: require('./node_modules/v16.uWebSockets.js/uws.js').App(),
    v18ish: require('./node_modules/uWebSockets.js/uws.js').App()
},
portStart           =   3000,
route               =   '/test',
portReplaceString       =   '@port@',
requestUrl          =   'http://localhost:'+ portReplaceString + route,
firstRouteHandlerResponse   =   'First route response',
replacedRouteHandlerResponse    =   'Replaced route response',
toTest              =   0,
exitCode            =   0;

for (var k in uws){
    var port    =   portStart + toTest++;

    uws[k]
    // setup first route
        .get(route, function(resp, req){
            return resp.end(firstRouteHandlerResponse);
        })
    // setup replaced route
        .get(route, function(resp, req){
            return resp.end(replacedRouteHandlerResponse);
        })
    // listen
        .listen(port, function(listenSocket){
            if (listenSocket){
                cl('Listening on port: '+ this);
            } else {
                cl('Failed to listen on port: '+ this);
                exitCode    =   1;
            }
        }.bind(port));

    // test that we get replaced route
    // listen should succeed before request attempt timeout
    // otherwise something wrong with listen
    require('http').get(requestUrl.replace(portReplaceString, port), function(resp){
        var data    =   [],
        version     =   this;

        resp.on('data', function(chunk){
            data.push(chunk);
        });

        resp.on('end', function(){
            if (data.join('') === replacedRouteHandlerResponse){
                cl('Route replace success for version '+ version);
            } else {
                cl('Route replace failure for version '+ version);
                exitCode    =   1;
            }

            if (--toTest == 0){
                process.exit(exitCode);
            }
        });
    }.bind(k));
}

setTimeout(function(){
    cl('Test timeout');
    process.exit(1);
}, 5000);

Running:

node route.replace.test.js && echo 'success' || echo 'failure'

Results in:

Listening on port: 3000
Listening on port: 3001
Route replace success for version v16ish
Route replace failure for version v18ish
failure
hst-m commented 3 years ago

I use it to live-reload gzipped file buffers cached in memory when they are changed/deleted/etc.

You should not need to replace the route for this; use the first route, but inside the handler function add code to serve the new files when they change.

hst-m commented 3 years ago

It should be as simple as this:

const uws = require('uWebSockets.js')

let file = 'version1'

uws.App().get('/', (res, req) => {
  res.end(file)
}).listen(3000, () => {})

// file changed
file = 'version2'
lostrepo commented 3 years ago

@hst-m It's just a nice-to-have for me. If Alex Hultman says it won't be brought back, that's fine. I'll try to add it in C++ with my crippled hands; I already have some unpopular planned work in C++ land anyway.

  1. Replacing routes gives more room for JS optimization (pre-computed Status+Headers buffers, fewer chained function calls, etc.). I'm trying to make the base JS as fast as possible, since extra JS on top only adds overhead.
  2. I'm also planning to add hot-reloading of server modules, which will work better with route replacement than with full instance re-initialization (see the sketch below; I still have one tricky directory-watch case to handle before that).
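
A minimal sketch of what point 2 could look like, assuming a hypothetical ./routes/test.js module that exports a handler factory; the re-registration relies on the v16-ish replacement behavior discussed in this issue:

// Hypothetical sketch: reload a route module and re-register its handler.
// Assumes ./routes/test.js exports a factory function returning an HTTP handler.
function reloadRoute(app, route, modulePath) {
  const resolved = require.resolve(modulePath);
  delete require.cache[resolved];          // drop the stale module from Node's require cache
  const buildHandler = require(resolved);  // re-evaluate the module from disk
  app.get(route, buildHandler());          // re-register, replacing the old handler (v16-ish behavior)
}

// Usage, e.g. from a directory watcher callback:
// reloadRoute(app, '/test', './routes/test.js');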
hst-m commented 3 years ago

I also have files being watched and serve the latest file (for faster development, etc.). If your only argument were performance (1), I would say that with my example you have zero performance loss, including setting up headers, etc. But if you have other reasons like (2), you would need to ask @alexhultman. Figured I would pitch in in case you were missing something simple like the example.

ghost commented 3 years ago

Adding/removing routes has never been a documented feature, but maybe it should be. I think it could be relatively simple to fix this.

lostrepo commented 3 years ago

Alex Hultman, great! While trying to resolve some routing issues I found a comment in the sources hinting that, back in the day, routes were replaced by new ones. I wasn't sure, but some tests confirmed my assumptions.

@hst-m I've used the proposed approach before; it has lower flexibility and lower performance. Just to compare the flexibility and the varying hot execution paths available with the route replacement feature:

// ...Ab - JS ArrayBuffer
const ifNoneMatchAb = ...;
const notModifiedStatusAndHeadersAb = ...;

// I use normal functions since they had better performance than arrow functions a long time ago in V8;
// I didn't check whether arrow functions have since been optimized to the same level,
// so I'm playing it safe in these examples.

// Imagine that we do this in a loop for every changed route, so we can avoid useless guessing in the hot path.
// If Exists
(function(route, etag, respAb, statusAndHeadersAb){
  app.get(route, function(resp, req){
    if (etag == req.getHeader(ifNoneMatchAb)){
      // NotModified
      return resp.writeStatus(notModifiedStatusAndHeadersAb).end();
    }

    // Found
    return resp.writeStatus(statusAndHeadersAb).end(respAb);
  });
})('/test', etag, respAb, statusAndHeadersAb);

// If NotFound
(function(route, respAb, statusAndHeadersAb){
  app.get(route, function(resp){
    return resp.writeStatus(statusAndHeadersAb).end(respAb);
  });
})('/test', respAb, statusAndHeadersAb);

// If Redirect
(function(route, statusAndHeadersAb){
  app.get(route, function(resp){
    return resp.writeStatus(statusAndHeadersAb).end();
  });
})('/test', statusAndHeadersAb);

// If your condition
// Replace the route handler with one from your handler generator at any time,
// so you can pre-compute data that you know shouldn't be computed in the hot path.
// No need to make any checks that you don't need.
// And so on.

// After the changes we call the garbage collector so we have less JS trash hanging around.

It's the old memory footprint vs. runtime computation tradeoff. Since Node.js is weak in the latter, I prefer to pay with memory when a hot path exists in the code. In JS a hot path shouldn't have any function calls (I learned that the hard way optimizing an audio recorder with canvas visualization for some ancient iPhone running a modern-enough Safari; hell exists). But code becomes unmaintainable without them; some Node.js flags help a bit, but it still adds up when you add logging, custom rate limiting based on memory/CPU/RPS metrics, crazy business logic, your own mistakes, etc.
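
As for the manual garbage collector calls mentioned above, a small hedged sketch, assuming Node is started with the --expose-gc flag so that global.gc() exists:

// Run with: node --expose-gc server.js
function collectGarbage() {
  if (typeof global.gc === 'function') {
    global.gc(); // guarded, so this is a no-op when --expose-gc wasn't passed
  }
}

// e.g. call collectGarbage() right after replacing routes and dropping old buffers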

hst-m commented 3 years ago

@lostrepo How many files are you setting up routes for like this? If you have hundreds or thousands of files, that is a large number of functions being created/re-created and tracked by the GC.

Object allocations, particularly function object allocations due to the heavy amount of internal data needed to implement them, can be taxing to performance. Allocated objects are not simply just sitting in memory but the garbage collector is constantly looking for unused objects so that they can be deallocated. The more memory you use in JavaScript the more CPU is being used to power the garbage collector and less CPU becomes available to run actual code

You could have a single function handling it:

const uws = require('uWebSockets.js')

const files = {}

const app = uws.App()

app.get('/public/:file', (res, req) => {
  const file = req.getParameter(0)
  if (!files[file]) return res.writeStatus('404 Not Found').end()
  if (files[file].redirect) return res.writeStatus('301 Moved Permanently').writeHeader('Location', files[file].redirect).end()
  if (files[file].etag == req.getHeader('if-none-match')) return res.writeStatus('304 Not Modified').end()
  res.writeHeader('ETag', files[file].etag).end(files[file].content)
})

app.listen(3000, () => {})

require('fs').watch('directory', (event, file) => {
  // update files[file] here
})

The req.getParameter(0) call takes a very small amount of time and the if statements are practically instant, while you use only a tiny amount of memory and no extra functions are created and destroyed. I think being able to update the HTTP routes is potentially helpful, but I don't know whether creating unique functions for every possible file, and for every state of those files, is good for performance. I've also read that arrow functions have the same performance as normal functions.
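
One hedged way to fill in the watcher body from the example above, reusing its files map and its 'directory' placeholder, and assuming the {content, etag, redirect} shape implied by the handler, a SHA-1 hash as a simple ETag, and synchronous reads for brevity; if the cached buffers were gzipped as in the original use case, the handler would also need a Content-Encoding: gzip header:

const fs = require('fs');
const path = require('path');
const crypto = require('crypto');

const dir = 'directory'; // placeholder directory name from the example above

fs.watch(dir, (event, file) => {
  const fullPath = path.join(dir, file);
  if (!fs.existsSync(fullPath)) {
    delete files[file]; // treat a missing file as deleted
    return;
  }
  const content = fs.readFileSync(fullPath);
  files[file] = {
    content,
    // quoted per the ETag format; the handler compares it against the if-none-match header
    etag: '"' + crypto.createHash('sha1').update(content).digest('hex') + '"'
  };
});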

lostrepo commented 3 years ago

@hst-m

  1. A bit less than 500 'static' routes in the test VM. Maybe I'll run 50k-route tests later.
  2. I'm using V8 flags to deal with the dumb GC, plus scheduled and manual calls to it in code.
  3. As I wrote before, I prefer to trade memory for better hot-path performance.
  4. Good, I'll look into how arrow functions were optimized and probably test how they perform in my case.

I think I've gone off topic here already.

ghost commented 3 years ago

I don't care about this for performance reasons; the idea I have is better integration with the SNI support. I have opened a new issue for that.