Just to simplify the problem, here is a minimal example file:

```js
const { Worker, isMainThread } = require('worker_threads')

if (isMainThread) {
  const worker = new Worker(__filename)
  worker.on('error', e => console.log('ERROR', e))
  setInterval(() => console.log('main is running'), 1000)
} else {
  require('uWebSockets.js').App()
  //.listen(3001, s => {})
  asd() // deliberately undefined: throws a ReferenceError in the worker
}
```
That works correctly: the last line throws an error, the main thread catches it and continues running, and the worker exits. But if you uncomment the `.listen()` line, throwing the error after listening causes an "uv loop open handles" process crash. Replacing `asd()` with `process.exit()` does the same thing.
In the previous issue https://github.com/uNetworking/uWebSockets.js/issues/324 the fix was to call `uws.us_listen_socket_close(socket)` before closing the thread, which prevents the error. So to work around this issue you need to add the following in the worker thread (not the main thread):

```js
process.on('uncaughtException', e => {
  uws.us_listen_socket_close(listenSocket)
  wsSet.forEach(ws => ws.close())
  process.exit(1)
})
```

See https://github.com/uNetworking/uWebSockets.js/discussions/575 for how to close the websockets.
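For context, a fuller sketch of where `listenSocket` and `wsSet` would come from in the worker; the names are illustrative, not part of the uWS API:

```js
const uws = require('uWebSockets.js')

let listenSocket = null
const wsSet = new Set() // hypothetical registry of open websockets

uws.App()
  .ws('/*', {
    open: ws => wsSet.add(ws),
    close: ws => wsSet.delete(ws)
  })
  .listen(3001, s => { listenSocket = s })

process.on('uncaughtException', e => {
  console.error(e)
  if (listenSocket) uws.us_listen_socket_close(listenSocket)
  wsSet.forEach(ws => ws.close()) // close remaining sockets so the uv loop can drain
  process.exit(1)
})
```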
Can this be fixed so the user does not have to add this in? Perhaps via `AddEnvironmentCleanupHook`?
I don't really see this as an issue - if you leave sockets open and the process exits, you are going to have polls remaining in the event loop. It could be solved by adding a `us_socket_context_close` that simply nukes all open sockets (listen sockets, sockets, client sockets) instantly. But then you get a lot of close events emitted during shutdown. Maybe that is cleaner, I don't know.
`App.close()` can call `us_socket_context_close` on all socket contexts, and that `App.close()` can be called from JS inside that `uncaughtException` handler - but really, why even catch `uncaughtException`? It's clearly an issue in the code and should "crash".
Unhandled exceptions are errors in code. In statically typed languages you can't even compile a program that doesn't catch exceptions (Java, for instance). Sure, C++ defaults to aborting, but even there you have `noexcept` functions that cannot throw, so the idea of properly handling exceptions is widespread. JavaScript, being a script language, can do anything it pleases at any point in time, so of course there are no strict rules, but it is still a programming flaw.
To make it clear - I like that you have examples that can show the issue. But this is a cosmetic thing in my opinion.
I agree that unhandled exceptions are programmer mistakes. In case the worker runs into such an exception, the process is going to exit anyway, because there's probably no good way to recover. It'd be better not to crash on exit, but it's not a deal-breaker by any means.
> App.close() can call us_socket_context_close on all socket contexts and that App.close() can be called from JS inside that uncaughtException

I thought you could add whatever is needed inside `AddEnvironmentCleanupHook` instead of `uncaughtException`, closing whatever sockets are open, etc.
Probably so, yes
@hst-m I find using cluster instead of worker threads a viable option. In your sample code from #324 only one worker is used by the cluster. When I tried to add two or more, the rest exit with `Error: write EPIPE`. Do you have an example where you can have a lot of workers?
Catching exceptions inside workers is good practice, but I'd like to learn more about using cluster vs worker threads. @hst-m, what do you like about cluster?
Here is the same thing in each:

Cluster:

```js
const cluster = require('cluster')

if (cluster.isMaster) {
  // no-op handlers keep the master alive while the workers shut down
  const shutdown = sig => {}
  process.on('SIGINT', shutdown); process.on('SIGTERM', shutdown)
  require('os').cpus().forEach(c => cluster.fork())
} else {
  let listenSocket, uws = require('uWebSockets.js'), port = 3000
  uws.App().get('/', res => res.end(`worker ${cluster.worker.id}`))
    .listen(port, s => { listenSocket = s; s && console.log(`worker ${cluster.worker.id} listening on ${port}`) })
  const shutdown = sig => {
    uws.us_listen_socket_close(listenSocket) // stop accepting, then exit cleanly
    cluster.worker.kill()
  }
  process.on('SIGINT', shutdown); process.on('SIGTERM', shutdown)
}
```
Worker Thread:

```js
const { Worker, isMainThread, threadId, parentPort } = require('worker_threads')

if (isMainThread) {
  // ask every thread to shut itself down on SIGINT/SIGTERM
  const shutdown = sig => workers.forEach(w => w.postMessage('shutdown'))
  process.on('SIGINT', shutdown); process.on('SIGTERM', shutdown)
  const workers = require('os').cpus().map(c => new Worker(__filename))
} else {
  let listenSocket, uws = require('uWebSockets.js'), port = 3000
  uws.App().get('/', res => res.end(`thread ${threadId}`))
    .listen(port, s => { listenSocket = s; s && console.log(`thread ${threadId} listening on ${port}`) })
  parentPort.on('message', message => {
    if (message == 'shutdown') {
      uws.us_listen_socket_close(listenSocket) // close the listen socket before exiting
      process.exit(0)
    }
  })
}
```
@hst-m Thanks, it looks good. I need to understand the differences between cluster and worker threads. I have a feeling cluster is more mature, but I'm not sure.
Cluster uses processes; workers are threads. With worker threads you have only one process, so a crash ends all workers.

I think the main benefit of threads is that they are lighter to create and can share data between them faster. So if you are sending lots of data between workers, maybe you should use threads. Just know that a crash will crash the whole thing, though you can catch errors in JS to prevent a crash in either the main or a worker thread.
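As an illustration of the fast-sharing point, a minimal sketch of two threads viewing the same memory through `SharedArrayBuffer` - plain `worker_threads` usage, nothing uWS-specific:

```js
const { Worker, isMainThread, workerData } = require('worker_threads')

if (isMainThread) {
  // both threads see the same 4 bytes: no copying, no serialization
  const shared = new SharedArrayBuffer(4)
  const counter = new Int32Array(shared)
  new Worker(__filename, { workerData: shared })
  setInterval(() => console.log('count:', Atomics.load(counter, 0)), 1000)
} else {
  const counter = new Int32Array(workerData)
  setInterval(() => Atomics.add(counter, 0, 1), 100)
}
```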
I'm using redis to distribute data between threads and servers. One error keeps happening at night when the server is running under small load - basically connected clients are only sending pings.
```
Error in master Error [ERR_UNHANDLED_ERROR]: Unhandled error. ('Invalid access of closed uWS.WebSocket/SSLWebSocket.')
    at Worker.emit (events.js:304:17)
    at Worker.EventEmitter.emit (domain.js:467:12)
    at Worker.[kOnErrorMessage] (internal/worker.js:251:10)
    at Worker.[kOnMessage] (internal/worker.js:261:37)
    at MessagePort.
```
In the test environment, this unhandled exception is caught in master. Not sure why the socket closes, as the ping is used to keep the sockets alive.
```
uv loop at [0x7000124c0af8] has open handles:
[0x105765960] poll (active)
        Close callback: 0x0
        Data: 0x0
[0x105653350] poll (active)
        Close callback: 0x1055561a0 close_cb_free [.../node_modules/uWebSockets.js/uws_darwin_x64_83.node]
        Data: 0x0
[0x10474d260] poll (active)
        Close callback: 0x0
        Data: 0x0
uv loop at [0x7000124c0af8] has 3 open handles in total
node[42120]: ../src/debug_utils.cc:322:void node::CheckedUvLoopClose(uv_loop_t*): Assertion `0 && "uv_loop_close() while having open handles"' failed.
 1: 0x1012d96a5 node::Abort() (.cold.1) [/usr/local/bin/node]
 2: 0x1000a6309 node::Abort() [/usr/local/bin/node]
 3: 0x1000a6171 node::Assert(node::AssertionInfo const&) [/usr/local/bin/node]
 4: 0x10003117e node::CheckedUvLoopClose(uv_loop_s*) [/usr/local/bin/node]
 5: 0x1001457c6 node::worker::WorkerThreadData::~WorkerThreadData() [/usr/local/bin/node]
 6: 0x1001421a1 node::worker::Worker::Run() [/usr/local/bin/node]
 7: 0x100145952 node::worker::Worker::StartThread(v8::FunctionCallbackInfo
```
@hst-m What is the best way to send data between threads? If all the clients disconnect, I assume the workers would crash, correct? If so, it would be nice to have a dummy client in master that connects to each worker and keeps it alive even when there are no real clients. If a browser becomes unresponsive it will stop sending pings, and this might close all sockets and create these exceptions.
> What is the best way to send data between threads?

Read the whole worker threads docs page - there are two-way message channels and the new broadcast channels, and also `SharedArrayBuffer`.
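For instance, a minimal sketch of two-way messaging over the built-in `parentPort` channel (a dedicated `MessageChannel` or a `BroadcastChannel` works along the same lines):

```js
const { Worker, isMainThread, parentPort } = require('worker_threads')

if (isMainThread) {
  const worker = new Worker(__filename)
  worker.on('message', msg => console.log('from worker:', msg))
  worker.postMessage({ cmd: 'ping' })
} else {
  parentPort.on('message', msg => {
    // messages are structured-cloned: objects are copied across threads
    if (msg.cmd === 'ping') parentPort.postMessage({ cmd: 'pong' })
  })
}
```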
> One error keeps happening 'Invalid access of closed uWS.WebSocket'. If all the clients disconnect I assume the workers would crash, correct?

You need to attach a handler to the ws `close` event that stops your code from trying to access the websocket after it closes. There is no reason for a crash if clients disconnect.
@hst-m Thank you. I will read everything first.
As for the crash, right now I have an empty close handler:

```js
).ws('/*', {
  ....
  , close: (ws, code, message) => {
  }
})
```
Should I add inside the close handler something like `ws.isClosed = true` and then check `ws.isClosed` before using `ws`?
> Should I add inside the close handler something like ws.isClosed = true and then check before using ws if ws.isClosed?

You could, but keeping references to the `ws` object will keep it in memory, wasting resources, when really you should be deleting all references to the ws object and letting it garbage collect. Remove it from whatever list you are using to track ws objects and just don't call it anymore. You must be tracking the ws object in some list, so remove it from the list instead of setting `ws.isClosed = true`. If you need to keep it around temporarily for some reason, then that's ok.
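A minimal sketch of that pattern, assuming a plain `Set` as the tracking structure (the `sockets` name is illustrative):

```js
const uws = require('uWebSockets.js')

const sockets = new Set() // hypothetical registry of live connections

uws.App().ws('/*', {
  open: ws => sockets.add(ws),
  close: ws => sockets.delete(ws) // drop the reference so the ws can be GC'd
}).listen(3000, () => {})

// a later broadcast only touches sockets that are still open
function broadcast(data) {
  sockets.forEach(ws => ws.send(data))
}
```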
@hst-m I'm not keeping any references to ws. I use `app.publish` to send data periodically inside a `setInterval`. I only use `ws` when a client connects, inside the `open` handler, to send the initial data. Clients send data and `app.publish` is used again to broadcast it right away. I'm assuming a socket closes and `app.publish` attempts to send data on the closed socket.
No, `app.publish` would not throw that `Invalid access of closed uWS.WebSocket` error, so you must be doing something else. Just above that error in the log you posted should be the exact code that caused it - look up that code and see what it is doing; you cut it out of your post.

Maybe catching in the master thread is preventing it from showing the code that caused it - remove the master thread error catch and let it throw in the worker? Because your error looks different, like it's caught in master and not showing the error like my test, unless you cut out the code part.
Okay, let me try it. I will remove the error handler in the master and try again. The issue happens at night - not sure how to reproduce it.

This is the handler in master: `process.on('uncaughtException', e => { console.log('Error in master', e, new Date().toString()) })`
Before I added the error handler in master, I just got this error in the worker:
```
uv loop at [0x7000124c0af8] has open handles:
[0x105765960] poll (active)
        Close callback: 0x0
        Data: 0x0
[0x105653350] poll (active)
        Close callback: 0x1055561a0 close_cb_free [.../node_modules/uWebSockets.js/uws_darwin_x64_83.node]
        Data: 0x0
[0x10474d260] poll (active)
        Close callback: 0x0
        Data: 0x0
uv loop at [0x7000124c0af8] has 3 open handles in total
node[42120]: ../src/debug_utils.cc:322:void node::CheckedUvLoopClose(uv_loop_t*): Assertion `0 && "uv_loop_close() while having open handles"' failed.
 1: 0x1012d96a5 node::Abort() (.cold.1) [/usr/local/bin/node]
 2: 0x1000a6309 node::Abort() [/usr/local/bin/node]
 3: 0x1000a6171 node::Assert(node::AssertionInfo const&) [/usr/local/bin/node]
 4: 0x10003117e node::CheckedUvLoopClose(uv_loop_s*) [/usr/local/bin/node]
 5: 0x1001457c6 node::worker::WorkerThreadData::~WorkerThreadData() [/usr/local/bin/node]
 6: 0x1001421a1 node::worker::Worker::Run() [/usr/local/bin/node]
 7: 0x100145952 node::worker::Worker::StartThread(v8::FunctionCallbackInfo
```
Before these errors I only see a database error, which is handled in `connection.on('error', () => {})`. The call to the db is made from inside the ws `message` handler. It could be that a request to the db never completed and at some point that error is generated, though I have a try-catch around the db call.
OK, I just set up a similar test to try this. I can't get it to print the exact code location causing the `Invalid access` error from inside the worker thread, but just know you are doing something invalid with the ws object after it closed - it should be easy to find. Post a full mini example here if you need help. It would help if this issue of open sockets/handles were fixed so it doesn't crash all the other threads (`AddEnvironmentCleanupHook`).

You could remove the master thread and run your worker as a single thread, and it should show you the exact code throwing the invalid access error.
@hst-m Thank you. I will try tonight to find out if the same error occurs. I removed the handler in master; hopefully it will show where it's coming from. If not, I will start digging. It looks like the mysql pool connection is throwing errors and somehow creating the crash. Maybe a callback arrives so late that the ws is already closed. Not sure how to create a small test for this, as it has a lot of pieces: `open` uses ws, `message` uses the db and ws to fetch and return data and publishes to redis, and the redis subscriber uses `app.publish`.

@hst-m Good point. I'll do that - just use one thread and try to find where it is.
@hst-m How do you handle the case where you have a really long callback and the data comes back after the ws closed?

I guess that would be a case where `ws.isClosed = true` in the ws `close` handler would work, so that you don't call `ws.send()` or any other ws function when the callback returns.
@hst-m Yeah, I thought this could be the fix. I'm running a bunch of stuff and I guess sometimes a request is dropped. I can definitely hear the cpu running :)

@hst-m I have my hopes up, will know tomorrow. Thanks again.
> Should I add inside the close handler something like ws.isClosed = true and then check before using ws if ws.isClosed?

Philosophy of JavaScript - don't care about garbage, throw it on the ground until the world collapses.

Philosophy of C++ - take care of garbage the moment it becomes garbage to avoid wading knee deep in shit:
Nice pictorial illustrations, but we are talking about a split second between when the async call gets made and when it returns, while the `ws` object is still in memory - not too much garbage to complain about. Also, the ws object might still be needed when the async call returns, so there is no point collecting it. Maybe if your C++ actually cleaned up its sockets/handles when the thread exited, we wouldn't have this issue open.
Just a little more won't do any harm 😂
Here is a picture of your C++ after not cleaning up sockets and handles and crashing every other thread in the program
Hahaha damn boy
I think my preferred setup right now is to use both Cluster and Worker Threads: I use Cluster to manage and launch different versions of the app, with one process per version running at a time, leaving the old process running until the new process is ready. Each process can then launch worker threads for each cpu if needed, and if something crashes I can still have Cluster launch a new process.
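A rough sketch of that shape, assuming one app process that spawns a thread per cpu and a master that reforks on crash (version management is omitted):

```js
const cluster = require('cluster')
const { Worker, isMainThread } = require('worker_threads')

if (cluster.isMaster) {
  cluster.fork()
  // if the app process dies, launch a replacement
  cluster.on('exit', () => cluster.fork())
} else if (isMainThread) {
  // inside the app process: one worker thread per cpu
  require('os').cpus().forEach(() => new Worker(__filename))
} else {
  // the actual server runs in each thread
  require('uWebSockets.js').App()
    .get('/', res => res.end('ok'))
    .listen(3000, () => {})
}
```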
Impressive pics, I love the last one as it represents sophisticated infrastructure. Great setup!
ws.isClosed saved the day!
> ws.message uses db&ws to fetch and return data

Ya, this part - you said the message event does some async then sends back data with `ws.send()`. So for anything async like that, where the ws could have closed in the meantime, you would need to set `ws.ab = true` in the close handler and then check it: `!ws.ab && ws.send()`. OR use a try/catch block to catch the ws.send error: `try { ws.send() } catch (e) {}`.
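Putting both options together, a minimal sketch of the guard pattern (the `ab` flag follows the naming above; `queryDb` is a hypothetical async stand-in for the db call):

```js
const uws = require('uWebSockets.js')

// hypothetical async db call
const queryDb = async q => ({ rows: [] })

uws.App().ws('/*', {
  message: async (ws, message, isBinary) => {
    const q = Buffer.from(message).toString() // copy: the ArrayBuffer is only valid synchronously
    const result = await queryDb(q)
    // option 1: flag check - skip the send if the socket closed meanwhile
    if (!ws.ab) {
      // option 2: try/catch - in case the socket closes between check and send
      try { ws.send(JSON.stringify(result)) } catch (e) {}
    }
  },
  close: ws => { ws.ab = true } // mark the socket as closed
}).listen(3000, () => {})
```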
Yep. I like to avoid try and catch as much as possible in early development, and then add try/catch at the end around bigger pieces to take into account everything the code could've missed.

One other thing: I find that clients disconnect during the night although I send a ping every 50 seconds and keepAlive is 60. Should the server be sending a pong back to keep the client alive? If, say, the client script becomes unresponsive, is there a way to refresh it, like a window reload?
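For what it's worth, a WebSocket endpoint that receives a ping answers with a pong automatically per the protocol, so the server should not need to send pongs by hand; the knob on the uWS side is the `idleTimeout` in the ws behavior object. A minimal sketch:

```js
const uws = require('uWebSockets.js')

uws.App().ws('/*', {
  idleTimeout: 60, // seconds without traffic before uWS closes the socket
  message: (ws, message, isBinary) => {}
}).listen(3000, () => {})
```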
> I'm trying to use the library in a worker thread, but I can't seem to do a graceful shutdown when the worker dies due to an uncaught exception

And

> Yep. I like to avoid try and catch as much as possible in early development

Might be related? No? I would say go fix your app and don't expect software to magically solve your mistakes "gracefully".
@alexhultman Yeah, absolutely, looking at it from a js perspective.
> Might be related? No? I would say go fix your app and don't expect software to magically solve your mistakes "gracefully".

You quoted 2 different people :P
Haha, that perspective is dominating :)

Kidding aside, js is a cookie eater. I'm thinking of switching to C++ for version 2.
Ideally you don't have any errors, but if the thread exits it would be nice not to drag the whole process down with it. I did more testing, and doing as mentioned above with `uws.us_listen_socket_close(listenSocket)` won't prevent the crash - after a websocket opens there is still some handle/poll open. uWS could clean up whatever is open in `AddEnvironmentCleanupHook`, but maybe that is difficult to do; I don't know how it works.

I think a call to ws.send should just be dropped for broken handles, but yeah, maybe that would look like spaghetti inside the C++.

This issue does not require any errors: if you just want to exit the thread with `process.exit(0)`, you can't do it without a crash after websockets have been opened, etc. So this talk about handling errors is not relevant - you still need a way to clean up the thread if you want to exit.
I'm trying to use the library in a worker thread, but I can't seem to do a graceful shutdown when the worker dies due to an uncaught exception. I've tried as many ways to handle this case as I could find. I'm not sure if this is a mistake on my end - maybe I'm missing something.

I'm using Node 15.10.0 and uWS.js 18.14.0 on Win10. Here's the code:
The call to `performance.now()` is not valid, because `performance` must be imported from `perf_hooks`, so it will throw a ReferenceError. The uncaught exception is handled by `cleanup`, but at this point the worker is already terminated, so it can't respond to the `cleanup` message. Here's the output:
I looked through the issues, and it is similar to #324. Could this be the same problem?