noodlefrenzy / node-amqp10

amqp10 is a promise-based, AMQP 1.0 compliant node.js client
MIT License

stream is null in frames.js/writeFrame/line56 #303

Open sebastus opened 7 years ago

sebastus commented 7 years ago

My client has multiple ReceiverLinks attached to a number of event hub partitions. After listening and seeing 5 seconds of silence across all hubs (each incoming message resets a 5-second timeout), I enumerate the clients and delete them. Could this be a race condition? In any case, 'stream' is not checked for null in this method before it is written to.
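As an illustration of the kind of guard being asked for, here is a minimal sketch. The helper name safeWriteFrame and the stand-in stream object are hypothetical and are not part of amqp10; the sketch only shows the general idea of checking the stream before writing, not how the library does or should implement it.

```javascript
'use strict';

// Hypothetical helper, not part of amqp10: refuse to write when the
// underlying stream has already been torn down.
function safeWriteFrame(stream, buffer, callback) {
  if (!stream || typeof stream.write !== 'function') {
    const err = new Error('cannot write frame: stream is closed or missing');
    if (typeof callback === 'function') {
      // Report asynchronously so callers see consistent callback timing.
      process.nextTick(callback, err);
    }
    return false;
  }
  stream.write(buffer, callback);
  return true;
}

// Usage with a stand-in stream object (an assumption for this example).
const written = [];
const fakeStream = {
  write(buf, cb) { written.push(buf); if (cb) cb(); },
};

// Normal case: the frame is handed to the stream.
console.log(safeWriteFrame(fakeStream, Buffer.from('frame')));

// Torn-down case: no TypeError is thrown, the callback gets an error instead.
console.log(safeWriteFrame(null, Buffer.from('frame'), (err) => {
  console.log(err.message);
}));
```

A guard like this turns the crash into an error the caller can handle, but it would not explain why the stream is null when the write is attempted.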

sebastus commented 7 years ago

Found a workaround. I was reading and printing rx_err in the error event handler of every receiver link (receiver.on('errorReceived', myErrorHandler.bind(...))). When I removed that access to rx_err, the problem with stream being null stopped recurring.

sebastus commented 7 years ago

Scratch that. It happens less often, but it still happens.

sebastus commented 7 years ago

FYI, I had a timer that was firing at odd times and deleting clients.
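For readers hitting a similar problem, one way to keep an idle timeout from firing at unexpected times is to route every reset through a single helper that always clears the previous timer. The sketch below is an assumption about the general pattern, not the reporter's code; createIdleTimer and its methods are hypothetical names.

```javascript
'use strict';

// Hypothetical helper: a single resettable idle timer. All names here are
// illustrative and not taken from the reporter's code.
function createIdleTimer(timeoutMs, onIdle) {
  let handle = null;
  let stopped = false;

  function reset() {
    if (stopped) return;                 // ignore late messages after shutdown
    if (handle !== null) clearTimeout(handle);
    handle = setTimeout(() => {
      handle = null;
      stopped = true;                    // fire at most once
      onIdle();
    }, timeoutMs);
  }

  function stop() {
    stopped = true;
    if (handle !== null) clearTimeout(handle);
    handle = null;
  }

  return { reset, stop, isPending: () => handle !== null };
}

// Usage: every message handler would call idle.reset().
const idle = createIdleTimer(5000, () => {
  console.log('no messages for 5 seconds, shutting down clients');
});
idle.reset();
idle.reset();   // replaces the pending timer rather than adding a second one
idle.stop();    // cancel so this example exits immediately
```

Because only one handle is ever outstanding, a stale timer cannot delete clients after a newer message has arrived.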

sebastus commented 7 years ago

Even after clearing up the timer problem, the error occurred once in a couple of trials. I'm listening to 19x4 partitions.

sebastus commented 7 years ago

Here's the stack dump. It's exactly the same every time. It happens once per test run, where I delete my checkpoints and listen to the 19 hubs. It happens at about the same clock time on each run.

C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\frames.js:56
  stream.write(buffer, callback);
        ^

TypeError: Cannot read property 'write' of null
    at Object.frames.writeFrame (C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\frames.js:56:9)
    at Connection.sendFrame (C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\connection.js:328:10)
    at ReceiverLink.Link.attach (C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\link.js:136:27)
    at ReceiverLink.Link._attemptReattach (C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\link.js:251:8)
    at Timeout._onTimeout (C:\github\SplunkAddonForAzureMonitorLogs\bin\app\node_modules\amqp10\lib\link.js:239:10)
    at tryOnTimeout (timers.js:232:11)
    at Timer.listOnTimeout (timers.js:202:5)
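
One possible reading of this trace is that a reattach timer fired after the connection's stream had already been torn down, so the attach frame was written to a null stream. The sketch below illustrates that scenario and one way to avoid it: cancel the pending timer on shutdown and check the connection before sending. FakeConnection and FakeLink are stand-ins invented for this example; they are not amqp10's Connection or Link classes, and this is not the fix applied in the library.

```javascript
'use strict';

// Illustrative stand-ins only; these are NOT amqp10's Connection or Link.
class FakeConnection {
  constructor() { this.stream = { write() {} }; this.sent = 0; }
  isOpen() { return this.stream !== null; }
  sendFrame(frame) {
    if (!this.isOpen()) throw new Error('connection is closed');
    this.stream.write(frame);
    this.sent += 1;
  }
  close() { this.stream = null; }
}

class FakeLink {
  constructor(connection) { this.connection = connection; this.timer = null; }

  // Schedule a reattach attempt, replacing any attempt already pending.
  scheduleReattach(delayMs) {
    this.cancelReattach();
    this.timer = setTimeout(() => {
      this.timer = null;
      this.attemptReattach();
    }, delayMs);
  }

  cancelReattach() {
    if (this.timer !== null) { clearTimeout(this.timer); this.timer = null; }
  }

  // Returns true if an attach frame was sent, false if it was skipped.
  attemptReattach() {
    if (!this.connection.isOpen()) return false;  // connection went away
    this.connection.sendFrame('attach');
    return true;
  }

  // Call this when tearing down the owning client.
  shutdown() {
    this.cancelReattach();
    this.connection.close();
  }
}

const conn = new FakeConnection();
const link = new FakeLink(conn);
link.scheduleReattach(1000);
link.shutdown();                       // cancels the pending timer
console.log(link.attemptReattach());   // skipped: connection is closed
```

The two safeguards are independent: cancelling the timer covers orderly shutdown, while the isOpen() check covers the case where the connection drops for some other reason before the timer fires.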

mbroadst commented 7 years ago

@sebastus This seems like it could be related to #295. I have a branch, issue295, where I've committed some potential fixes; if you'd care to test that branch, it might help you too. Without more information (a minimal reproducible test case or debug logs), there's little I can do to help at this point.

sebastus commented 7 years ago

I'm happy to provide my source. I don't know if it qualifies as "minimal". I'll give that branch a try, though this really isn't blocking me. I just retry and it usually works.