noodlefrenzy / node-amqp10

amqp10 is a promise-based, AMQP 1.0 compliant node.js client
MIT License
134 stars 56 forks source link

can not read property sendframe of undefined #237

Closed hamzaanjum closed 8 years ago

hamzaanjum commented 8 years ago

I am using the Azure IoT NodeJS Sdk, and the receiver is crashing my application while attempting to reattach. It crashes after about 2 days of running. I have even added manual refreshing of the amqp client every 10 minutes and disconnecting the old one, and now it managed to run for about 5 days, but then crashed with the same error.

node error

I am using the exact code from this tutorials https://azure.microsoft.com/en-us/documentation/articles/iot-hub-node-node-getstarted/#receive-device-to-cloud-messages

mbroadst commented 8 years ago

@hamzaanjum hey, sorry you're having this issue! We have notoriously had issues with the reconnect/reattach code because they are incredibly difficult to track down. It would be incredibly helpful if it was possible for you to run a session with debugging information enabled (DEBUG=amqp* as an environment variable) and try to isolate the conditions which caused the connection to fail, and thus lead to this particular error.

hamzaanjum commented 8 years ago

It has been almost 2 weeks that the service is running on debug mode without any issues, using the following command on command prompt (app is built on express framework). node debug ./bin/www

Waiting for it to crash before I run the session with your asked way of enabling debugging information

mbroadst commented 8 years ago

@hamzaanjum yeah, as you can see these issues are incredibly difficult to pinpoint as they happen to infrequently (and only with Azure to boot). Please keep us up to date if/when you run into this again, we certainly want to make sure it doesn't happen again.

hamzaanjum commented 8 years ago

@mbroadst is there a way to write all this log with debugging information enabled (DEBUG=amqp* as an environment variable) to a file? I have tried the npm start >> outputfile command, but it only writes those lines, that I am logging. Because I am on a windows machine, and I can't scroll much on command prompt, and only the smallest fraction of the log would be retrievable at the end.

mbroadst commented 8 years ago

@hamzaanjum in looking at this article https://support.microsoft.com/en-us/kb/110930 it looks like your windows shell supports general posix formatting for this. Try out npm start 1> outputfile 2>&1

hamzaanjum commented 8 years ago

@mbroadst yeah, it works. Though it would be a pretty huge file by the time it crashes :D

mbroadst commented 8 years ago

@hamzaanjum unfortunately yeah 😳 I can give you an address to upload the file to, it should zip down to a reasonable size? You might also be able to determine yourself what parts of the log are relevant as they should differ around the error site.

hamzaanjum commented 8 years ago

@mbroadst I'll surely upload the file, if I won't be able to figure out the problem. But right now, the main problem is, that last time it crashed after almost 3.5 weeks :(

hamzaanjum commented 8 years ago

Here is the last section of the log just before the crash, before this, the heartbeat was working just fine.

Wed, 06 Jul 2016 22:51:54 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:54 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:55 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:55 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:55 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:55 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:57 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:57 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:57 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:57 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:57 GMT amqp10:connection Rx: 0000001102000000005316c00403434140
Wed, 06 Jul 2016 22:51:57 GMT amqp10:session processing frame:  @detach(22) [handle=0 closed=true error=null]
Wed, 06 Jul 2016 22:51:57 GMT amqp10:framing sending frame:  @detach(22) [handle=0 closed=true error=null] : 0000001102000001005316c00403434140
Wed, 06 Jul 2016 22:51:57 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from ATTACHED to DETACHING due to detachReceived
Wed, 06 Jul 2016 22:51:57 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from DETACHING to DETACHED due to detached
Wed, 06 Jul 2016 22:51:57 GMT amqp10:session detached(iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}): null
Wed, 06 Jul 2016 22:51:57 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from DETACHED to REATTACHING due to reattach
Wed, 06 Jul 2016 22:51:57 GMT amqp10:link Attempting to reattach iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}
Wed, 06 Jul 2016 22:51:57 GMT amqp10:link attach CH=1, Handle=0
Wed, 06 Jul 2016 22:51:57 GMT amqp10:framing sending frame:  @attach(18) [name='iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}' handle=0 role=true sndSettleMode=2 rcvSettleMode=0 source=@source(40) [address='iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/0' durable=0 expiryPolicy='session-end' timeout=0 dynamic=false dynamicNodeProperties={} distributionMode=null filter={ 'apache.org:selector-filter:string': 
   DescribedType {
     descriptor: 'apache.org:selector-filter:string',
     value: 'amqp.annotation.x-opt-enqueuedtimeutc > 1467845357283' } } defaultOutcome=null outcomes=null capabilities=null] target=@target(41) [address='localhost' durable=0 expiryPolicy='session-end' timeout=0 dynamic=false dynamicNodeProperties={} capabilities=null] unsettled={} incompleteUnsettled=false initialDeliveryCount=1 maxMessageSize=10000 offeredCapabilities=null desiredCapabilities=null properties={}] : 000001b002000001005312d0000001a00000000ea170696f746875622d656875622d636c6f7564676174652d31343834382d343362616534356566362f436f6e73756d657247726f7570732f2464656661756c742f506172746974696f6e732f305f35623133633332382d323335302d343236302d386666662d343132653036396363306138434150025000005328c0e60ba14b696f746875622d656875622d636c6f7564676174652d31343834382d343362616534356566362f436f6e73756d657247726f7570732f2464656661756c742f506172746974696f6e732f3043a30b73657373696f6e2d656e644342c1010040c17f02a3216170616368652e6f72673a73656c6563746f722d66696c7465723a737472696e6700a3216170616368652e6f72673a73656c6563746f722d66696c7465723a737472696e67a135616d71702e616e6e6f746174696f6e2e782d6f70742d656e71756575656474696d65757463203e2031343637383435333537323833404040005329c02007a1096c6f63616c686f737443a30b73657373696f6e2d656e644342c1010040c101004252018000000000000027104040c10100
Wed, 06 Jul 2016 22:51:58 GMT amqp10:connection Rx: 0000000802000000
Wed, 06 Jul 2016 22:51:58 GMT amqp10:framing:reader Heartbeat frame: 0000000802000000
Wed, 06 Jul 2016 22:51:58 GMT amqp10:connection Rx: 0000001202000000005316c00503520141400000000f02000000005317c00201400000000f02000000005318c0020140
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session processing frame:  @detach(22) [handle=1 closed=true error=null]
Wed, 06 Jul 2016 22:51:58 GMT amqp10:framing sending frame:  @detach(22) [handle=1 closed=true error=null] : 0000001202000001005316c0050352014140
Wed, 06 Jul 2016 22:51:58 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from ATTACHED to DETACHING due to detachReceived
Wed, 06 Jul 2016 22:51:58 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from DETACHING to DETACHED due to detached
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session detached(iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}): null
Wed, 06 Jul 2016 22:51:58 GMT amqp10:link iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}:{partitionid}: Transitioning from DETACHED to REATTACHING due to reattach
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session processing frame:  @end(23) [error=null]
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session Transitioning from MAPPED to END_RCVD due to endReceived
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session Transitioning from END_RCVD to UNMAPPED due to sendEnd
Wed, 06 Jul 2016 22:51:58 GMT amqp10:framing sending frame:  @end(23) [error=null] : 0000000f02000001005317c0020140
Wed, 06 Jul 2016 22:51:58 GMT amqp10:session Session unmapped - force-detaching all links.
Wed, 06 Jul 2016 22:51:58 GMT amqp10:client unmapped
Wed, 06 Jul 2016 22:51:58 GMT amqp10:connection Transitioning from OPENED to CLOSE_RCVD due to closeReceived
Wed, 06 Jul 2016 22:51:58 GMT amqp10:framing sending frame:  @close(24) [error=null] : 0000000f02000000005318c0020140
Wed, 06 Jul 2016 22:51:58 GMT amqp10:client disconnected
Wed, 06 Jul 2016 22:51:58 GMT amqp10:connection Transitioning from CLOSE_RCVD to DISCONNECTED due to sendClose
Wed, 06 Jul 2016 22:51:58 GMT amqp10:link Attempting to reattach iothub-ehub-{hubname}-{someids}/ConsumerGroups/$default/Partitions/{somepartitionrelatedid}
Wed, 06 Jul 2016 22:51:58 GMT amqp10:link attach CH=undefined, Handle=1
C:\node\solosync\node_modules\amqp10\lib\link.js:107
  this.session.connection.sendFrame(attachFrame);
                         ^

TypeError: Cannot read property 'sendFrame' of undefined
    at ReceiverLink.Link.attach (C:\node\solosync\node_modules\amqp10\lib\link.js:107:26)
    at ReceiverLink.Link._attemptReattach (C:\node\solosync\node_modules\amqp10\lib\link.js:231:8)
    at Timeout._onTimeout (C:\node\solosync\node_modules\amqp10\lib\link.js:219:10)
    at tryOnTimeout (timers.js:224:11)
    at Timer.listOnTimeout (timers.js:198:5)

npm ERR! Windows_NT 6.3.9600
npm ERR! argv "C:\\Program Files\\nodejs\\node.exe" "C:\\Program Files\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "start"
npm ERR! node v5.10.0
npm ERR! npm  v3.8.3
npm ERR! code ELIFECYCLE
npm ERR! solosync@0.0.0 start: `node ./bin/www`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the solosync@0.0.0 start script 'node ./bin/www'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the solosync package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     node ./bin/www
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs solosync
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls solosync
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     C:\node\solosync\npm-debug.log
mbroadst commented 8 years ago

@hamzaanjum thank you very much for the report! I think I see where the issue is in our code, but it still seems like the remote side shouldn't be disconnecting us this way. I'll try to put together a unit test later to emulate what happened here and hopefully we can close this once and for all.

mbroadst commented 8 years ago

fixed in 3a7272d307f3ef19ff01ccbfe1db1536b627592e, and released in 3.2.2

hamzaanjum commented 8 years ago

thanks, updated the sdk, and started the service again. Let's test it for a final time :)

mbroadst commented 8 years ago

@hamzaanjum I hope so! Thanks for sticking with us on this one, it was a particularly nasty race condition. I'll leave the ticket open and you can close it when you verify the fix works

hamzaanjum commented 8 years ago

@mbroadst Its been a full month, and the service is still running, so I guess it won't cause any trouble in the future. Thanks ...