saleyn / eixx

Erlang C++ Interface
Apache License 2.0

basic_otp_mailbox never calls handler #52

Open heri16 opened 3 years ago

heri16 commented 3 years ago

https://github.com/saleyn/eixx/blob/8b778bdc385882b48b6edb9ed257ac62cf487a18/src/test_node.cpp#L168-L175

on_msg and on_io are never called, even with commit https://github.com/saleyn/eixx/commit/5c0baa3b820b07fca3a5880d972445aee38b46c1

saleyn commented 3 years ago

Yes, apparently something got broken with message decoding in recent commits. While looking at it briefly (running test-node with -v trace), I noticed that the decoded Pids and Refs have a very large creation number. This is probably a bug somewhere.
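One thing that may be worth ruling out first: in the Erlang external term format, PID_EXT (tag 103) carries a 1-byte creation, while NEW_PID_EXT (tag 88) carries a 4-byte creation, so a large creation is not necessarily garbage if the peer sends the newer encoding. The sketch below only illustrates that width difference. `pid_creation` is a hypothetical helper, not part of eixx, and it assumes the node name is encoded as SMALL_ATOM_UTF8_EXT.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Tags from the Erlang external term format.
constexpr std::uint8_t PID_EXT             = 103; // creation is 1 byte
constexpr std::uint8_t NEW_PID_EXT         = 88;  // creation is 4 bytes
constexpr std::uint8_t SMALL_ATOM_UTF8_EXT = 119; // 1-byte length, then the name

// Read a big-endian 32-bit integer starting at offset pos.
static std::uint32_t read_be32(const std::vector<std::uint8_t>& b, std::size_t pos) {
    return (std::uint32_t(b[pos])     << 24) | (std::uint32_t(b[pos + 1]) << 16) |
           (std::uint32_t(b[pos + 2]) <<  8) |  std::uint32_t(b[pos + 3]);
}

// Hypothetical helper (not an eixx API): return the creation field of an
// encoded pid, or nullopt if the buffer is not a pid this sketch understands.
// Assumption: the node atom is SMALL_ATOM_UTF8_EXT; other atom tags are rejected.
std::optional<std::uint32_t> pid_creation(const std::vector<std::uint8_t>& b) {
    if (b.size() < 3) return std::nullopt;
    const std::uint8_t tag = b[0];
    if (tag != PID_EXT && tag != NEW_PID_EXT) return std::nullopt;
    if (b[1] != SMALL_ATOM_UTF8_EXT) return std::nullopt;

    std::size_t pos = 3 + b[2]; // skip pid tag, atom tag, atom length, atom name
    pos += 8;                   // skip ID (4 bytes) and Serial (4 bytes)

    const std::size_t width = (tag == NEW_PID_EXT) ? 4 : 1;
    if (b.size() < pos + width) return std::nullopt;
    return width == 4 ? read_be32(b, pos) : std::uint32_t(b[pos]);
}
```

If a decoder reads the 4-byte field correctly but the rest of the code still expects small creation values (or compares against a locally initialized one), the symptom would look like the large numbers in the trace.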

heri16 commented 3 years ago

From some debugging...

When there is an inbound message from the network, m_queue.push(data) and m_timer.cancel(ec) are called, but the data is never dequeued or processed.

https://github.com/saleyn/eixx/blob/8b778bdc385882b48b6edb9ed257ac62cf487a18/include/eixx/util/async_queue.hpp#L154-L164
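To make the symptom concrete, here is a toy, single-threaded model of a queue whose push wakes a one-shot waiter. It is not eixx's async_queue and uses no Boost.Asio; `toy_async_queue`, `async_dequeue`, and `pending` are hypothetical names, and the direct call in `drain()` merely stands in for the wake-up that m_timer.cancel(ec) is presumably meant to deliver. It shows one way "pushed but never dequeued" can arise: data arrives while no waiter is armed, or the waiter is not re-armed after firing.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Toy model only: a queue that hands items to a one-shot waiter.
class toy_async_queue {
public:
    using handler = std::function<void(const std::string&)>;

    // Arm a one-shot waiter and deliver anything already queued.
    void async_dequeue(handler h) {
        m_waiter = std::move(h);
        drain();
    }

    // Enqueue an item and wake the waiter, if one is armed.
    void push(std::string data) {
        m_queue.push_back(std::move(data));
        drain();
    }

    // Number of items that were pushed but not yet delivered.
    std::size_t pending() const { return m_queue.size(); }

private:
    // Deliver items while a waiter is armed; each delivery disarms it.
    void drain() {
        while (m_waiter && !m_queue.empty()) {
            handler h = std::move(m_waiter);
            m_waiter = nullptr; // one-shot: the handler must re-arm itself
            std::string item = std::move(m_queue.front());
            m_queue.pop_front();
            h(item);
        }
    }

    std::deque<std::string> m_queue;
    handler m_waiter;
};
```

In this model, a push with no armed waiter just grows `pending()`, which matches what the debugger shows. Whether the real queue is missing an armed wait, or the cancellation is not reaching it, would still need to be checked against async_queue.hpp.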

saleyn commented 3 years ago

$ erl -sname a &
$ build/src/test-node -n b -r a@zeos -v message
INFO   | SEND cntrl={6,#Pid<b@zeos.2.0,1>,'',rex}, msg={#Pid<b@zeos.2.0,1>,{call,erlang,now,[],user}}
INFO   | SEND cntrl={6,#Pid<b@zeos.1.0,1>,'',rex}, msg={'$gen_cast',{cast,io,put_chars,["This is a test string"],#Pid<b@zeos.1.0,1>}}
INFO   | SEND cntrl={6,#Pid<b@zeos.1.0,1>,'',rex}, msg={'$gen_cast',{cast,io,put_chars,["DONE"],#Pid<b@zeos.1.0,1>}}
INFO   | Connected to node: a@zeos
INFO   | Got transport msg - (msg):   {rex,{1630,329893,608668}}
INFO   | Got transport msg - (msg):   {io_request,#Pid<a@zeos.104.0,1630329788>,#Ref<a@zeos.65621.731906052.1307459710,1630329788>,{put_chars,<<"This is a test string">>}}
INFO   | Got transport msg - (msg):   {io_request,#Pid<a@zeos.105.0,1630329788>,#Ref<a@zeos.65622.731906052.1307459710,1630329788>,{put_chars,<<"DONE">>}}

Note that the Pids and Refs have a very large creation value, 1630329788. I believe this is a decoding bug somewhere in the new changes (or maybe an initialization bug?). Consequently, this causes messages to be missed.
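A quick arithmetic check on the log above, offered as an observation rather than a diagnosis: assuming the {rex,{1630,329893,608668}} reply is the {MegaSecs, Secs, MicroSecs} result of the erlang:now() call shown in the first SEND line, the creation value sits about 105 seconds before it. That would be consistent with a timestamp-like 32-bit creation sent by node a, rather than random decoding garbage. The function name below is hypothetical.

```cpp
#include <cstdint>

// Fold the {MegaSecs, Secs, _} parts of an erlang:now() tuple into Unix seconds.
constexpr std::int64_t now_to_unix_seconds(std::int64_t mega, std::int64_t secs) {
    return mega * 1000000 + secs;
}

// Values copied from the log: the rex reply and the creation seen in Pids/Refs.
constexpr std::int64_t rex_reply_secs = now_to_unix_seconds(1630, 329893);
constexpr std::int64_t creation       = 1630329788;

// Difference in seconds between the node's reported time and the creation value.
constexpr std::int64_t gap = rex_reply_secs - creation;

static_assert(gap == 105, "creation is 105 seconds before the rex reply");
```

If that reading is right, the decoded value itself may be correct, and the place to look would be wherever the creation is compared, stored in a narrower field, or initialized on the eixx side.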

heri16 commented 3 years ago

I've tried going back in time to last month's commit https://github.com/saleyn/eixx/commit/74c3107134982c9111294290229028a93cb6e42d but it still doesn't work for me. Seems like an old issue just rediscovered?

heri16 commented 3 years ago

I'll proceed to test commit https://github.com/saleyn/eixx/commit/c5b3d8f6580f054070b93147abde762577be3998 as it looks like that was the last time encoding changes were made prior to https://github.com/saleyn/eixx/commit/74c3107134982c9111294290229028a93cb6e42d

heri16 commented 2 years ago

Was able to reproduce this bug with commit https://github.com/saleyn/eixx/commit/c5b3d8f6580f054070b93147abde762577be3998

@saleyn Seems like a longstanding issue?

saleyn commented 2 years ago

Apparently it got broken at some point in the past. Maybe due to upgrading boost or some other bug. Will need to explore it more when I have time. Though unlikely soon. :(