Closed openhpi2 closed 8 years ago
Original comment by: dr_mohan
The above error messages seems to be coming from different location than where the code changes are. Some of the error messages are not showing up when the process is killed. There are few questions on the patch. Bugs are usually fixed on the trunk.
Original comment by: dr_mohan
I wouldn't like call my changes in the source as "patch". It is rather workaround. The problem is persistent appearance of the error messages during openhpid restart. So I need patch which doesn't show error messages only when openhpi daemon is stopping. To manage it I added error recognition in cStreamSock::ReadMsg() from strmsock.cpp if peer closed connection in ahead of server finished reading (via payload_len and such decision is an ugly hack). Also I removed critical error if the one appears because of timeout. Timeout error processing was moved to server.cpp. I hope it is answer for first, second and partly fourth questions. Regarding question 3: O, yes, in service_thread(). I forgot to delete unused declaration. But in oh_server_run() this variable is used.
Original comment by: vladimir-pavlov
1) Tested the patch by running the clients on different machine rather than where the openhpid is running. Without the patch we get the two errors transport: CRIT: strmsock.cpp:558: accept failed. and transport: CRIT: strmsock.cpp:296: error while sending message. 2) When I applied the patch and tested it, the first error(transport: CRIT: strmsock.cpp:558: accept failed) goes off but the second one(transport: CRIT: strmsock.cpp:296: error while sending message) appears sometime. Second one does not appear all the time but it comes sometime. 3) Variable 'wc' is declared in service_thread() but not used in this function, we can remove this declaration from the patch. 4) The message [CRIT( "select failed" );] has been removed , I feel we can retain this one.
Original comment by: openhpi2
We will checkin the attached patch, if it is ok. Please review 1834_July_04_2014.patch.txt
Original comment by: dr_mohan
Thanks for the patch! Several comments and questions:
o) Just whining: let's have {} even for a single line if construct!
o) Are we trying to handle a transient accept error?
o) What if a client drops connection attempt between Wait and Accept? Looks like a race condition.
o) "Select failed" logging in server.cpp may be confusing. We call sock->Wait() and wait potentially can be implemented with another mechanism.
o) Assigning "payload_len = (uint32_t)(-1)" looks dangerous. Can we assign it to zero instead?
Original comment by: avpak
Thanks Anton for reviewing the patch and comments.
We will take of care of first and last point in the patch.
For second point, Indeed Vladimir is trying to avoid two error messages as described in the bug.
Regarding third point, we have not seen failure of Accept() call so far. We will have a check with Vladimir if he has faced any time Accept() call failure. And in case Aceept() call does not fail there will be no race condition. In case of Accept() call fails, could you please suggest us that either should we continue in the loop or break out of the loop.
On the fourth point, could you please give us some idea on implementation of different mechanism for Wait() instead of calling sock->Wait() and wait?
Original comment by: openhpi2
Anton,
The attached is the modified patch for the defect. Looks like the higher level error message could be completely removed. We have not seen the accept call failure though 1834_accept_failed_jul_29.patch.txt
Original comment by: dr_mohan
Mohan, thanks!
My points:
Original comment by: avpak
Anton,
Please find the attached modified patch which has addressed most of your above mentioned points. 1834_accept_failed_aug_14.patch.txt
Original comment by: openhpi2
Thanks! I have only few points:
Original comment by: avpak
Anton,
I have attached the modified patch(which addresses all of your above points). 1834_accept_failed_aug_22.patch.txt
Original comment by: openhpi2
Cool! Now I have only two comments.
1) please use spaces instead of tabs for indentation in "Error accepting server socket." block.
2) In the block: g_usleep( 1000000 ); if (stop) { break; }
let's have "stop" check first and then g_usleep.
Original comment by: avpak
Thanks Anton for comments.
I have addressed the above two points in the attached patch. 1834_accept_failed_aug_25.patch.txt
Original comment by: openhpi2
Great! Looks perfect for me.
Original comment by: avpak
Original comment by: hemanthreddy
Fixed in revision 7588.
Original comment by: hemanthreddy
Original comment by: dr_mohan
When openhpi service is restarted, the below errors are observed in syslog: "openhpid: transport: strmsock.cpp:558: accept failed." and "openhpid: transport: strmsock.cpp:296: error while sending message"
The error messages go away if attached changes are applied to the sources.
Attachments: some.diff.txt
Reported by: vladimir-pavlov