roshbaik2 / open-zwave

Automatically exported from code.google.com/p/open-zwave

OZW goes into 100% CPU load if the Aeon USB Stick 2 is unplugged (only a restart helps) #111

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Start OZW, e.g. with MinOZW, and let it discover the devices
2. Unplug the USB stick
3. Check with "top": MinOZW is consuming 100% CPU

What is the expected output? What do you see instead?
Not 100% CPU load

What version of the product are you using? On what operating system?
Ubuntu 10.04
Open Z-Wave r556

Please provide any additional information below.

Original issue reported on code.google.com by uAle...@gmail.com on 3 Nov 2012 at 11:37

GoogleCodeExporter commented 9 years ago
Peter made a Linux only patch that I modularized so it could be stubbed out for 
the other hardware platforms (Mac, Windows). I never committed it because not 
many people have expressed a need to pull their controller from the USB port 
and it wasn't a very clean patch. Seems like a good time to revisit this issue.

Original comment by glsatz on 3 Nov 2012 at 3:35

GoogleCodeExporter commented 9 years ago
OK, it would be nice to include that patch in a revision of OZW. Maybe we can
also look at the serial port handling when the stick is unplugged and plugged
in again? I would call these enhancements "stability" fixes; now that OZW is
getting more mature, I think we need such code more.

Original comment by uAle...@gmail.com on 3 Nov 2012 at 4:05

GoogleCodeExporter commented 9 years ago
Ok. I will need someone to verify a windows (stub) patch. 

Original comment by glsatz on 3 Nov 2012 at 4:57

GoogleCodeExporter commented 9 years ago
Peter's original email about his patch: 
https://groups.google.com/forum/#!topic/openzwave/2VdGhj77i8Q/discussion

Note this adds a new dependency to the Linux build, libudev.

Original comment by glsatz on 4 Nov 2012 at 7:14

GoogleCodeExporter commented 9 years ago
I don't see adding the libudev dependency as a (major) issue, because we get
more stability in return.

Another question: can we add two notification types that let the main program
know when the serial port becomes unavailable (e.g. on disconnect) and when it
is available again? Currently I have no good way for my main program to know
whether the Open Z-Wave library is working properly.

Original comment by uAle...@gmail.com on 4 Nov 2012 at 8:35

GoogleCodeExporter commented 9 years ago
Here is Peter's patch, updated so that the platform-specific code lives in
separate files instead of behind #ifdefs. I am not sure this patch is very
useful, as it only checks for the presence of the USB controller at open time.
Pulling the USB controller still needs to be detected elsewhere. I am not
planning to commit this patch until we have a better understanding of where
and how USB controller presence should be managed. Peter reported that he
wound up moving his detection code into his user program, which uses the
Manager object calls to open and close the driver. More discussion is needed.

Original comment by glsatz on 5 Nov 2012 at 5:34

Attachments:

GoogleCodeExporter commented 9 years ago
Comments from Peter on how he did this under Linux:

Both the original and the current code I have are essentially stock example udev
code.  See the monitor stuff:

http://www.signal11.us/oss/udev/

There are a few changes because it's a serial port, but those are already
in the code you have.

In my main loop, I have a select loop anyway, so this is just another
file descriptor to monitor.  It's very easy.

Original comment by glsatz on 13 Nov 2012 at 8:26

GoogleCodeExporter commented 9 years ago
Going down the USB route is overkill if you ask me.

The 100% CPU spin was caused by the Read member function of the SerialImpl
class (I have not tried or looked at the HID or Windows versions). Its root
cause was that we were not testing for the read system call returning 0. By
the look of the code, the original author assumed that read returning 0
indicated no data. In fact, as we are in blocking mode, read returning 0
indicates EOF.

The second problem is that there is no error path up to the Driver class (or
eventually the Manager class) to indicate errors. Thus the driver would
happily keep trying to read/write etc., and the manager (and thus the
application) had no idea there was a problem.
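
A minimal sketch of the check described above, assuming a POSIX blocking file
descriptor. `ReadStatus` and `ReadOnce` are hypothetical names used for
illustration only; they are not the SerialImpl code and not the attached
patch.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical status values a read loop could hand up to its caller.
enum class ReadStatus { Data, Retry, Closed, Error };

// Classify the result of a single blocking read().
// Assumes len > 0; with len == 0, read() may return 0 without meaning EOF.
ReadStatus ReadOnce(int fd, unsigned char* buf, std::size_t len, std::size_t* got)
{
    *got = 0;
    ssize_t n = ::read(fd, buf, len);
    if (n > 0)
    {
        *got = static_cast<std::size_t>(n);
        return ReadStatus::Data;
    }
    if (n == 0)
        return ReadStatus::Closed;  // EOF: the device is gone, do not retry
    if (errno == EINTR || errno == EAGAIN)
        return ReadStatus::Retry;   // transient condition, safe to call again
    return ReadStatus::Error;       // real failure, report it upwards
}
```

A read thread built on this would leave its loop on `Closed` or `Error` and
report the condition to its owner instead of calling read again immediately,
which is what produces the busy loop.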

Attached is a WIP patch that adds some error handling to at least get the
error up to the Driver class. I'm still trying to propagate that error to the
Manager class so we can unload the driver and at least have the driver die
gracefully, but there is still a loop somewhere in the WriteMsg functions etc.

I'll revisit this patch after the Security Class has settled down a bit.

Original comment by jus...@dynam.ac on 12 Jun 2014 at 11:53

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks, I will apply the patch on my system and give it a try. Yes, a
notification telling the application that the device is gone would be a
valuable addition to open-zwave (but the Security Class first :-)).

Original comment by uAle...@gmail.com on 12 Jun 2014 at 11:59

GoogleCodeExporter commented 9 years ago
Hi, as mentioned, the patch isn't complete. It's just moved the bug to 
somewhere around WriteMsg in the driver class. Feel free to improve it though!

Original comment by jus...@dynam.ac on 12 Jun 2014 at 4:17

GoogleCodeExporter commented 9 years ago
I tested the patch and it no longer causes 100% CPU on the disconnect itself,
but the driver is now looping internally as you mentioned (and then it is at
100% CPU again) ;-)

Original comment by uAle...@gmail.com on 21 Jun 2014 at 11:23

GoogleCodeExporter commented 9 years ago

Original comment by jus...@dynam.ac on 15 Oct 2014 at 2:56

GoogleCodeExporter commented 9 years ago

Original comment by jus...@dynam.ac on 15 Oct 2014 at 2:58