Open davidaramantSEP opened 11 years ago
Sounds similar to #103 . Apple changed something in Mountain Lion. They break stuff related to HID in every major release. On top of that, there are races that I'm not sure there are any good ways to work around.
I don't have 10.8, and I don't know what it is that changed, unfortunately. dantulurisudheer says he got it working. I asked him for a patch, and haven't heard from him (I just pinged him again).
If you can try to reproduce the problem with a simple command line program (or some testgui hacks), it might be easier to diagnose the problem.
When you say that it happens in a loop with a 30 second timeout, do you mean the timeout on the read is 30 seconds? (so it waits for 30 seconds in case the device sends data, if not then it moves to the next protocol/transport?)
At what point in the sequence are you disconnecting/reconnecting the device? During the timeout? Also, do you have multiple 'transports'/matching devices connected at the same time? Or just one device?
Hey mrpippy. It's been a while.
Are you using 10.8 without problems?
mrpippy: 30 seconds refers to the entire connection attempt. We will execute that pseudocode loop for roughly 30 seconds (we don't interrupt the loop, which is the "roughly" 30 seconds part). The read timeout is 2.5 seconds but the transport returns data much faster than that (I would guess 100 ms). Multiple transports is probably something I shouldn't even have mentioned, since I'm only testing with one for now.
I'm not sure at what point I unplugged the device since the connection attempt it is pretty speedy.
signal11: I saw issue 103 earlier and it looked promising... Hopefully you will hear back about it.
The Apple dev world is proving to be quite an experience for us so far, coming from MS land.
I'll work on making a simpler program to reproduce the issue. I'll have to brush up on my C++ :)
signal11: When 10.8 came out I tested my app and was happy with how it worked. It's using an older/slightly hacked-up version of hidapi though, and if any users are having problems the complaints might not make it back to me.
Also I just compared the HID Manager source code (it's hid.subproj, part of IOKitUser) between 10.7 and 10.8, and there's almost no changes to the HID Manager itself. The IOKit HID layer underneath definitely had changes though.
davidaramantSEP:
Ah ok, so the time between opening the device and closing the device is around 30 seconds (and almost all that time is spent calling hid_read/write())?
You said it often crashes when closing a device--is it actually crashing in hid_close()
? Does that happen without any connection/reconnection going on? Would you have a stack trace?
I think I made things more confusing by trying to simplify things :)
The 30 second timeout can also be ignored. The application has two modes where you either 'manually' scan for devices or it continually polls in the background (forever and ever and ever). Each manual attempt is capped at 30 seconds before we give up. We haven't gotten around to the auto-detect portion yet, so the 30 seconds happened to be on my mind.
The pseudo code I posted is for one connection attempt, which takes around 5-7 seconds. In the "manual" case we get around 5-6 attempts done; in the auto-detect case it is unbounded. We close all open USB devices at the end of each unsuccessful connection attempt. The user could plug-in or unplug a transport at any time, so we have to rescan to pick up any new ones.
There are probably over a hundred write/read calls during each connection attempt. The IR dongle is programmable; we send it scripts to run that in turn operate the IR emitter to search for devices. Each script we send has a response (even if it's "no data received") so theoretically we would never hit a read timeout in normal operation.
The crash is very frustrating. hid_close()
actually returns fine (we have log statements after it that show up) but it crashes shortly thereafter. Something similar to the big snippet in my original post is always present in the stack trace when it crashes - references to callbacks and runloops. If we sleep for "a while" (lets say 100 ms) after each read/write/close, it doesn't crash (but our communication speed obviously tanks). This is why it seems like a threading issue to me.
Is your code doing something like:
hid_close()
print(statements that show up)
hid_open()
CRASH
And it crashes less often if you put a 50ms delay between hid_close() and hid_open()? Or does it crash after hid_close() without any other hid function being called?
Sometimes it crashes after hid_close
completes, sometimes it gets far enough to start hid_open
, sometimes hid_open
completes...
The sleeps are after close, read, and write. I'm not sure all three of those sleeps are needed; I haven't tested all the permutations of it.
So I hacked up hidtest last night to do roughly the same loop as your code:
while(1)
{
enumerate devices
open device
2000 write/read operations (60 byte packets)
close device
}
And it seemed to run ok. I'll do more testing tonight (including more disconnect testing), but I never had a segfault. I also ran hid_enumerate() in a loop and tested plugging/unplugging my device, and the number of devices detected was also correct.
I'm using 10.8.3 on a 2011 MacBook Air. What type/year Mac are you using? Also, do you have any other USB devices connected?
Mid 2011 Mac Mini, i7 2.7GHz, 10.8.3. There's a mouse, keyboard, and a cheap hub that the transport is plugged into.
Another wrinkle I just realized is that our app has lots of threads all over the place. I've made sure that an opened handle is only accessed by one and only one thread, but I'm wondering if there is a problem with shutting down that thread right after hid_close
completes. I'm still fuzzy on the run loop concept in Core Foundation.
I think all managed .NET threads turn into pthreads under the hood on Mono (on Windows there is not a 1:1 mapping between managed & native threads).
A closer match to our application would be to do everything inside that while
loop in a new thread. Parts of our application is over 7 years old, and the threading design is a bit overzealous to put it mildly.
BTW, in hid.c, what does line 980 accomplish?
IOHIDDeviceScheduleWithRunLoop(dev->device_handle, CFRunLoopGetMain(), kCFRunLoopDefaultMode);
The previous line unschedules the device from the read thread run loop, but I don't get why it needs to be put back on the main one.
Ah ok, I'll try it with more threads.
As for hid.c line 980, I don't know why the device needs to be put onto the main runloop. The callbacks have just been unregistered, and there shouldn't be any more I/O going to the device.
Also, IOHIDDeviceClose()
unregisters the callbacks itself, so I'm wondering if hid_close()
could be simplified to look like this:
void HID_API_EXPORT hid_close(hid_device *dev)
{
if (!dev)
return;
/* Close the OS handle to the device, but only if it's not
been unplugged. If it's been unplugged, then calling
IOHIDDeviceClose() will crash. */
if (!dev->disconnected) {
IOHIDDeviceClose(dev->device_handle, kIOHIDOptionsTypeSeizeDevice);
}
/* Cause read_thread() to stop. */
dev->shutdown_thread = 1;
/* Wake up the run thread's event loop so that the thread can exit. */
CFRunLoopSourceSignal(dev->source);
CFRunLoopWakeUp(dev->run_loop);
/* Notify the read thread that it can shut down now. */
pthread_barrier_wait(&dev->shutdown_barrier);
/* Wait for read_thread() to end. */
pthread_join(dev->thread, NULL);
/* Clear out the queue of received reports. */
pthread_mutex_lock(&dev->mutex);
while (dev->input_reports) {
return_data(dev, NULL, 0);
}
pthread_mutex_unlock(&dev->mutex);
CFRelease(dev->device_handle);
free_hid_device(dev);
}
I tried your new close method without any of my hacks. It actually ran for a few minutes before crashing (this is much better than before). The crash is also much more interesting:
objc[70400]: Object 0x7b9056d0 of class __NSCFString autoreleased with no pool in place - just leaking - break on objc_autoreleaseNoPool() to debug
2013-04-09 18:21:05.958 Device Link[70400:234f] -[__NSCFType bytes]: unrecognized selector sent to instance 0x7b905630
2013-04-09 18:21:05.959 Device Link[70400:234f] An uncaught exception was raised
2013-04-09 18:21:05.959 Device Link[70400:234f] -[__NSCFType bytes]: unrecognized selector sent to instance 0x7b905630
2013-04-09 18:21:06.415 Device Link[70400:234f] (
0 CoreFoundation 0x96161e9b __raiseError + 219
1 libobjc.A.dylib 0x970e952e objc_exception_throw + 230
2 CoreFoundation 0x96165b0d -[NSObject(NSObject) doesNotRecognizeSelector:] + 253
3 CoreFoundation 0x960ade97 ___forwarding___ + 487
4 CoreFoundation 0x960adc42 _CF_forwarding_prep_0 + 50
5 CoreFoundation 0x9603d110 CFDataGetBytePtr + 80
6 IOKit 0x940e2516 __IOHIDDeviceReportCallbackRegistered + 29
7 IOHIDLib 0x09b1d80b _ZN16IOHIDDeviceClass25_hidReportHandlerCallbackEPviS0_ + 283
8 IOHIDLib 0x09b1fcf1 _ZN15IOHIDQueueClass24queueEventSourceCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 43
9 IOHIDLib 0x09b1b941 _ZN16IOHIDDeviceClass19_cfmachPortCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 77
10 CoreFoundation 0x9602847f __CFMachPortPerform + 303
11 CoreFoundation 0x96028335 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 53
12 CoreFoundation 0x96027f62 __CFRunLoopDoSource1 + 146
13 CoreFoundation 0x9605dcfa __CFRunLoopRun + 2154
14 CoreFoundation 0x9605d02a CFRunLoopRunSpecific + 378
15 CoreFoundation 0x9605ce9b CFRunLoopRunInMode + 123
16 libUsbHid.dylib 0x0919c2b2 read_thread + 290
17 libsystem_c.dylib 0x9599e5b7 _pthread_start + 344
18 libsystem_c.dylib 0x95988d4e thread_start + 34
)
2013-04-09 18:21:06.416 Device Link[70400:234f] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[__NSCFType bytes]: unrecognized selector sent to instance 0x7b905630'
*** Call stack at first throw:
(
0 CoreFoundation 0x96161e9b __raiseError + 219
1 libobjc.A.dylib 0x970e952e objc_exception_throw + 230
2 CoreFoundation 0x96165b0d -[NSObject(NSObject) doesNotRecognizeSelector:] + 253
3 CoreFoundation 0x960ade97 ___forwarding___ + 487
4 CoreFoundation 0x960adc42 _CF_forwarding_prep_0 + 50
5 CoreFoundation 0x9603d110 CFDataGetBytePtr + 80
6 IOKit 0x940e2516 __IOHIDDeviceReportCallbackRegistered + 29
7 IOHIDLib 0x09b1d80b _ZN16IOHIDDeviceClass25_hidReportHandlerCallbackEPviS0_ + 283
8 IOHIDLib 0x09b1fcf1 _ZN15IOHIDQueueClass24queueEventSourceCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 43
9 IOHIDLib 0x09b1b941 _ZN16IOHIDDeviceClass19_cfmachPortCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 77
10 CoreFoundation 0x9602847f __CFMachPortPerform + 303
11 CoreFoundation 0x96028335 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 53
12 CoreFoundation 0x96027f62 __CFRunLoopDoSource1 + 146
13 CoreFoundation 0x9605dcfa __CFRunLoopRun + 2154
14 CoreFoundation 0x9605d02a CFRunLoopRunSpecific + 378
15 CoreFoundation 0x9605ce9b CFRunLoopRunInMode + 123
16 libUsbHid.dylib 0x0919c2b2 read_thread + 290
17 libsystem_c.dylib 0x9599e5b7 _pthread_start + 344
18 libsystem_c.dylib 0x95988d4e thread_start + 34
)
Interesting, you can see that it crashed in read_thread()
there. Do you know what was happening when it crashed? (I assume the device was open and it was reading/writing?)
Unfortunately I didn't check the dev log on that crash before it got erased. It was executing our connection loop, so it was most likely reading/writing.
I haven't been able to reproduce that exact crash again. So far it bombs with much like it did before: 1 - 3 minutes of running the connection loop, usually inside of a hid_read
or hid_close
call.
The crash info starts with this:
Native stacktrace:
0 Device Link 0x00140154 mono_handle_native_sigsegv + 292
1 Device Link 0x0016bf85 mono_sigsegv_signal_handler + 197
2 libsystem_c.dylib 0x9598a8cb _sigtramp + 43
3 ??? 0xffffffff 0x0 + 4294967295
4 IOKit 0x940e2516 __IOHIDDeviceReportCallbackRegistered + 29
5 IOHIDLib 0x09c7980b _ZN16IOHIDDeviceClass25_hidReportHandlerCallbackEPviS0_ + 283
6 IOHIDLib 0x09c7bcf1 _ZN15IOHIDQueueClass24queueEventSourceCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 43
7 IOHIDLib 0x09c77941 _ZN16IOHIDDeviceClass19_cfmachPortCallbackEP12__CFMachPortP17mach_msg_header_tlPv + 77
8 CoreFoundation 0x9602847f __CFMachPortPerform + 303
9 CoreFoundation 0x96028335 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 53
10 CoreFoundation 0x96027f62 __CFRunLoopDoSource1 + 146
11 CoreFoundation 0x9605dcfa __CFRunLoopRun + 2154
12 CoreFoundation 0x9605d02a CFRunLoopRunSpecific + 378
13 CoreFoundation 0x9605ce9b CFRunLoopRunInMode + 123
14 libUsbHid.dylib 0x091ec2b2 read_thread + 290
15 libsystem_c.dylib 0x9599e5b7 _pthread_start + 344
16 libsystem_c.dylib 0x95988d4e thread_start + 34
I can post the entire thing, but it's enormous since there are 18 threads active. None of them look particularly interesting except for a thread that has a lot of stuff about HID callbacks in it's stacktrace (you can see an example of a trace like that in my first post).
The main thing that I see in every one of your crash logs is __IOHIDDeviceReportCallbackRegistered + 29
: __IOHIDDeviceReportCallbackRegistered()
is the internal HID Manager function that calls the callback registered with IOHIDDeviceRegisterInputReportCallback()
.
I've looked closer at the IOHIDManager code, and I think there are race conditions that could cause this crash if input reports are received while a device is being closed, or possibly after an input report callback is unregistered. I'm gonna try to test this more tonight.
The HID Manager code gives me the impression that it's not designed to be thread-safe, i.e. calls to IOHIDDeviceClose()
should not happen while a run loop (that could dispatch callbacks) is running.
You said earlier that you access an opened device only on one thread, but does that mean you're doing hid_init()
or hid_enumerate()
on a different thread? Would it be possible to do all your hidapi calls from a single thread?
Your investigation sounds promising!
I tried using a single thread for all API calls but I thought it had the same problem. I can try again, I suppose. Since we are supposed to support multiple transports, using a single thread for all communication would be a bit of a bottleneck.
Currently there is one thread that "owns" hid_init
and hid_exit
, one thread for enumeration, and each device gets a new thread for open/read/write/close.
I don't think that running hid_init()
and hid_enumerate()
on different threads is a good idea. hid_init()
creates the IOHIDManager
and schedules it on the current run loop (which is per-thread). hid_enumerate()
later runs the current run loop quickly to process some events, but this won't work as intended if they're in separate threads.
I just hacked up hidtest
to call hid_init()
from a newly-created pthread, and it segfaults pretty quickly, either in hid_init()
or later on. I'm not sure why, but I think it's because a raw pthread needs to have a run loop that's being run in order to work.
(I was trying to figure out exactly how mono/xamarin implements threads and what they do with run loops for user-created threads, but didn't see any info. I'll probably have to look at the source)
I've re-confirmed that running all the API calls on the same thread still bombs with the same error. I'll also investigate what Mono does with threads.
I found the code for the managed thread and some of the unmanaged code.
It looks like Mono's threads are built on top of GLib, which uses straight pthreads on the Mac. I'm not sure, but I doubt Mono is messing with the run loop at all for user-created threads.
Also it turns out I had some problems with my hidtest
code, with those fixed running hid_init()
from a separate thread works fine.
Does your device send input reports only in response to a report from the host, or does it send them all the time?
The transport uses a strictly request-response protocol; the host is in charge of everything. There should be no unexpected reports from the transport.
So does your newly created pthread do anything with regards to run loops? Are you reading/writing on a separate thread as well?
If I do have to do something with run loops, is it enough to call CFRunLoopRun
when the thread starts and CFRunLoopStop
before it is shut down?
Well, calling CFRunLoopStop
before the thread stops seems to insta-kill the entire process without any type of error. CFRunLoopRun
when it starts doesn't seem to fix anything either, but it does make the program hang when you unplug the transport.
Hi Guys,
Sorry I've been unresponsive. Run Loops with respect to threading is not something I was ever able to find real clear authoritative documentation on. My understanding of how it works was mostly derived empirically, and also with the help of mrpippy back a couple years ago.
In reading some of the git log, I'm reminded that I came to the realization at one point that Apple intends for an application to have an IOHIDManager handle for each device. The way we currently do it is to have one IOHIDManager and then an IOHIDDevice for each device. I'm not sure that matters to this particular case, but my experience has been that when you don't do stuff exactly the way they want you to, you have problems (and it's not always clear what they want you to do).
I also wonder some about having a lot of threads. HIDAPI on mac should be threadsafe, but it doesn't mean there aren't bugs. hid_init() should be called one time at the start of execution, per the documentation, preferably from the main thread.
I'm also suspicious about the mono stuff. Is there a minimal test program which shows the symptoms? Maybe start with hidtest, and then build onto it (make it more complicated) until you see the bad behavior.
I also wonder, since you're calling hid_enumerate() from a lot of different threads, whether it needs to be mutexed, and then have the run loop moved to the current thread before process_pending_events() is called.
There's also an issue with dev->disconnected. When the device has been disconnected, it will full-up crash if certain functions are called, even if the handle is still open. The only way I know to get around this is to set dev->disconnected in the device removal callback, and then check for it all over the place. This seems racy, but I've yet to find a better way to do it. IIRC, mrpippy tested this pretty hard with open/read/write/close/unplug (in random order). Maybe that was someone else.
I hope maybe something I've said here will trigger a light bulb.
Alan.
davidaramantSEP:
I resurrected a hidapi branch that used IOKit paths, give it a try and see if it helps.
The main difference is that hid_open_path()
doesn't call hid_enumerate()
, so it might work better when hid_open()
is called in a separate thread.
The changes are all in mac/hid.c
, you should be able to just swap it in to your code. (note that it doesn't have any of the hid_close()
changes I suggested earlier)
https://github.com/mrpippy/hidapi/tree/iokit_path
signal11: I think our use of IOHIDManager and IOHIDDevice is correct: one manager is used to enumerate devices, and an IOHIDDevice is created for each device. Our usage is a little different than Apple's intent, since we enumerate all devices and do VID/PID matching manually (rather than setting the matching criteria in the IOHIDManager), but it doesn't seem to be a problem. The documentation does leave out a lot of possible use cases though, which makes it hard to know which ones will work and which won't.
Also, HID Manager definitely has bugs! If you register an input report callback, then unregister it, it will segfault when the next input report comes in. I just added this as issue #116, but I think the only effect for hidapi is a possible race condition in hid_close()
Also, HID Manager definitely has bugs! If you register an input report callback, then unregister it, it will segfault when the next input report comes in. I just added this as issue #116, but I think the only effect for hidapi is a possible race condition in hid_close()
Isn't that exactly what this issue is about? (well one of the things)
He said his device uses a request-response protocol, so there shouldn't be input reports coming in during hid_close()
. Still, I suggested some hid_close()
changes in https://github.com/signal11/hidapi/issues/114#issuecomment-16142620 and he said it lasted much longer without crashing, so I guess it is solving some problem!
I tried the IOKit path hid.c with the new hid_close
method you posted earlier. Unfortunately it still crashed with the __IOHIDDeviceReportCallbackRegistered + 29
error.
I'm going to dust off a console program that I was testing with earlier (still using Mono) and see if the behavior is any different. I think it worked better, but I don't know if I bothered leaving it running for very long.
In parallel to getting an actual fix, I was progressing with a series of hacks to work around the two problems we see but the workarounds stopped... working. I'll explain our terminology again: [Application] <- USB -> [Transport] <- IR -> [Device]
The transports are normally not very interesting to us; it's talking to the IR devices on the other end that represent 95% of our code base.
Not finding transport after unplugging/replugging: Lazily call hid_init
before it's needed; use reference counting to call hid_exit
when the last open USB device was closed. This used to work, but now it doesn't - the transport is never found by hid_enumerate
after it's been replugged (identical behavior to not using this hack).
Sleep thread after read/write/close to prevent crashing: This seems to work fine, but it utterly destroys communication speed. We've discovered that the sleeps are not needed after it finds an IR device and starts talking to it, so they are actually toggled off at runtime. There is a lot more processing that happens when talking to an actual IR device, so I doubt we send as many reports back and forth.
We can live with the second hack, but not be able to discover transports is pretty fatal.
Correction: With the IOKit path patch, it does find the transport again, but the path returned is empty. Could that be because the USB device hasn't been fully closed yet?
In your latest tests, what thread are you calling hid_init()
from, and what thread does hid_enumerate()
get called from?
hid_init()
creates the IOHIDManager and attaches it to the current thread's run loop. If that thread's run loop never gets run again, the HID Manager's list of devices probably won't be updated correctly.
I think the path is empty because the HID Manager's device info is stale (because it's run loop isn't running), so the device being queried was already unplugged.
You're right; the change I made to make sure Enumerate and Init happen on the same thread got lost in all the mess of applying/undoing hacks. When I redo that change, rediscovering a USB device works again without having to continually call exit/init (hooray).
So what's the answer then? What needs to change in HIDAPI?
It sounds like you needed to call hid_enumerate() from the same thread as hid_init(). hid_enumerate() already calls process_pending_events(). Does the run loop need to get moved to the hid_enumerate() thread before the call to process_pending_events()? Does it then need to get moved back to the read_thread afterwards? In so doing, does it need to be mutexed with the read thread?
I'm pretty sure that crash is still in there, so only one of the problems has been fixed so far. I'm currently running with the IOKit branch + new hid_close
method.
Moving the run loop in hid_enumerate
seems to make sense, although I'm not sure what would happen to hid_open
which calls it internally. We use hid_open_path
so we don't have to worry about it, but what would happen if you tried to open two devices at the same time from different threads?
I think the hid_init
calls inside of hid_open_path
and hid_enumerate
complicate things as well. This seems like it would just introduce potentially weird behavior if you never call it explicitly and your program is as thread-happy as ours is - returning an error if it hasn't been called seems more appropriate.
I'm a bit confused by what you mean by the read thread. There should only be a read thread if a device is open, right? I don't know if scanning for devices is supposed to be supported when there is currently a handle open; we never do that either.
You guys have mentioned code changes you've made. Are there any patches?
I haven't made any. I just used hid.c from the IOKit branch mentioned in this thread with the new close method posted above.
We got HIDAPI into a usable state using those two changes so I've had to focus on other parts of the application. I did find out that the sleep after hid_read_timeout
was not needed, but we still need a 50 ms sleep after hid_write
and hid_close
. However, this is only needed when we are looking for devices; once we find one, we can toggle the sleeps off without any crashes occurring.
I'm not very happy about having that hack in there and not getting to the bottom of the problem, but unfortunately we have to be a bit pragmatic about where we spend time in order to meet our deadline.
For the hid_close() modifications, I see you took out the unregistering of the report callbacks. Everything that is in there was put in there for a reason, because it was necessary on some version or other of OSX (maybe those reasons are gone on newer versions though)). I seem to remember that when porting to Lion (and I use the term "port" because getting this to run on Lion was a disaster) the input report callback needed to be explicitly unregistered, or else it would keep getting called even after IOHIDDeviceClose(). Maybe that's not necessary anymore.
Three years later... any news on this?
In my HIDAPI clone (PureJavaHidApi) I'm experiencing something similar on Yosemite. My lib is Java but the code logic was lifted from HIDAPI and is almost identical.
If I open a device and unplug/replug it device, then I cannot open it again.
I traced this to the fact that after unplugging IOHIDManagerCopyDevices still returns a list where the device is present even if it has been unplugged. After re-plugging the list then contains to instances of that device.
When the list is then searched the wrong ('old') device is found and naturally opening that fails.
So somehow the device is not properly closed (in my code) but left dangling. I've tried various ways to close the device in the removal call back but without success.
As a work around I changed my open code to use the last matching entry in the list returned by IOHIDManagerCopyDevices, this improves things but unfortunately sometimes the latest plugged in incarnation of the device is not the last in the list.
Anyway, posting this here just in case this issue is still unsolved or someone else can find some use for this titbit of debugging information. And also fishing for solution to my problem of course ;)
Posting this here also for completeness, i.e. while my previous post was about PureJavaHidApi and not signal11/hidapi I'm posting the solution here for posterity in case someone ends up looking for solution to this kind of behaviour.
After spending all spare time for six days finally nailed it to this:
I did not call
IOHIDManagerSetDeviceMatching(hidmgr, null);
before (in device enumeration) calling:
IOHIDManagerCopyDevices(hidmgr);
What an insidious feature in IOKit, without that call everything seems to work, except if you unplug a device then something is left dangling inside IOKit and IOHIDManagerCopyDevices will return the unplugged device among other devices.
br Kusti
OS X 10.8.3, latest version from git, USB device is known to be good (we have been using it for 7 years on Windows).
Our application has a talks to a proprietary USB IR dongle (our 'transport') which in turn talks to a variety of IR devices. Our connect loop looks something like this in pseudo code:
The above sequence happens in a loop with a timeout (around 30 seconds). There are usually multiple writes/reads for each device protocol; the end result being that we send tons of these requests to the HID layer in a short period of time.
We are seeing problems with the hidapi library that I think are related:
It feels to me like the device is not being properly closed or there is a race condition. The stack traces following a crash always include something about HID callbacks, which seems related:
With the hack to activate/deactivate hidapi I saw a different crash which also seems to be related to threading:
There are a number of complications with us attempting to debug this issue:
We're kind of stuck. I think with a combination of hacks we might be able to limp along, but it would be better if we got the bottom of what is happening. Any ideas or suggestions you have would be appreciated.