Closed DanielG closed 9 years ago
Of course now that I try again today I can't reproduce the second issue, so don't quote me on that but I am pretty sure the first one is an actual issue ;)
EDIT: Never mind that, I just managed to reproduce both of those issues.
Have you tried running the version in the git repo? There is support for UUCP lockfiles which should address this problem.
@DanielG can you pastebin or similar an olad -l 4 of the discovery baud rate bug. If that is happening I doubt our UUCP locks will fix that, given its in theory a device you aren't opening.
Sure I'll just leave that running in the background with some logging and report back when it happens again, yesterday I was doing while true; do ola_rdm_discover -u0 -f; done
which seemed to trigger it every few minutes but now it seems to run fine. Maybe ola needs to have been running for a while to trigger it?
If that is happening I doubt our UUCP locks will fix that, given its in theory a device you aren't opening.
I am opening the device with picocom (which does UUCP locking AFAIK) so I guess that would fix it.
olad checks for new devices every 20s at which point it will open the device and perform an ioctl(). I suspect that's what you're seeing - it'll appear in the logs as well.
Simon
I have seen this happen twice now. Looks like the baud rate glitch happens when ola crashes, so maybe a well meaning destructor somewhere that "resets" the baud rate of the tty even though it's not owned by ola.
My responder isn't exactly well behaved yet so It's probably not ola's fault for crashing ;)
[...]
SelectPoller.cpp:160: ss process time was 0.000000
DiscoveryAgent.cpp:223: Hit failure limit for (ff00:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB fe00:00000000 - feff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: fe00:00000000 - feff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (fe00:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB fc00:00000000 - fdff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: fc00:00000000 - fdff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (fc00:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB f800:00000000 - fbff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: f800:00000000 - fbff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (f800:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB f000:00000000 - f7ff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: f000:00000000 - f7ff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (f000:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB e000:00000000 - efff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: e000:00000000 - efff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (e000:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB c000:00000000 - dfff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: c000:00000000 - dfff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (c000:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB 8000:00000000 - bfff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: 8000:00000000 - bfff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000001
DiscoveryAgent.cpp:223: Hit failure limit for (8000:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:230: DUB 0000:00000000 - 7fff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
EnttecUsbProWidget.cpp:343: Sending DUB packet: 0000:00000000 - 7fff:ffffffff
SelectPoller.cpp:160: ss process time was 0.000002
DiscoveryAgent.cpp:223: Hit failure limit for (0000:00000000, ffff:ffffffff)
DiscoveryAgent.cpp:414: top of tree is corrupted
EnttecUsbProWidget.cpp:605: Enttec Pro discovery complete:
Received SIGSEGV or SIGBUS
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(+0x8bb23)[0x7f710eaa8b23]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f710d7a58d0]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rpc10RpcChannel7SendMsgEPNS0_10RpcMessageE+0x32)[0x7f710eb6f9d2]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rpc10RpcChannel15RequestCompleteEPNS0_18OutstandingRequestE+0x101)[0x7f710eb714e1]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(+0x9e741)[0x7f710eabb741]
/usr/lib/x86_64-linux-gnu/libolaserver.so.0(_ZN3ola8Universe17DiscoveryCompleteEPNS_13BaseCallback1IvRKNS_3rdm6UIDSetEEE+0xe8)[0x7f710ee3ea68]
/usr/lib/x86_64-linux-gnu/libolaserver.so.0(+0x54232)[0x7f710ee28232]
/usr/lib/x86_64-linux-gnu/libolaserver.so.0(+0x7107e)[0x7f710ee4507e]
/usr/lib/x86_64-linux-gnu/libolaserver.so.0(+0x5e3a2)[0x7f710ee323a2]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rdm33DiscoverableQueueingRDMController17DiscoveryCompleteERKNS0_6UIDSetE+0xa2)[0x7f710eb39b72]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(+0x11de71)[0x7f710eb3ae71]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro14EnttecPortImpl17DiscoveryCompleteEPNS_13BaseCallback1IvRKNS_3rdm6UIDSetEEEbS7_+0x56)[0x7f710ae89bb6]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(+0x3fd81)[0x7f710ae8bd81]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent13SendDiscoveryEv+0x1432)[0x7f710eb0bdf2]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent13SendDiscoveryEv+0xe2)[0x7f710eb0aaa2]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent14BranchCompleteEPKhj+0x13d)[0x7f710eb1088d]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro14EnttecPortImpl16HandleRDMTimeoutEj+0x16a)[0x7f710ae8830a]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro22EnttecUsbProWidgetImpl11HandleLabelEPNS1_14EnttecPortImplERKNS1_15OperationLabelsEhPKhj+0x118)[0x7f710ae89a38]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro22EnttecUsbProWidgetImpl13HandleMessageEhPKhj+0x60)[0x7f710ae89ae0]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro16BaseUsbProWidget14ReceiveMessageEv+0x26f)[0x7f710ae7ef5f]
/usr/lib/x86_64-linux-gnu/olad/libolausbpro.so.0(_ZN3ola6plugin6usbpro16BaseUsbProWidget15DescriptorReadyEv+0x18)[0x7f710ae7ef88]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola2io27BidirectionalFileDescriptor11PerformReadEv+0xe2)[0x7f710eab54c2]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola2io12SelectPoller16CheckDescriptorsEP6fd_setS3_+0x984)[0x7f710eabea84]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola2io12SelectPoller4PollEPNS0_14TimeoutManagerERKNS_12TimeIntervalE+0x17e)[0x7f710eabf4be]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola2io12SelectServer14CheckForEventsERKNS_12TimeIntervalE+0xe8)[0x7f710eaba968]
/usr/lib/x86_64-linux-gnu/libolacommon.so.0(_ZN3ola2io12SelectServer3RunEv+0xab)[0x7f710eabaa4b]
olad(main+0x691)[0x4032e1]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f710d40eb45]
olad[0x403858]
It shouldn't crash, regardless of what the responder does.
Can you please try the latest code from the git repo. Some of these issues may have already been fixed.
Sure, I'll try it tomorrow.
Still happening with git master and the baud rate is mangled again.
plugins/usbpro/EnttecUsbProWidget.cpp:841: Enttec received label 12, length 0
common/rdm/DiscoveryAgent.cpp:211: BranchComplete, got 0
common/rdm/DiscoveryAgent.cpp:190: Hit failure limit for (c000:00000000, ffff:ffffffff)
common/rdm/DiscoveryAgent.cpp:197: DUB 8000:00000000 - bfff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
plugins/usbpro/EnttecUsbProWidget.cpp:331: Sending DUB packet: 8000:00000000 - bfff:ffffffff
common/io/EPoller.cpp:306: ss process time was 0.000002
plugins/usbpro/EnttecUsbProWidget.cpp:841: Enttec received label 12, length 0
common/rdm/DiscoveryAgent.cpp:211: BranchComplete, got 0
common/rdm/DiscoveryAgent.cpp:190: Hit failure limit for (8000:00000000, ffff:ffffffff)
common/rdm/DiscoveryAgent.cpp:197: DUB 0000:00000000 - 7fff:ffffffff, attempt 1, uids found: 0, failures 0, corrupted 0
plugins/usbpro/EnttecUsbProWidget.cpp:331: Sending DUB packet: 0000:00000000 - 7fff:ffffffff
common/io/EPoller.cpp:306: ss process time was 0.000002
plugins/usbpro/EnttecUsbProWidget.cpp:841: Enttec received label 12, length 0
common/rdm/DiscoveryAgent.cpp:211: BranchComplete, got 0
common/rdm/DiscoveryAgent.cpp:190: Hit failure limit for (0000:00000000, ffff:ffffffff)
common/rdm/DiscoveryAgent.cpp:387: Discovery tree is corrupted
plugins/usbpro/EnttecUsbProWidget.cpp:583: Enttec Pro discovery complete:
Received Segmentation fault
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(+0x97ba3)[0x7fde6c3bdba3]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fde6b0ac8d0]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rpc10RpcChannel7SendMsgEPNS0_10RpcMessageE+0x1f)[0x7fde6c4574af]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rpc10RpcChannel15RequestCompleteEPNS0_18OutstandingRequestE+0xe1)[0x7fde6c458f51]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(+0xab8f1)[0x7fde6c3d18f1]
/home/dxld/var/build/ola/olad/.libs/libolaserver.so.0(_ZN3ola8Universe17DiscoveryCompleteEPNS_13BaseCallback1IvRKNS_3rdm6UIDSetEEE+0xc8)[0x7fde6c75cdf8]
/home/dxld/var/build/ola/olad/.libs/libolaserver.so.0(+0x55c12)[0x7fde6c74dc12]
/home/dxld/var/build/ola/olad/.libs/libolaserver.so.0(+0x6b0ae)[0x7fde6c7630ae]
/home/dxld/var/build/ola/olad/.libs/libolaserver.so.0(+0x5b702)[0x7fde6c753702]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rdm33DiscoverableQueueingRDMController17DiscoveryCompleteERKNS0_6UIDSetE+0xa2)[0x7fde6c42b1e2]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(+0x106a41)[0x7fde6c42ca41]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro14EnttecPortImpl17DiscoveryCompleteEPNS_13BaseCallback1IvRKNS_3rdm6UIDSetEEEbS7_+0x47)[0x7fde68373a77]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(+0x40ca1)[0x7fde68375ca1]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent13SendDiscoveryEv+0x147d)[0x7fde6c3fd5cd]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent13SendDiscoveryEv+0xd9)[0x7fde6c3fc229]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola3rdm14DiscoveryAgent14BranchCompleteEPKhj+0xc0)[0x7fde6c401fc0]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro14EnttecPortImpl16HandleRDMTimeoutEj+0x14a)[0x7fde6837163a]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro22EnttecUsbProWidgetImpl11HandleLabelEPNS1_14EnttecPortImplERKNS1_15OperationLabelsEhPKhj+0xe0)[0x7fde68372e40]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro22EnttecUsbProWidgetImpl13HandleMessageEhPKhj+0xab)[0x7fde68372f2b]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro16BaseUsbProWidget14ReceiveMessageEv+0x24b)[0x7fde68367d8b]
/home/dxld/var/build/ola/plugins/usbpro/.libs/libolausbpro.so.0(_ZN3ola6plugin6usbpro16BaseUsbProWidget15DescriptorReadyEv+0x18)[0x7fde68367da8]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola2io27BidirectionalFileDescriptor11PerformReadEv+0xc2)[0x7fde6c3ca692]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola2io7EPoller15CheckDescriptorEP11epoll_eventPNS0_9EPollDataE+0x5b)[0x7fde6c3d67cb]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola2io7EPoller4PollEPNS0_14TimeoutManagerERKNS_12TimeIntervalE+0x223)[0x7fde6c3d6c03]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola2io12SelectServer14CheckForEventsERKNS_12TimeIntervalE+0xd8)[0x7fde6c3d0b98]
/home/dxld/var/build/ola/common/.libs/libolacommon.so.0(_ZN3ola2io12SelectServer3RunEv+0x83)[0x7fde6c3d0c43]
/home/dxld/var/build/ola/olad/.libs/lt-olad(main+0x6b5)[0x403565]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fde6ad15b45]
/home/dxld/var/build/ola/olad/.libs/lt-olad[0x403741]
Sorry @DanielG I meant if you've filtered to enttec_* then if it's still opening something else somehow, it may well be doing it in error and ignoring the UUCP too.
Can we see the whole olad log please (up to the start of a discovery, even if it doesn't crash will be sufficient)?
I'm guessing the callback is failing due to the corrupt tree: https://github.com/OpenLightingProject/ola/blob/master/plugins/usbpro/EnttecUsbProWidget.cpp#L585
Sure, http://pastebin.com/F87qR83U. I got the responded to behave now so the discovery flood I was getting before isn't there anymore.
Should should be resolved with the latest changes in git. Please let us know if it isn't.
In it's default configuration ola 0.9.1 from Debian jessie periodically takes over tty devices I use for debugging (I'm developing an RDM responder atm.). I had to disable all plugins except the one I'm using for my DMX interface (ENTTEC DMX USB PRO Mk2) and mess with udev to get it to stop.
The symptom of ola's hostile takeover is that the picocom running on my debug tty just closes on me every so often and I have to restart it, I don't remember the error off hand but I can revert my config and reproduce it if it's important.
For reference my
/etc/udev/rules.d/90-enttec.rules
and
/etc/ola/ola-usbserial.conf
Now this is all nice and well and but now I have another even stranger problem. Whenever I run
ola_rdm_discover
ola seems to somehow change the baud rate on a non enttec ttyUSB device!I have to run
stty <baud-rate> --file=/dev/ttyUSBX
to get it to go back to normal.This is quite annoying as I'm using that tty to debug a crash on my device that happens during discovery :/