monome / norns

norns is many sound instruments.
http://monome.org
GNU General Public License v3.0
630 stars 145 forks source link

grid/arc insertion/removal management #11

Closed tehn closed 7 years ago

tehn commented 7 years ago

we need a sensible approach to grid connection and removal mid-process.

furthermore we might need to consider multi-grid and grid+arc management. the goal is regularity, where plug order doesn't matter. this may require some sort of persistent indexing or hierarchy, auto-generated and stored in a config file.

i will update this issue with some use cases.

tehn commented 7 years ago

or for a first pass we could limit the system to one grid and one arc.

but hotswapping is absolutely needed.

catfact commented 7 years ago

turns out it's easy to get basic hotswapping with libudev: https://gist.github.com/catfact/5ec9ce9af77c0185c88026c52f38b98d

i figure we just want one routine somewhere checking usb device changes and taking appropriate action for device types.

however, it's not so clear that we'll be able to shut down libmonome event loop on disconnect, before it has a chance to emit any spurious events. (which it will definitely do if no action is taken)

catfact commented 7 years ago

ok, udev monitoring basically works, pushed here: https://github.com/catfact/norns/tree/udev-monitor

however, there's a big caveat, as i feared: when used directly with a serial device, libmonome behaves badly when its file descriptor goes away (device unplugged.) it just continues to read the FD, or something, resulting in undefined behavior (emitting spurious garbage or segfaulting, depending on the whims of chance.[*])

here are some possible solutions:

  1. don't use libmonome, use serialosc. yet another OSC server in the mix; also, serialosc starts and kills a whole OSC server thread on each device connect/disconnect, which we then have to watch for. blech.
  2. fix libmonome's posix event loop to behave better. (it's not obvious at first glance how to do so; i'd think checking the return value of select would be enough, but i never see any perror output. ugh.)

i'll make some attempt at no.2 since it would be cleanest for our application, and of general benefit for the monome development community.


[*] a tangential issue: maiden, the parent process, isn't aware when its child process goes down from a segfault. of course we don't want this happening, but if it does happen we should be able to inform the user.

catfact commented 7 years ago

ok, couple of things:

here's the gist

tehn commented 7 years ago

you've possibly observed this already, but i believe serialosc also uses udev

On Fri, Mar 10, 2017 at 1:34 AM, ezra buchla notifications@github.com wrote:

ok, couple of things:

-

i am now not sure what is going on with the segfaults and spurious callbacks. they never happen with the libmonome basic example https://github.com/monome/libmonome/blob/master/examples/simple.c. and the symptoms have been variable in matron - sometimes, as in the example, nothing happens on disconnect (the monome event loop just blocks forever), sometimes there are segfaults and spurious callbacks. the behavior changes between clean builds with code changes, so this looks like clobbered memory to me. matron is not doing anything that the example isn't doing, so i am still tentatively assuming this is a libmonome bug.

anyways, i did put together a small test / proof of concept for detecting disconnects. i was reminded that at least for FTDI devices on posix (or anything using the usb-serial driver i think,) the behavior is that reads on the device continue to be valid, but the length of each read is zero bytes (handler has been replaced with a dummy.) this condition should never occur otherwise, so when we see it we can go back to waiting for a valid file descriptor (checking /dev/ttyUSBx in the example; in norns i would wait for an "add" event from udev.)

here's the gist https://gist.github.com/catfact/134954d19f18db85b21dbe3e58cd9f93

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/catfact/norns/issues/11#issuecomment-285590121, or mute the thread https://github.com/notifications/unsubscribe-auth/AAPEcF7YDZCDgGbRb6trpF8LhsT0q3Juks5rkO8OgaJpZM4MTq45 .

catfact commented 7 years ago

yeah it does.

my concern isn't really how to detect the disconnect per se, it's that libmonome's event loop does weird stuff when the device goes away. (which is weird, because it should be easy to catch errors or zero-byte reads as demonstrated in the gist above.) i'm assuming it's a clobbered memory problem that doesn't happen to have any ill effects within serialosc. but here it sometimes brings down the whole matron process...

i've finally managed to catch a stacktrace; it's not that helpful but does pinpoint the problem to that function:

Thread 3 "matron" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff597f700 (LWP 16792)]
0x00007ffff72516bf in monome_event_loop () from /usr/local/lib/libmonome.so.1
(gdb) bt
#0  0x00007ffff72516bf in monome_event_loop () from /usr/local/lib/libmonome.so.1
#1  0x0000000000403c23 in m_run (p=0x0) at src/m.c:19
#2  0x00007ffff7776454 in start_thread () from /usr/lib/libpthread.so.0
#3  0x00007ffff6f997df in clone () from /usr/lib/libc.so.6
(gdb) 

(NB: the null pointer argument to m_run is on purpose; required by pthreads)

i might try linking a static debug build of libmonome so i can drill down deeper.

catfact commented 7 years ago

ok, now running debug libmonome:

Thread 3 "matron" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff597f700 (LWP 17849)]
0x00007ffff72516bf in monome_event_loop (monome=0x61b170) at ../src/platform/posix.c:219
219         if( !handler->cb )
(gdb) bt
#0  0x00007ffff72516bf in monome_event_loop (monome=0x61b170) at ../src/platform/posix.c:219
#1  0x0000000000403c1e in m_run (p=0x0) at src/m.c:19
#2  0x00007ffff7776454 in start_thread () from /usr/lib/libpthread.so.0
#3  0x00007ffff6f997df in clone () from /usr/lib/libc.so.6
(gdb) l
214 
215         if( !monome->next_event(monome, &e) )
216             continue;
217 
218         handler = &monome->handlers[e.event_type];
219         if( !handler->cb )
220             continue;
221 
222         handler->cb(&e, handler->data);
223     } while( 1 );
(gdb) p handler
$1 = (monome_callback_t *) 0xf59e121a8
(gdb) p handler->cb
Cannot access memory at address 0xf59e121a8
(gdb) 

so monome->next_event(), which does the actual read loop, should probably return zero when the device is disconnected, and loop back to select(). but evidently it doesn't, ends up looking up the handler for a bogus event. (and indeed, e.event_type = 4120377088, which doesn't look good.)

i'll try and figure out exactlty where the next_event pointer is taking us (guessing here) and keep poking....

catfact commented 7 years ago

ok yeah... monome->next_event() is a twisty path, but in the end it returns -1 for a disconnected device (at least with linux+posix+mext), and !(-1) == 0, so line 215 is not a good check. i'll see if that fixes it

catfact commented 7 years ago

ok yeah, that fixes it for me.

opened this PR on libmonome, maybe will or nedko have more ideas. https://github.com/monome/libmonome/pull/48

tehn commented 7 years ago

wow, thank you for finding this! wondering if that'll stabilize serialosc as well.

On Fri, Mar 10, 2017 at 3:23 PM ezra buchla notifications@github.com wrote:

ok yeah, that fixes it for me.

opened this PR on libmonome, maybe will or nedko have more ideas. monome/libmonome#48 https://github.com/monome/libmonome/pull/48

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/catfact/norns/issues/11#issuecomment-285775045, or mute the thread https://github.com/notifications/unsubscribe-auth/AAPEcO5BkwmfviWhDyh9JCaVG8AFUjwtks5rkbFBgaJpZM4MTq45 .

catfact commented 7 years ago

ok, there is enough working to do the following, should be a quick task (i'll get to it a little later)

add_device = function( name, type, cols, rows...) if type == 'grid' then monomes[name] = { ['press'] = function (name, x, y) ... end , ['lift'] = function (name, x, y) ... end , ['cols'] = cols, ['rows'] = rows, } else arcs[name] = { ... } end end



and the c event callback pushes the name along with (x,y) &c.

then it is easy to add convenience stuff in lua to sort, store, set defaults...
tehn commented 7 years ago

this sounds great.

a couple requests:

of course i'll go into more detail, but i wanted to ensure you didn't dive in with a bunch of work that i/we might be undoing later

On Fri, Mar 10, 2017 at 5:38 PM, ezra buchla notifications@github.com wrote:

ok, there is enough working to do the following, should be a quick task (i'll get to it a little later)

-

set up tty monitor, filter on usb-serial. on add, send the device path to monome_open and start an event loop thread. on remove, stop the corresponding thread. (this is exactly what serialosc does, but we don't need separate child processes for everything.)

after monome_open we have some device metadata, which we can send to lua. lua keeps a table something like

grids = {} arcs = {}

add_device = function( name, type, cols, rows...) if type = grid then monomes[name] = { ["press"] = function (name, x, y) ... end , ["lift"] = function (name, x, y) ... end , ["cols"] = cols, ["rows"] = rows, } else arcs[name] = { ... } end end

and the c event callback pushes the name along with (x,y) &c.

then it is easy to add convenience stuff to sort, store, set defaults...

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/catfact/norns/issues/11#issuecomment-285804573, or mute the thread https://github.com/notifications/unsubscribe-auth/AAPEcObyfwp8RdkbnVwPsK17-gy-TFZxks5rkdDsgaJpZM4MTq45 .

tehn commented 7 years ago

to elaborate on the last point (or perhaps simplify) a good start would just be to have access to a raw array of bytes for the grid LEDs, and then a update function.

this has of course gotten off-topic from the original issue, i apologize

catfact commented 7 years ago

see PR #23

catfact commented 7 years ago

since this issue thread is very long and is about hotplugging / device managemetn, i'm gonna close it. the infrastructure for those features seems solid now. i haven't actually added arc handlers (i've no way to test them), but that should be straightforward.