[FR] Ability to run inkvt concurrently with other programs (e.g. plato)

rien333 commented 4 years ago

One thing I miss from fbpad-eink is the ability to run a terminal session along with Plato (i.e. ereader software). I gather that this ability was not really your intention to implement, but I still miss it.

With fbpad-eink, I could fairly easily switch between, say, a book and a terminal: if I typed in fbpad or refreshed the screen with ctrl+l, I would see a terminal, and if I then refreshed the screen in Plato using touch gestures, I would return to the book. It didn't work flawlessly, but it was interesting and somewhat usable nonetheless.

inkvt can already sort of do this, but as we discussed in an earlier issue (https://github.com/llandsmeer/inkvt/issues/5), sometimes my kobo just freezes when both Plato and inkvt are running.

If you want, I would be happy to provide debugging info (my inkvt session in the terminal outputs nothing upon freezing, but dmesg repeatedly shows imx_epdc_v2_fb 20f4000.epdc: collision detected, can not do REAGl/-D)

My own use cases for this would be controlling my music from my kobo (I use cmus, but I don't like using my computer while reading, because it often distracts me. Currently, ssh'ing and then running cmus already works fine, but often freezes after a while). Moreover, it might be nice to take notes while reading (e.g. using ssh+emacs).

Maybe there is a way to just completely steal the screen from Plato, and then return it when inkvt quits? (as a sort of hacky solution to any Plato-inkvt interference).

Not sure how difficult this would be, but maybe someone, someday will find the time to implement this.

llandsmeer commented 4 years ago

I thought that part of the solution is already implemented (e.g., see a531c6a51371ef6198f6e1ba4c92613a7e683965). If you call inkvt without --no-reinit, it should run fine on top of Plato. Are you using the latest version?

If it still crashes with the latest version, something is definitely wrong. Maybe one 'trivial' solution, is sending SIGSTOP to Plato on start and SIGCONT on exit via kill. That should prevent Plato from messing with the framebuffer state, but maybe it corrupts the event queues.

For saving the screen on start, and restoring on exit, that's something that's not implemented yet. A simple memcpy would do for simple cases, but with Plato changing the screen orientation in the background some more logic should be implemented. I'm kind of busy right now, but maybe that's something I can do in the future.
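
A rough sketch of what the save/restore could look like, including the "more logic" for rotation: the snapshot remembers the layout it was taken under and refuses to restore if the layout has changed since. `FbGeometry` and `FbSnapshot` are hypothetical names; on a device the geometry would presumably come from the framebuffer ioctls rather than being filled in by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of the framebuffer layout (assumption: on a real
// device these values would be read from the fb var/fix screeninfo).
struct FbGeometry {
    std::uint32_t xres;
    std::uint32_t yres;
    std::uint32_t bits_per_pixel;
    std::uint32_t rotate;
    std::uint32_t line_length;  // bytes per scanline

    bool operator==(const FbGeometry& o) const {
        return xres == o.xres && yres == o.yres &&
               bits_per_pixel == o.bits_per_pixel &&
               rotate == o.rotate && line_length == o.line_length;
    }
};

class FbSnapshot {
public:
    // Copy the visible part of the framebuffer (line_length * yres bytes).
    void save(const std::uint8_t* fb, const FbGeometry& geo) {
        geo_ = geo;
        const std::size_t bytes =
            static_cast<std::size_t>(geo.line_length) * geo.yres;
        pixels_.assign(fb, fb + bytes);
    }

    // Copy the pixels back, but only if the layout is unchanged since save().
    // Returns false (and leaves fb untouched) otherwise.
    bool restore(std::uint8_t* fb, const FbGeometry& current) const {
        if (pixels_.empty() || !(current == geo_)) {
            return false;
        }
        std::memcpy(fb, pixels_.data(), pixels_.size());
        return true;
    }

private:
    FbGeometry geo_{};
    std::vector<std::uint8_t> pixels_;
};
```

When `restore` returns false, the safer fallback is probably to let the other program repaint itself instead of copying stale pixels into a differently shaped buffer.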

Thanks for the feature request. It would definitely be a nice upgrade :)

rien333 commented 4 years ago

If you call inkvt without --no-reinit, it should run fine on top of Plato. Are you using the latest version?

Yes, both programs are fully updated. I thought I tried --no-reinit, but let me try again and see what difference that makes.

If it still crashes with the latest version, something is definitely wrong. Maybe one 'trivial' solution, is sending SIGSTOP to Plato on start and SIGCONT on exit via kill. That should prevent Plato from messing with the framebuffer state, but maybe it corrupts the event queues.

Gonna try this one too.

rien333 commented 4 years ago

--no-reinit doesn't seem to make that much of a difference. Freezes occur exactly as before, and there are no special messages in dmesg, only the recurrent imx_epdc_v2_fb 20f4000.epdc: collision detected, can not do REAGl/-D (which also happens without --no-reinit).

Maybe one 'trivial' solution, is sending SIGSTOP to Plato on start and SIGCONT on exit via kill

I tried this, and although Plato handled the SIGSTOP (i.e. kill -19 PID, right?) fine, running inkvt --no-reinit on top eventually still caused a freeze.

rien333 commented 4 years ago

Okay, fairly stupid, but I think this could very well be a bug in Plato itself (see https://github.com/baskerville/plato/issues/129). I always assumed it came from inkvt, though, idk why. This bug in Plato was introduced around the time of this issue, and its profile fits this issue quite well. It's also somewhat hard to trigger, in the sense that it requires some conditions to be met, which might be why I only observed the freeze in this issue after using inkvt for a bit.

rien333 commented 4 years ago

nvm, this issue still seems to happen, though it's maybe related to said issue in plato.

NiLuJe commented 4 years ago

You've failed to mention which device you're using (or you did in the previous issue and I've since forgotten ;p).

But, yeah, on Mk. 7, here be dragons when running @ 8bpp and rota 0 (and possibly, other non-standard rotations).

Given the update patterns sent, InkVT has far more chances of triggering EPDC deadlocks this way than Plato ever did ;).

(I've still never managed to get a shell during the rare times I can kill my Forma this way, so, I'm still not entirely sure what those are, exactly, especially since Mk. 7 kernels should include the TCE underrun protection patches, so those might actually be harder crashes than what I originally suspected).

rien333 commented 4 years ago

I'm using the Kobo Clara HD. The firmware I have running on it is quite old, actually. Summer 2019, when I bought it.

NiLuJe commented 4 years ago

Ah, that's also potentially an issue, as there have been numerous kernel fixes over the years (especially around and after the Forma's release).

i.e., the latest (and probably final, knowing NTX...) Forma kernel was built in December 2019.

rien333 commented 3 years ago

Ah, that's also potentially an issue, as there have been numerous kernel fixes over the years

I've just upgraded to 4.25.15875 (the newest firmware version as of now), and inkvt froze after using it for a minute or so (I've observed the freeze both in the main menu while in portrait mode, and while opening a PDF in landscape mode).

Like before (#5), inkvt outputted something like this in the terminal upon freezing:

[FBInk] MXCFB_SEND_UPDATE_V2: Invalid argument!
[FBInk] update_region={top=8, left=0, width=1264, height=1664}!
[FBInk] Failed to refresh the screen!

The only outstanding message (non-RTL871X related) in dmesg is:

imx_epdc_v2_fb 20f4000.epdc: Timed out waiting for update completion

This dmesg line appears every time I try to run ./fbink 'hello world' after inkvt has frozen.

Compared to #5, there's no emission of imx_epdc_v2_fb 20f4000.epdc: collision detected, can not do REAGl/-D anymore.

NiLuJe commented 3 years ago

Well, that's consistent with a crashed EPDC ;).

What's your device, again (EDIT: Clara, duh)? And how was InkVT launched, and what else is currently running? And what was the context for the log snippet you posted (rotation & bitdepth).

I can't actually reproduce this on a Forma, but you might see if a ./fbink -c -W INIT manages to get through to it. (Otherwise, it might revive after a full powerdown of the EPDC, but I don't think there's an explicit ioctl for that, only MXCFB_GET_PWRDOWN_DELAY).

NiLuJe commented 3 years ago

Also mildly concerned about what exactly you were doing, because width=1264, height=1664 obviously doesn't fit a Clara ;p.

NiLuJe commented 3 years ago

And, in the off-chance the region overflow is not a red-herring (after all, the only other definitely reproducible EPDC deadlocks I've ever found were related to empty/tiny regions), I'd like to see why it's happening in the first place.
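
For illustration, a sanity check of the kind that would flag both cases (overflowing and empty regions) before a refresh is sent. This is only a sketch: `Region` and `region_fits` are hypothetical names, and this is not how FBInk validates regions internally.

```cpp
#include <cstdint>

// Hypothetical region type, using the same fields as the log output above.
struct Region {
    std::uint32_t top;
    std::uint32_t left;
    std::uint32_t width;
    std::uint32_t height;
};

// Returns true if the region is non-empty and lies entirely on a screen of
// screen_w x screen_h pixels.
bool region_fits(const Region& r, std::uint32_t screen_w, std::uint32_t screen_h) {
    if (r.width == 0 || r.height == 0) {
        return false;  // empty regions are rejected as well
    }
    // Sum in 64 bits so that left + width cannot wrap around.
    const std::uint64_t right = std::uint64_t{r.left} + r.width;
    const std::uint64_t bottom = std::uint64_t{r.top} + r.height;
    return right <= screen_w && bottom <= screen_h;
}
```

With the 1072x1448 geometry of a Clara HD, `region_fits(Region{8, 0, 1264, 1664}, 1072, 1448)` is false, which is why that logged region looks suspicious.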

As it's probably a SNAFU in FBInk's grid_to_region's insanity, a verbose build might help:

diff --git a/src/vterm.hpp b/src/vterm.hpp
index a601aa0..913f492 100644
--- a/src/vterm.hpp
+++ b/src/vterm.hpp
@@ -490,8 +490,8 @@ public:
         fbink_init(fbfd, &config);
         fbink_cls(fbfd, &config, nullptr);
         fbink_get_state(&config, &state);
-        config.is_quiet = true;
-        config.is_verbose = false;
+        config.is_quiet = false;
+        config.is_verbose = true;
         fbink_update_verbosity(&config);

         // None of the dithering mechanisms deal very well with tiny refresh regions, so,

NiLuJe commented 3 years ago

Fair warning: invest in a very large terminal scroll log history, because it's going to be super verbose ^^.

rien333 commented 3 years ago

And how was InkVT launched?

First, I launch Plato from the Nickel (stock) home screen. Then, I log into my Kobo using ssh and run inkvt (with Plato still active). I've tried launching inkvt with and without --no-reinit, but that doesn't make much of a difference: inkvt and my whole screen freeze after using it for some amount of time.

And what else is currently running?

Looking at htop, the programs that are running after inkvt has frozen include Plato, kfmon, ssh (dropbear), inkvt itself (it doesn't crash, it only freezes, see #5), and some system services (wpa_supplicant, dhcpcd, dbus-daemon, etc.). inkvt works fine when launched from the Nickel home screen; the point of this issue is to make it work alongside Plato.

FWIW, plato is updated to the newest version as well.

And what was the context for the log snippet you posted (rotation & bitdepth).

Is this what you mean with context? This is the output after I run inkvt. Note that the freeze occurs regardless of my device's orientation.

[root@kobo inkvt]# ./inkvt.armhf --no-http --no-timeout --no-reinit
[FBInk] Detected a Kobo Clara HD (376 => Nova @ Mark 7)
[FBInk] Enabled Kobo Mark 7 quirks
[FBInk] Clock tick frequency appears to be 100 Hz
[FBInk] Screen density set to 300 dpi
[FBInk] Variable fb info: 1072x1448, 32bpp @ rotation: 3 (Counter Clockwise, 270°)
[FBInk] Fontsize set to 16x32 (Terminus base glyph size: 8x16)
[FBInk] Line length: 67 cols, Page size: 45 rows
[FBInk] Horizontal fit is perfect!
[FBInk] Vertical fit isn't perfect, shifting rows down by 4 pixels
[FBInk] Fixed fb info: ID is "mxc_epdc_fb", length of fb mem: 6782976 bytes & line length: 4352 bytes
[FBInk] Pen colors set to #000000 for the foreground and #FFFFFF for the background

Looking at what FBInk reports the rotation to be, there's something odd happening perhaps? Pinging @llandsmeer. Say I start Plato, then my device will start out with:

$ ./inkvt.armhf --no-http --no-timeout --no-reinit
[FBInk] Variable fb info: 1072x1448, 32bpp @ rotation: 3 (Counter Clockwise, 270°)

If I then close inkvt, open a PDF in Plato, and go into 90° rotation using Plato's main menu, I'll get the following rotation output:

$ ./inkvt.armhf --no-http --no-timeout --no-reinit
[FBInk] Variable fb info: 1448x1072, 32bpp @ rotation: 0 (Upright, 0°)

I can't actually reproduce this on a Forma

In my experience, most freezes occur during heavy, fast drawing. I can quickly trigger a crash if I do something as follows:

1. Log into my main machine from inkvt.
2. Start a tmux session with an ncurses program in one window (htop, cmus), and cmatrix in another window (C-b c in tmux). The nice thing about cmatrix is that it tries to constantly fill and update your screen with new characters. It can also try to draw at really fast speeds, esp. if you press the 1 key. If you press r, it also uses a flashy rainbow effect. In my observations, such rapid updates make inkvt more prone to freezing.
3. Play around with tmux, cmatrix, and the ncurses program for a while (shouldn't be much more than a minute in my experience). As you probably know, you can switch between windows in tmux with C-b n. I tend to observe crashes either when cmatrix is running, or when switching back and forth between tmux windows.

you might see if a ./fbink -c -W INIT

Still frozen, unfortunately.

Also mildly concerned about what exactly you were doing, because width=1264, height=1664 obviously doesn't fit a Clara ;p.

Yeah, odd. tbh, that log was lifted from #5, which is from a while back. I did observe that exact error/warning message, but lost the exact contents, so I just pasted what I had from #5. Sorry for the noise.

Speaking of warnings and error messages, after numerous attempts, I'm not seeing this message after freezes anymore (iirc, I did see this most of the time yesterday):

[FBInk] MXCFB_SEND_UPDATE_V2: Invalid argument!
[FBInk] update_region={top=8, left=0, width=1264, height=1664}!
[FBInk] Failed to refresh the screen!

Inexplicably, compared to yesterday, the following messages have returned to dmesg:

imx_epdc_v2_fb 20f4000.epdc: TCE underrun! Will continue to update panel
imx_epdc_v2_fb 20f4000.epdc: collision detected, can not do REAGl/-D

A verbose build might help

I'll try this later today!

NiLuJe commented 3 years ago

You can't run under Plato with --no-reinit. That'll obviously break at some point. Because as you've noticed, Plato will rotate. If you don't let InkVT react to that, stuff breaks.
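
As a sketch of what "reacting" means here: compare the layout reported by the framebuffer before drawing, and redo the setup whenever it differs from the last known one. `ScreenLayout` and `LayoutWatcher` are hypothetical names used for illustration only; the callback stands in for whatever re-initialization InkVT actually performs.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical subset of the framebuffer state that matters for drawing.
struct ScreenLayout {
    std::uint32_t xres;
    std::uint32_t yres;
    std::uint32_t bits_per_pixel;
    std::uint32_t rotate;
};

inline bool same_layout(const ScreenLayout& a, const ScreenLayout& b) {
    return a.xres == b.xres && a.yres == b.yres &&
           a.bits_per_pixel == b.bits_per_pixel && a.rotate == b.rotate;
}

class LayoutWatcher {
public:
    LayoutWatcher(const ScreenLayout& initial,
                  std::function<void(const ScreenLayout&)> on_change)
        : known_(initial), on_change_(std::move(on_change)) {}

    // Call with the freshly queried layout before drawing. Returns true if
    // the layout changed, after invoking the re-initialization callback.
    bool poll(const ScreenLayout& current) {
        if (same_layout(known_, current)) {
            return false;
        }
        known_ = current;
        if (on_change_) {
            on_change_(current);
        }
        return true;
    }

private:
    ScreenLayout known_;
    std::function<void(const ScreenLayout&)> on_change_;
};
```

The two states quoted earlier in the thread (1072x1448 at rotation 3, then 1448x1072 at rotation 0) would trip such a check once.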

A TCE underrun is another known errata of the EPDC that miiiight be worked around on some kernels, but might not.

The collisions, on the other hand, are completely "normal", given the refresh patterns done by InkVT (e.g., even refreshing a single line, you'd most likely trip a collision check at least once).

rien333 commented 3 years ago

Hm, noted. Thanks! Also, I noticed that not everything (e.g. fbink) was as recent as it could be (though it was). I'll try a fresh build before testing further.

rien333 commented 3 years ago

A completely up to date build didn't fix anything, unfortunately.

As it's probably a SNAFU in FBInk's grid_to_region's insanity, a verbose build might help:

That's a verbose build indeed. Anything I should be looking for? Just any non-ordinary lines? So far, most lines fit the same pattern. The last lines before the crash were:

Char 1 out of 1 is @ byte offset 0 and is U+0020 ( ) 
# then it tries to draw another space
waveform_mode is now 0x101 (AUTO)
[FBInk] Foreground pen color set to #7F7F7FFF (grayscaled: #7F)
[FBInk] Background pen color set to #1F1F1FFF (grayscaled: #1F)
Adjusted position: column 41, row 2
Final position: column 41, row 2
Need 1 lines to print 1 characters over 26 available columns
Line 1 (of ~1), previous line was 0 characters long and there were 1 characters left to print
Characters to print: 1 out of the 1 remaining ones
Line takes up 1 bytes
snprintf wrote 1 bytes
Printing ` ` @ line offset 0 (meaning row 2)
Character count: 1 (over 1 bytes)
Adjusting vertical pen position by 0 pixels, as requested, plus 4 pixels, as mandated by the native viewport
Region: top=68, left=656, width=16, height=32
Char 1 out of 1 is @ byte offset 0 and is U+0020 ( )

I could also send you a complete log of everything up until the freeze.

NiLuJe commented 3 years ago

Yep, a full log might help ;).

NiLuJe commented 3 years ago

(Because at least the final refresh is perfectly innocuous there ;)).

Also, just to double-check, do extra, manual fbink calls still trigger a Timed out waiting for update completion kernel log? (the timeout is set at 5s, although that one is internal, so it may be longer or shorter).

rien333 commented 3 years ago

Also, just to double-check, do extra, manual fbink calls still trigger a Timed out waiting for update completion kernel log?

Actually, if I leave my Kobo frozen, dmesg outputs Timed out waiting for update completion every few (3-6) seconds or so, so it's kinda difficult to see if that message is actually related to my fbink call (dmesg doesn't emit any timestamps for me). Still, after some testing and fiddling with dmesg, I'm almost certain calling fbink does still trigger the "Timed out" message (as you seem to expect).

rien333 commented 3 years ago

Yep, a full log might help ;).

While running sort -u on this file seemingly outputs no odd lines, maybe you'll see something (see the ZIP file attached below).

Note that right before the line [1]+ Stopped ./inkvt.armhf --no-http --no-timeout, I pressed C-z. At that point, all output from inkvt had suddenly stopped and the UI froze (i.e. the freeze happened right after the line Region: top=36, left=768, width=16, height=32).

This dmesg log was generated from running cmatrix (and htop, IIRC?) under tmux.

log.zip

NiLuJe commented 3 years ago

Assuming you've got my bundle of crap installed, you can run klogd once so that the kernel log ends up in the syslog, where it'll magically gain sentience^Wtimestamps ;p.

NiLuJe commented 3 years ago

Which reminds me that we can also send FBInk's log to the syslog, which might be moderately helpful in this instance...

diff --git a/src/vterm.hpp b/src/vterm.hpp
index a601aa0..0ee202c 100644
--- a/src/vterm.hpp
+++ b/src/vterm.hpp
@@ -487,11 +487,12 @@ public:
         }
         config.fontname = get_font(fontname);
         config.fontmult = fontmult;
+        config.to_syslog = true;
         fbink_init(fbfd, &config);
         fbink_cls(fbfd, &config, nullptr);
         fbink_get_state(&config, &state);
-        config.is_quiet = true;
-        config.is_verbose = false;
+        config.is_quiet = false;
+        config.is_verbose = true;
         fbink_update_verbosity(&config);

         // None of the dithering mechanisms deal very well with tiny refresh regions, so,

EDIT: Oh, and for the manual calls via the CLI frontend, just pass -G.

NiLuJe commented 3 years ago

Because yeah, nothing really out of the ordinary in that log ;).

NiLuJe commented 3 years ago

Also, just in case: is Plato doing anything during the inkvt session?

rien333 commented 3 years ago

Also, just in case: is Plato doing anything during the inkvt session?

Nope, I can trigger the freeze while leaving Plato "idle" (i.e. not directly interacting with it). Plato does of course update the battery status, digital clock, and perhaps other background things.

Anecdotally, I've found that (heavily) interacting with Plato (sending touch events and the like) does increase the probability of a freeze occurring.

NiLuJe commented 3 years ago

Might be a marker collision between the two then... It's exceedingly unlikely/unlucky, but given the high rate of refreshes here, eeeeh, maybe.

I don't quite recall how Plato numbers update markers, but it's liable to do it the exact same way as FBInk: 1 to uint32_max.

NiLuJe commented 3 years ago

Except FBInk actually seeds the first marker with the PID instead of starting at 1, which makes this even unlikelier...
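
To put rough numbers on "unlikelier", here is a toy model of the numbering scheme as described in this thread (markers run from 1 to uint32_max, and the first one is seeded with the PID). `MarkerSequence` and `updates_until` are hypothetical and are not FBInk's actual code; whether identical markers from two processes really confuse the EPDC is still an open question here.

```cpp
#include <cstdint>

// Toy model of an update-marker counter: values cycle through
// 1..UINT32_MAX, and 0 is never handed out.
class MarkerSequence {
public:
    explicit MarkerSequence(std::uint32_t seed) : next_(seed == 0 ? 1 : seed) {}

    std::uint32_t next() {
        const std::uint32_t marker = next_;
        next_ = (next_ == UINT32_MAX) ? 1 : next_ + 1;
        return marker;
    }

private:
    std::uint32_t next_;
};

// How many markers a sequence starting at `from` hands out before it reaches
// the value `target` (both in 1..UINT32_MAX).
std::uint32_t updates_until(std::uint32_t from, std::uint32_t target) {
    return (target >= from) ? target - from : (UINT32_MAX - from) + target;
}
```

Under this model, a program that started counting at 1 needs `updates_until(1, pid)` refreshes before it reuses a value that a PID-seeded sequence starts from, and both programs would also have to have that marker in flight at the same moment.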

rien333 commented 3 years ago

Not sure if he feels like looking into this, but maybe @baskerville has something to add. (I'll look into your other comments in a bit)

NiLuJe commented 3 years ago

Otherwise, it's pretty much a hardware issue, apparently the Clara is super finicky (and/or just popular), because for example the 8bpp issues encountered in Plato were mostly harmless on my Forma ;).

On Inkvt's side, damage handling could be tweaked to batch more things together (e.g., line by line instead of cell by cell), which would cut down on the amount of refreshes significantly. The high throughput threshold that's currently hard-coded also probably ought to be dynamic and based on the actual current grid size, so that, say, anything that attempts to refresh > 70% of the grid trips it so that stuff switches to grid size refreshes.
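
A sketch of both ideas together, under the assumption that damage is tracked as a per-cell boolean grid: each dirty row is collapsed into a single span, and the whole plan is replaced by one full-grid refresh once more than a given fraction of the cells is dirty. `Span`, `RefreshPlan` and `plan_refresh` are hypothetical names, and the 70% default is just the figure mentioned above, not an existing InkVT constant.

```cpp
#include <cstddef>
#include <vector>

// One refresh per dirty row, covering first_col..last_col (inclusive).
struct Span {
    int row;
    int first_col;
    int last_col;
};

struct RefreshPlan {
    bool full_grid = false;   // true: refresh the whole grid in one go
    std::vector<Span> spans;  // otherwise: one span per dirty row
};

RefreshPlan plan_refresh(const std::vector<std::vector<bool>>& damaged,
                         double full_threshold = 0.70) {
    RefreshPlan plan;
    std::size_t total = 0;
    std::size_t dirty = 0;

    for (std::size_t row = 0; row < damaged.size(); ++row) {
        int first = -1;
        int last = -1;
        for (std::size_t col = 0; col < damaged[row].size(); ++col) {
            ++total;
            if (damaged[row][col]) {
                ++dirty;
                if (first < 0) {
                    first = static_cast<int>(col);
                }
                last = static_cast<int>(col);
            }
        }
        if (first >= 0) {
            plan.spans.push_back(Span{static_cast<int>(row), first, last});
        }
    }

    // The threshold scales with the actual grid size instead of being fixed.
    if (total > 0 &&
        static_cast<double>(dirty) > full_threshold * static_cast<double>(total)) {
        plan.full_grid = true;
        plan.spans.clear();
    }
    return plan;
}
```

A span also covers any clean cells between the first and last dirty column of its row, so this trades a few redundant pixels for far fewer refresh calls.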

NiLuJe commented 3 years ago

Or deal with damage somewhat like KOReader and Plato, i.e., collect it and coalesce it manually, and only actually send coalesced refreshes (e.g., a UI loop).
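
The simplest form of that is a bounding-box union which gets flushed once per loop iteration. The sketch below is an assumption about the general shape of such a collector; it is not taken from KOReader or Plato, and `Rect` and `DamageCollector` are made-up names.

```cpp
#include <algorithm>

// Pixel rectangle; right and bottom are exclusive.
struct Rect {
    int left;
    int top;
    int right;
    int bottom;
};

class DamageCollector {
public:
    // Merge a damaged rectangle into the pending bounding box.
    void add(const Rect& r) {
        if (r.right <= r.left || r.bottom <= r.top) {
            return;  // never queue empty regions
        }
        if (!has_pending_) {
            pending_ = r;
            has_pending_ = true;
            return;
        }
        pending_.left = std::min(pending_.left, r.left);
        pending_.top = std::min(pending_.top, r.top);
        pending_.right = std::max(pending_.right, r.right);
        pending_.bottom = std::max(pending_.bottom, r.bottom);
    }

    // Called once per UI loop iteration: writes the merged rectangle to `out`
    // and returns true if there was any damage, then starts over.
    bool flush(Rect& out) {
        if (!has_pending_) {
            return false;
        }
        out = pending_;
        has_pending_ = false;
        return true;
    }

private:
    Rect pending_{};
    bool has_pending_ = false;
};
```

A single bounding box can grow much larger than the damaged area when two small updates are far apart, so a real implementation might keep a short list of rectangles and only merge those that overlap or touch.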

rien333 commented 3 years ago

On Inkvt's side, damage handling could be tweaked to batch more things together

@llandsmeer I feel like decreasing the likelihood of this freeze occurring with some tweaks will actually help me/Kobo Clara users a great deal, as I've noticed that the freeze can be hard to produce.

llandsmeer commented 3 years ago

Hi, sorry; as you might have noticed, I haven't been exactly busy with the project lately (and I don't have the extensive E-ink knowledge @NiLuJe has, so I do not understand the technical reasons why the freeze happens). The sanest solution I can think of is SIGSTOPing Plato on start and SIGCONTing it on exit (maybe just via the bash script, so a crash in inkvt doesn't matter too much). We're lucky here that Plato doesn't lock the touchscreen input.

Coalescing and outputting at max fps will also help a lot, that's probably a bit of tweaking of the constants in src/vterm.hpp.
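
For the "max fps" part, something along these lines could sit in front of the refresh call. It is only a sketch: `FlushThrottle` is a hypothetical helper, and the interval passed to it is a free parameter, not one of the existing constants in src/vterm.hpp.

```cpp
#include <chrono>

// Allows at most one flush per min_interval; callers keep accumulating
// damage while try_flush() returns false.
class FlushThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit FlushThrottle(std::chrono::milliseconds min_interval)
        : min_interval_(min_interval) {}

    // Returns true if a flush may be sent at time `now`, and records it.
    bool try_flush(Clock::time_point now) {
        if (has_flushed_ && (now - last_) < min_interval_) {
            return false;
        }
        last_ = now;
        has_flushed_ = true;
        return true;
    }

private:
    std::chrono::milliseconds min_interval_;
    Clock::time_point last_{};
    bool has_flushed_ = false;
};
```

Passing the timestamp in explicitly keeps the class easy to test; in the draw loop it would be called with `FlushThrottle::Clock::now()`.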

I do not have a Kobo Clara lying around, so I'm not exactly in the position to develop a Kobo Clara specific solution; but I can help if you want to dig around in the code yourself.

@NiLuJe thanks a lot again for the support!

llandsmeer / inkvt

[FR] Ability to run inkvt concurrently with other programs (e.g. plato) #9