pgaskin / NickelMenu

The easiest way to launch scripts, change settings, and run actions on Kobo e-readers.
https://pgaskin.net/NickelMenu
MIT License
556 stars 28 forks source link

Implement config file reloading and improve the config parser #47

Closed pgaskin closed 4 years ago

pgaskin commented 4 years ago

This PR implements config file reloading.

Every time the menu is opened, the config files are scanned. If the number of files or the modification time of one changes, they are re-parsed and re-generated. If there is a config error or I/O error while parsing or scanning the files, an item is added showing it like usual. If there aren't any config entries, the usual one referring to the documentation is shown.

There should not be any noticeable delay in opening the menu, except around another 200ms (up to 500ms) when opening it after a config change.

closes #42


For the config parser improvements, see https://github.com/geek1011/NickelMenu/pull/47#issuecomment-646444323.

closes #48

pgaskin commented 4 years ago

This is ready for testing.

In particular, I'd like someone to try and break the update mechanism (try and make it incorrectly detect an update or lack thereof, also ensure the items are in the correct order, make sure it still works correctly after a USBMS session), I want to make sure I didn't leave any edge cases (no menu items, etc) unhandled, and I would like another pair of eyes on the pointer/memory handling in nm_global_* in init.c (to make sure I free things properly and swap things out under the right conditions).

In addition, you shouldn't be able to cause it to crash at any point (e.g. I tested that it wouldn't crash if the config files were modified while a menu was open and an item was pressed, and it worked fine because we only check for config updates when the menu is opened).

Also, @NiLuJe, can you ensure that your KFMon generator works with KFMon running, with KFMon changes (you'll need to touch a config file for now, but I'd be happy to hear ideas to do it automatically), without KFMon running, and with KFMon unresponsive?

P.S. A code review would also be appreciated.

pgaskin commented 4 years ago

I do have one idea for automatic updates for KFMon: In another PR, I could implement generator updates and the KFMon one could check the mtime of the socket (which KFMon can touch every time there is a change).

shermp commented 4 years ago

I'm trying to figure my way through the code, but I have to admit, first impressions are... convoluted. Mainly around the nm_global_config_update and nm_global_config_items

More thoughts when I can (hopefully) figure most things out.

shermp commented 4 years ago

The code in config.c seems fine to me. What perplexes me is the global config management. To me, it seems a mess of global functions operating on a mess of global variables.

I would have thought a better way would be to have a single global struct that contains all the required state, with the required updating functions updating it directly, preferably via a pointer parameter (to make it clear what's being modified).

The menu ejecting code could then just check needed flags, and any error messages etc.

What am I missing?

shermp commented 4 years ago

Just had another thought. You might need to go back through the config parser code and check for memory leaks, now that config parsing is no longer a one-time affair.

(I'm specifically thinking about returning on error without freeing the calloc'ed nm_menu_item_t variable.)

NiLuJe commented 4 years ago

I won't have time to give this a proper look-see until this week-end, but it's on top of the list ;).

pgaskin commented 4 years ago

What perplexes me is the global config management. To me, it seems a mess of global functions operating on a mess of global variables.

Yes, it is somewhat convoluted. I ended up on the current design after thinking of how I can keep pointers valid for as long as possible, free stuff as soon as possible, and be the most efficient.

The first thing I did was to split the file scanning from the config. The other option I was considering was to add a file/date field to each config item (or just the first one from each file), but I realized that would limit future flexibility, it isn't that clean, and nothing uses it (or should use it) except during the initial parsing. This led to nm_config_files which returns a nm_config_file_t, which, for efficiency (we append quite a bit and don't know the size ahead of time, but we loop in order), was a linked list.

Then, I was thinking about how to update it in a way which would allow returning errors while leaving the original one untouched, and also handle freeing the memory. This led to nm_config_files_update, which is relatively straightforward (note that the bit after the loop which checks op || np is to check if the sizes are different, as one would still point to an item after the loop exits while the other is null at the end of the list).

Afterwards, I had to figure out the best way to manage the config and errors with the least code duplication, specifically around the error items (I wanted to create those in one place during the parsing rather than in the menu for each error). This led to the 4 variables: nm_global_menu_config_files (for comparing against without affecting the config), nm_global_menu_config (so we can free it later), nm_global_menu_config_items (because we needed somewhere to store the pre-generated error items, which meant we couldn't just call nm_config_get_menu in the injection hook), and nm_global_menu_config_n (same reason). And, to be on the safe side, I made those private to init.c so other files can't touch it by accident, and added nm_global_config_items to get the items, and nm_global_config_update to update them.

For nm_global_config_update, I wanted to reduce the code duplication in error handling, and wanted to have the simplest conditions possible for freeing (i.e. avoid lots of nested if statements for each condition). This led to nm_global_config_replace, which just frees whatever will be replaced if it isn't NULL, and then replaces the items with either the error message or the new menu items. This also has the advantage of ensuring that there won't be any condition in which the menu items are invalid after the update function returns.

That's basically how I ended up with the current design.


I'll have a look at the config parsing itself and see if I can simplify the memory management in it (probably by making the struct on the stack first, then only allocating them memory on success rather than beforehand).

pgaskin commented 4 years ago

I've basically rewritten the config parser over the last 3 commits. It should be faster, more stable, more efficient, easier to extend, and won't leak memory anywhere (even when errors occur). The only downside is that it is twice the lines of code.

I've already done some tests with it, but while testing the config reloading, also try out different config errors to make sure they are still handled properly.

If you have any suggestions for improving the config parser further, I'd be happy to hear them.

NiLuJe commented 4 years ago

I'm hoping to have a large enough chunk of time w/ available brain cells to finally test this on Sunday afternoon, but at first glance I don't really have anything more to add than what @shermp said ;).

pgaskin commented 4 years ago

For the parser, it's mainly about looking for regressions (parsing and error handling) like the one with item order I just fixed. I don't think there are any memory leaks left, so don't focus as much on that.

For the config reloading, it's just making sure I didn't botch any conditions for the change detection.

Also, for KFMon change reloading, the idea I currently have (to be implemented in another PR) is to have a nm_generator_*_current(void* userdata) function or something similar which checks if the generator is up to date. For KFMon, I think the best way to do this might be to use utime to change the mtime of the socket, which can be checked efficiently from NM.

NiLuJe commented 4 years ago

For KFMon, I think the best way to do this might be to use utime to change the mtime of the socket, which can be checked efficiently from NM.

Yup, that's not a bad idea at all!

The only other thing that came to mind would be incrementing a counter for each config "revision", and return that over IPC via a new command (meaning an extra connect and send/read roundtrip in NM).

pgaskin commented 4 years ago

meaning an extra connect and send/read roundtrip in NM

I wanted to avoid blocking operations as much as I can, as this would be run every time the menu is opened before it shows.

NiLuJe commented 4 years ago

Just ran this through Clang's SA for shit'n giggles ;).

NiLuJe commented 4 years ago

That's my quick recap/understanding of the code flow, I may be wrong, there may be false positives to begin with ;).

If you have Clang installed, it's as easy as removing -Werror from the Makefile, and running make clean && scan-build make all CROSS_COMPILE=.

(Also, my Clang ICE'd on action_cc, which is probably not a huge surprise, given the shenanigans involved in there ;p).

NiLuJe commented 4 years ago

Here's the plain text 'end of path' warning for the dlhook stuff:

src/dlhook.c:81:36: warning: Access to field 'st_name' results in a dereference of a null pointer (loaded from variable 'sym')
        const char *str = &dyn.str[sym->st_name];
                                   ^~~~~~~~~~~~
src/dlhook.c:82:13: warning: Null pointer passed to 1st parameter expecting 'nonnull'
        if (strcmp(str, symname))
NiLuJe commented 4 years ago

(It also pointed out an actual SNAFU in the kfmon list code that I'll be fix'ing ;)).

NiLuJe commented 4 years ago

I apparently managed to push the kfmon fix on master, because I'm an idiot? >_<".

(Dunno what happened, I possibly missed a branch checkout failure or something :s).

pgaskin commented 4 years ago

That's fine (it's a single commit which isn't really release-note-worthy), just try not to do that in the future.

I'll look into those Clang mesages later. Apparently I only had the GCC one enabled one in my VSCode setup...I'll have to fix that (it probably changed when I reinstalled a few weeks ago).

pgaskin commented 4 years ago

In config.c:250; an error from nm_config_parse__line_item is never checked directly, which means nm_config_parse_append* can be called with bogus data in tmp_it/tmp_act.

That one's actually fine, since it just passes the error from nm_config_parse__lineend_action, which will either set the error or set tmp_act correctly.

failsafe.c:28; fs is not free'd on early ASSERTs (meh.).

That's deliberate, as I'd rather have a memory leak than another possible crash there.

nm_config_parse__append_generator; the second NM_CONFIG_PARSEAPPENDRET_ALLOC_ERROR doesn't free cfg_n & cfg_gn_n.

Oops (I changed the free rather than copy-pasting it first).

Same idea in nm_config_parse__append_item

Same thing.

nm_config_parseappend_action; cfg_it_act_n is not freed on either NM_CONFIG_PARSEAPPEND__RET_ALLOC_ERROR

Fixed.

nm_config_parse:254; state.cfg_c (as allocated by nm_config_parse__append_generator @ L269) not freed on error.

Idk what that one's about. It should be handled by the RETERR macro...

nm_config_files:38; cfc not freed on ASSERT (meh.).

I'll fix that (same for the other two NM_RETURN_ERRORs).

nm_action_result_free: use after free (res->msg is freed after res itself).

Fixed on master. I'm surprised res->msg has been null every time so far (or nobody's reported a crash).

It's not happy about the RETERRs in nm_config_parse (use after free).

Ohhh, so that's why I've been getting garbage occasionally when there's an invalid config type (the asprintf for the error happens after the line is freed, making the pointer into it as a format argument garbage). I'll fix that.

src/dlhook.c:81:36: warning: Access to field 'st_name' results in a dereference of a null pointer (loaded from variable 'sym')

That's a false positive, I think.

pgaskin commented 4 years ago

Done, @NiLuJe.

NiLuJe commented 4 years ago

Dinner break, and I'll finally have time to run some actual tests after that ;).

NiLuJe commented 4 years ago

@geek1011: In a somewhat similar vein as 6a7a5c9079420e3512e1a38fe82df9dda552caa6, generated items were inserted in reverse order (because the list is actually linked in reverse?).

Looping on the generated items in reverse put things back in order. (Ignore the extra logging, that was to double-check that I wasn't going mad, and that nm_generator_do returned stuff in the right order ;).

diff --git a/src/config.c b/src/config.c
index 2167605..38770ae 100644
--- a/src/config.c
+++ b/src/config.c
@@ -33,7 +33,7 @@ nm_config_file_t *nm_config_files(char **err_out) {

     for (int i = 0; i < n; i++) {
         struct dirent *de = nl[i];
-        
+
         char *fn;
         if (asprintf(&fn, "%s/%s", NM_CONFIG_DIR, de->d_name) == -1)
             fn = NULL;
@@ -117,7 +117,7 @@ bool nm_config_files_update(nm_config_file_t **files, char **err_out) {
     bool ch = false;
     nm_config_file_t *op = *files;
     nm_config_file_t *np = nfiles;
-    
+
     while (op && np) {
         if (strcmp(op->path, np->path) || op->mtime.tv_sec != np->mtime.tv_sec || op->mtime.tv_nsec != np->mtime.tv_nsec) {
             ch = true;
@@ -596,7 +596,8 @@ void nm_config_generate(nm_config_t *cfg) {
             }

             NM_LOG("config: ... %zu items generated", sz);
-            for (size_t i = 0; i < sz; i++) {
+            for (ssize_t i = sz - 1; i >= 0; i--) {
+                NM_LOG("Looping on generated item %zd (%s)", i, items[i]->lbl);
                 nm_config_t *tmp = calloc(1, sizeof(nm_config_t));
                 tmp->type = NM_CONFIG_TYPE_MENU_ITEM;
                 tmp->value.menu_item = items[i];
diff --git a/src/generator_c.c b/src/generator_c.c
index a113e29..29ecad1 100644
--- a/src/generator_c.c
+++ b/src/generator_c.c
@@ -72,6 +72,7 @@ NM_GENERATOR_(kfmon) {
     // Walk the list to populate the items array
     size_t i = 0;
     for (kfmon_watch_node_t *node = list.head; node != NULL; node = node->next) {
+        NM_LOG("Adding %s at item index %zu", node->watch.label, i);
         items[i] = calloc(1, sizeof(nm_menu_item_t));
         items[i]->action = calloc(1, sizeof(nm_menu_action_t));
         items[i]->lbl = strdup(node->watch.label);
NiLuJe commented 4 years ago

Or, perhaps less confusingly, without looping in reverse:

diff --git a/src/config.c b/src/config.c
index 2167605..e11f6dc 100644
--- a/src/config.c
+++ b/src/config.c
@@ -33,7 +33,7 @@ nm_config_file_t *nm_config_files(char **err_out) {

     for (int i = 0; i < n; i++) {
         struct dirent *de = nl[i];
-        
+
         char *fn;
         if (asprintf(&fn, "%s/%s", NM_CONFIG_DIR, de->d_name) == -1)
             fn = NULL;
@@ -117,7 +117,7 @@ bool nm_config_files_update(nm_config_file_t **files, char **err_out) {
     bool ch = false;
     nm_config_file_t *op = *files;
     nm_config_file_t *np = nfiles;
-    
+
     while (op && np) {
         if (strcmp(op->path, np->path) || op->mtime.tv_sec != np->mtime.tv_sec || op->mtime.tv_nsec != np->mtime.tv_nsec) {
             ch = true;
@@ -596,14 +596,21 @@ void nm_config_generate(nm_config_t *cfg) {
             }

             NM_LOG("config: ... %zu items generated", sz);
+            nm_config_t * const next = cur->next;
+            // We'll want to link the first generated node to the generator node
+            nm_config_t *prev = cur;
             for (size_t i = 0; i < sz; i++) {
                 nm_config_t *tmp = calloc(1, sizeof(nm_config_t));
                 tmp->type = NM_CONFIG_TYPE_MENU_ITEM;
                 tmp->value.menu_item = items[i];
                 tmp->generated = true;
-                tmp->next = cur->next;
-                cur->next = tmp;
+                // Link the current generated node to the previous node
+                prev->next = tmp;
+                prev = tmp;
             }
+            // Link the next node to the final generated node
+            prev->next = next;
+
             free(items);
         }
     }

(It, err, took me an inordinate amount of time to grok that the goal was not to replace the generator node with generated nodes, but simply to insert them after the generator node they're attached to...).

NiLuJe commented 4 years ago

I don't recall how that was handled before, but calling an invalid action (i.e., typo in the action field) doesn't generate an error, just a menu entry that silently does nothing.

EDIT: Yep, there used to be an X macro to validate the action field.

EDIT²: And the generator field, too.

NiLuJe commented 4 years ago

Apart from that, no adverse effects with the KFMon generator.

If error there is, the sole matching error entry is created and inserted properly. KFMon not being up is an immediate error (connect fails). Otherwise, the timeout is currently set at 2s, and behaves properly.


Also tried inserting/deleting the test generator, no issues there.

NiLuJe commented 4 years ago

Nothing untoward happened with USBMS sessions.

I did get a forced update after one without me remembering if I'd actually touched something, but I couldn't reproduce it, so, probably forgot I did ;p.

I've added more verbose logging on that front in case it happens again:

diff --git a/src/config.c b/src/config.c
index 2167605..72f64b7 100644
--- a/src/config.c
+++ b/src/config.c
@@ -1,6 +1,7 @@
-#define _GNU_SOURCE // asprintf
+#define _GNU_SOURCE // asprintf, basename
 #include <dirent.h>
 #include <errno.h>
+#include <libgen.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -33,7 +34,7 @@ nm_config_file_t *nm_config_files(char **err_out) {

     for (int i = 0; i < n; i++) {
         struct dirent *de = nl[i];
-        
+
         char *fn;
         if (asprintf(&fn, "%s/%s", NM_CONFIG_DIR, de->d_name) == -1)
             fn = NULL;
@@ -111,16 +112,18 @@ bool nm_config_files_update(nm_config_file_t **files, char **err_out) {

     if (!*files) {
         *files = nfiles;
+        NM_LOG("Initial config parsing");
         NM_RETURN_OK(true);
     }

     bool ch = false;
     nm_config_file_t *op = *files;
     nm_config_file_t *np = nfiles;
-    
+
     while (op && np) {
         if (strcmp(op->path, np->path) || op->mtime.tv_sec != np->mtime.tv_sec || op->mtime.tv_nsec != np->mtime.tv_nsec) {
             ch = true;
+            NM_LOG("Config file `%s` was updated!", basename(np->path));
             break;
         }
         op = op->next;
@@ -130,6 +133,7 @@ bool nm_config_files_update(nm_config_file_t **files, char **err_out) {
     if (ch || op || np) {
         nm_config_files_free(*files);
         *files = nfiles;
+        NM_LOG("Config files updated (ch? %d | op: %p | np: %p)", ch, op, np);
         NM_RETURN_OK(true);
     } else {
         nm_config_files_free(nfiles);
NiLuJe commented 4 years ago

Also confirmed that commenting everything properly leads to the creation of the default entry ;).

NiLuJe commented 4 years ago

And that's it for tonight, I think ;).

(I've also implemented and tested the mtime-on-kfmon-socket thing on KFMon's side, and that works beautifully :). Also made the config order alphabetical there, too).

NiLuJe commented 4 years ago

Also, random easy refactor idea: move the config filtering to the actual scandir filter feature? (That essentially means a smaller alloc for nl).

(And get rid of the logging on . & .. in the process ;)).

pgaskin commented 4 years ago

generated items were inserted in reverse order (because the list is actually linked in reverse?).

Yes, thanks for catching that. The list used to be linked in reverse until I refactored the config parser. Then, that made the menu item adapter go in the wrong order, which, as you noticed, I fixed. But, I tested the generators before that fix, making me forget about the generator code.

I'll probably apply your first fix, as I feel it's a bit more concise than the second.

Apart from that, no adverse effects with the KFMon generator. ...

:+1:

Also made the config order alphabetical there, too

I noticed. Is there a particular case which is broken with alphasort that made you write an adapter for strcmp instead?

Also, random easy refactor idea: move the config filtering to the actual scandir filter feature? (That essentially means a smaller alloc for nl).

I can't remember why I decided against it, but upon looking at it again, I don't see why not.


Did you happen to test the config errors yet?

NiLuJe commented 4 years ago

Did you happen to test the config errors yet?

Hmm, which config errors? ^^

EDIT: The Clang SA stuff? If so, yep, the fixes did the trick, and I think I agree with you on the false-positives left ;).

NiLuJe commented 4 years ago

I noticed. Is there a particular case which is broken with alphasort that made you write an adapter for strcmp instead?

alphasort takes pointers to a dirent pointer, I'm using the BSD fts_* API instead of scandir (for, err, some reason? :D), so the function needs to take pointers to an FTSENT pointer instead ;).

pgaskin commented 4 years ago

Hmm, which config errors? ^^

The usual ones (just to make sure I didn't mess them up during the refactor). I've already tested some of them.

NiLuJe commented 4 years ago

The usual ones (just to make sure I didn't mess them up during the refactor). I've already tested some of them.

Besides the action/generator fields no longer being validated, I haven't noticed anything new.

Bogus args are handled on a per-action basis, IIRC, so that shouldn't have been affected. I tested the empty config case, that worked.

Not quite sure what else could go wrong ;).

pgaskin commented 4 years ago

I've tested the error handling with:

chain_success:dbg_syslog:bad
bad
bad:bad
menu_item:bad
menu_item:main:bad
menu_item:main:test:bad
menu_item:main:test:dbg_syslog:ok
chain_bad:bad
chain_success:bad
chain_success:dbg_syslog:ok
generator:bad
generator:main:kfmon
chain_success:dbg_syslog:bad
menu_item:main:test:dbg_syslog:ok
chain_failure:dbg_syslog:ok
chain_success:dbg_syslog:ok
chain_always:dbg_syslog:ok

I deleted the first line each time I tested it.

pgaskin commented 4 years ago

I think this is pretty much ready now. The parser emits errors properly, and I think I've fixed all regressions from the refactor. I've also tested all the examples and my own config. I'll probably merge this, and I might post a beta version on MR later this week for testing.

NiLuJe commented 4 years ago

Another random thought: at boot, config parsing is bracketed between the failsafe mechanism.

That's no longer the case when updating configs. Should we care?

(I say that because I did manage to busy-loop nickel when breaking the linked list stuff ^^).

pgaskin commented 4 years ago

That's no longer the case when updating configs. Should we care?

I originally thought of that (hence the now-reverted instant failsafe destruction commit), but I decided against it, as it would mean I need a mutex to ensure the failsafe isn't already running, and if it was going to crash, it would do it during the initial parsing, which would trigger the failsafe. And, if it did happen to cause issues only after the first parse (i.e. when the menu is opened), NM can be uninstalled manually by connecting via USB or telnet without using the menu. Does this seem reasonable?

NiLuJe commented 4 years ago

@geek1011: Nope, sounds good.

In any case, if an uncaught bug does cause an issue during a config update, Nickel will softlock, which means sickel will reboot the device. Since the device was essentially unusable, the configs won't have been touched, at which point they'll trigger the exact same issue during the newt boot, except this time it'll be during the failsafe window, so, everybody wins ;).

It essentially boils down to simply needing two reboots instead of one to "fix" it by self-destruct.

EDIT: Which is what you just said :D. It's late ^^.