kokoye2007 / waitzar

Automatically exported from code.google.com/p/waitzar
Other
0 stars 1 forks source link

Config Manager Code Cleanup #189

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Preparing for version 1.9, we need to clean up the configuration file loader. 

The disorganization of this code stems from the fact that I didn't really know 
entirely what I wanted when I implemented the config parser. Roughly speaking 
this is how config files are parsed now:
  1) All config files are detected and their paths are saved into memory.
  2) All files are loaded, in order of importance. Conflicting settings are overwritten. Most complex settings are deferred; their options saved. There is some "security" in preventing, say, the local config file from adding a DLL. 
  3) Any complex settings (like languages) are loaded using factory methods. A static tree of configuration objects is constructed. 
  4) The entire tree is validated; check if requested encodings are available.
  5) The code can now read this tree. It can, very rarely, overwrite or temporarily disable options. 

What should actually happen is something like this:
  1) Load and read all files. Have a more general way of representing security measures (like a decorated grammar). 
    --> This produces a "map of strings" model.
  2) Create all config objects. Use factory methods.
    --> This produces a series of nested classes. Factory-produced config objects are directly useful (like "Transformation", etc.).
  3) Validate the config objects. Allow "optional" components (like DLLs) to fail and cascade that fail. Do not prune the tree; rather, state these dependencies. So, Ayar relies on Ayar2Uni, which relies on the Javascript extension.
    --> This produces some kind of tree, annotated with error messages, which shows us which components were loaded and which were skipped. (If a critical component was skipped, crash & load the embedded config file.)
  4) At run-time, if there is any kind of error, update this decorated tree with more error information. Allow "temporary" crashes (like exceptions at run-time) to be reset. 

The tree enables us to, for example, catch an exception in Ayar2Uni and 
propagate it up, disabling the Ayar input method as we go. If a core component 
(like WaitZar) somehow faults, kill the program.

Under this approach, things like Uni2Zawgyi become much more "optional" in the 
sense that they can be turned on and off. 

There are other things to consider as well:
  1) Some encodings don't need Uni2X, since we won't be using them for output. For example, adding a new keyboard requires X2Uni, but we might hold off on Uni2X for a while. How can we specify this?
  2) Languages should be able to share some settings. For example, if we had a decent font we could use the same "Display Method" for Shan and Myanmar. Also "Unicode" should be a system-wide encoding. How can we incorporate this into the structure of settings/languages/extensions without cluttering things too much? 

Since we're cleaning up the entire config system, we might try to deal with 
these all together. 

Step 2: Review the code again, looking for more discrepancies that need 
attention.

Original issue reported on code.google.com by seth.h...@gmail.com on 25 Jan 2011 at 8:03

GoogleCodeExporter commented 9 years ago
For storing the actual data found, the best thing I've come up with so far is a 
class called "Node". This has two data members:
   map<string, Node> childList;
   string textValue;

...both of which are private. After any "set" method is called, an 
"assertValid()" function is called, and throws an exception if "childList" or 
"textValue" are ever both non-empty. (Note that "textValue" will likely be a 
stack<string> in the final release). 

We provide a constructor that takes a "string" and also a "char*", and we have 
"operator[]" return the child at that location (or throw an exception if none 
exists). This allows us to do use nice, simple syntax like:
   Node root;
   Node child = "hello";  //Initialize now
   Node child2;
   root.addChild("key2", child);
   root.addChild("key", child2);
   root["key"] = "value"; //Initialize later

   cout <<root["key"].getString() <<"  " <<root["key2"].getString() <<endl;

Using "addChild" enforces that we don't accidentally build invalid nodes into 
the tree. We also have an "isLeaf()" function to allow easy iterating. Leaf 
nodes must always have non-empty strings. 

It's not a perfect solution, but it allows us to keep Node non-virtual, and 
keep Node relatively simplistic. 

Original comment by seth.h...@gmail.com on 26 Jan 2011 at 12:24

Attachments:

GoogleCodeExporter commented 9 years ago
A tree of Nodes is now built, and the "local" options are extracted easily 
enough using a lambda callback function. 

I am considering a breaking change: expand "${currdir}" to the current folder, 
so that:
    "${currdir}\keymagic.kms" works as expected. 

There are other options; for example, in :
    "@keymagic.kms", the @ means "it's a file".

The problem is thus:
    1) Files are always considered in the current config directory (more or less). We really don't want users loading random files. 
    2) Prepending the current directory as a prefix is a _semantic_ decision; we only do it to files.
    3) Building a parse tree is _syntax_ only. So, we don't know if property "x" is going to require the directory prepended until long after the tree is built.
    4) Unfortunately, after the tree is built, the directory information is no longer available, because each _string_ in our stack may have been loaded from a different directory. 

This is actually a generalization of the problem of types in Javascript. For 
example:
   "true" might be a boolean value, or it might be an identifier (language named "True?")
   We could just use true (JSON boolean), but that (1) won't solve our "file" type problem and (2) will confuse users to no end.

I'll probably parse booleans later anyway (it makes sense to), but it makes 
sense to cheat a bit and process file names a tiny bit now. For example:
   1) Prepend the full directory to the filename. (Fixes DLL __cwd() hack and the problem of an Input Method being re-defined later.)
   2) Run some regexes, convert "/" to "\" (required for DLLs, doesn't hurt other files), remove two "\"'s in a row, etc.
   3) Enforce that a config file can ONLY load files inside its own directory. 

The question is... how to identify files! And how to crash gracefully if the 
user fails to specify a file correctly?

Original comment by seth.h...@gmail.com on 27 Jan 2011 at 10:58

GoogleCodeExporter commented 9 years ago
Fixed the directories problem by making use of context offered later in the 
build process:
   1) If a file contains "." and then 1 to 4 letters at the end of the string, it's prepended with the complete directory.
   2) If, later, we are expecting a raw string value (and not a directory), then search for a "\" and remove everything up to and including that.

Thus, directories are handled transparently and easily.

Original comment by seth.h...@gmail.com on 27 Jan 2011 at 1:12

GoogleCodeExporter commented 9 years ago
The next step is to convert our Node tree to a "OptionSet" tree. (Hopefully we 
can merge "resolvePartialSettings" into something more intelligent here.)

I think we should make another tree of nodes, which compare and validate the 
Node tree. Let this class be Node 2, with:
  - map <string, Node2> children; //The string is what they "match" to; might need to be a regex.
  - function<> do
  - type = NODE | STRING | BOOL | IDENTIFIER | INT
  - isLeaf() -> { return type!=NODE; }

Then, we proceed by matching children from the root node, and calling do(), 
which performs things like setting the actual value or creating, e.g., a new 
"Language" with the given id. 

The benefits to this are:
  1) We don't need to check if we have "enough" identifiers. Just do the following:
     //Given Node n and Node2 n2;
     if (n.isEmpty() && n!=this->root) { throw "Empty node."; }
     if (n.isLeaf() != n2.isLeaf()) { throw "Type mismatch"; }
  2) Likewise, we can simplify exception handling. Consider:
     if (children.count(nextToken)==0) { throw "Unknown setting."; }
     //Later: catch this as "ex" and:
     msg <<"Error loading property: " <<n.getAbsKey() <<"=" <<n.str() <<" : " <<ex
     throw msg.str();
     //This allows us to force "name=value" onto every load error that occurs, with no extra work.
  3) We can now catch subtle errors that weren't caught before, like "settings.hotkey.fakesuffix"="test". We can give more reliable error messages for existing errors like "languages"="myanmar".

Some difficulties:
  1) How to flag errors such as "can't define a new Language in config-local", or "can't define extensions AT ALL outside of Common".
  2) The actual function semantics for "type" is confusing, because we want to pass, e.g., an actual boolean as "val" for type==BOOL. I foresee some template hacking.
  3) Still not sure how to avoid "partialSettings". Might have to tidy up the current "OptionTree" class. Possibly make "OptionTree2" for now, to avoid overwriting the actual option tree? 

Original comment by seth.h...@gmail.com on 28 Jan 2011 at 7:28

GoogleCodeExporter commented 9 years ago
Added a whole bunch of settings, and a basic "permissions" system.

Currently, the node's creation sets this value, and for non-leaf nodes that's 
the end of the story. We might want to update this (or even keep a separate 
stack) for each call to str(), since these must be validated too ---or at least 
the top ones should.

The other option, which is kind of intriguing, is to transform each file 
individually, and contribute to a "partial" tree as we go. Since IDs are part 
of the key (not the value), we are guaranteed to have the only "required" item 
for each element, and then we don't have to save the permissions. (But, argh, 
then we have to re-undo all the const CfgPerm& items we just removed...)

Adding this comment just so that I don't forget to debug permissions heavily.

Original comment by seth.h...@gmail.com on 31 Jan 2011 at 7:24

GoogleCodeExporter commented 9 years ago
Re-worked with the second option (transform as you go).

Now, however, we need to keep some notion of a "dirty" or "clean" node. After 
reading a node's value, it should be marked "clean", and setting it again (in 
the "build" phase) should mark it as "dirty". Clean nodes should not be read 
again on "walk". 

Theoretically, marking a node dirty should mark all parents dirty, so we can 
avoid walking entire branches of the tree at a time, but this is not strictly 
necessary. It might be easy, though; we could do it.

Original comment by seth.h...@gmail.com on 31 Jan 2011 at 8:22

GoogleCodeExporter commented 9 years ago
"dirty" and "clean" seem to work fine. I'd like to do a full trace later (with 
local/user overrides and the works) just to confirm that this is indeed working 
correctly.

Original comment by seth.h...@gmail.com on 31 Jan 2011 at 8:50

GoogleCodeExporter commented 9 years ago
Fun trivia point: did you know that "transformations" was misspelled as 
"tranformations" in Wait Zar 1.8. :P 

Original comment by seth.h...@gmail.com on 31 Jan 2011 at 9:09

GoogleCodeExporter commented 9 years ago
All settings load, and they are tracked with a new config flag: "C". 

Need to write a few dummy override config files and check the following things:
   1) Permisssions.
   2) "Change" and "Add" differences (i.e., change a language in the local conf)
   3) Dirty/clean nodes. Double-check the full tree walk. Also check for overrides. 

We also need to check for things like missing settings (e.g., a language 
without an encoding) and we need to do the "lastused" setting. It makes sense 
to replace the stack-of-strings with just a string, because if someone defines:
   languages.myanmar.default-input-method = waitzar
   languages.myanmar.default-input-method = lastused
in the same _file_, then this should be silently overwritten (or maybe even 
fail with an error).

We also have to log the actual item which is set while walking the tree.

Hmm.... not sure what to do first. :D

Original comment by seth.h...@gmail.com on 31 Jan 2011 at 11:34

GoogleCodeExporter commented 9 years ago
All the important stuff is done. 
  * Properties were (mostly) elevated out of pointer classes. 
  * Lastused doesn't actually work, but that's a minor point.
  * The code for saving local settings is already halfway in MainFile; we should take the remaining code out of RuntimeConfig and put it there. 
  * Need to meticulously scour ConfigManager for old "test case" code, put it into the new model, and then clear out all the commented code. 
  * ConfigManager's "validate" method needs to be refactored out into MainFile or... something. 

Original comment by seth.h...@gmail.com on 2 Feb 2011 at 3:07

GoogleCodeExporter commented 9 years ago
All code has been cleaned up, and generally tested. 
More tests will be necessary closer to the next release, but the work completed 
more than closes this bug.

Original comment by seth.h...@gmail.com on 4 Feb 2011 at 9:36