geophysics / mtpy

collection of python tools for standard MT data processing
GNU General Public License v3.0
32 stars 32 forks source link

Implement framework for persistent configuration settings in INI files #15

Open ghost opened 12 years ago

ghost commented 12 years ago

The changes on this branch are meant as a way to tackle all the hardcoded paths that are around. The idea is to basically have all settings that users might want to permanently change per project in an INI file. We really don't want users (including ourselves) changing the source directly for convenience, because then it's reversed every time you update the code.

Can you take a look at it, and let me know what you think of this approach? Should we do it this way? Do you understand my explanation?

So this is how my changes work:

The code that sets everything up is in mtpy/config.py

Settings are in one or more INI files. There is a 'master' INI file in the repository, at mtpy/config/mtpy.ini. Users can place their own INI files in their $HOME directory, or in the current working directory. The two latter INI files need not have all the settings in them, because config.py will look at each one in turn, overwriting the settings with those it finds in more 'local' INI files. So the 'master' INI has the lowest priority, and the current working directory INI has the highest priority. It doesn't matter if there are no INI files in the $HOME directory or the working directory.

When config.py is imported it loads the INI file into a SafeConfigParser object, which is a bit like a dictionary, and makes it available as the config object in its module namespace. So to access this object from anywhere in mtpy code, use:

>>> from mtpy.config import config

Changes can't be made during an interpreter session. This might be modified later, but I thought we should just keep it simple for now. The config object is a bit like a dictionary, but not as nice. It has separate sections, which contain name/value pairs, but to get at the values you need to call the get(section, name) method. So for an INI file that looks like:

[BIRRP]
bbfile = Z:\instruments\bbconv.txt

to get at the settings:

>>> from mtpy.config import config
>>> config.get('BIRRP', 'bbfile')
'Z:\\instruments\\bbconv.txt'    

This is awkward, but I think it is the best cross-platform way to do things. For example, in some module somewhere in mtpy you would do the following:

import numpy as np
# other imports etc

from mtpy.config import config

def do_something(arg1, arg2, path_to_program=config.get('blah', 'blah_path')):
    '''blah blah'''
    pass

The module that this is based on (ConfigParser) is in the standard library. There are other and better choices for doing configuration, but none of them are standard library modules, and they're only slight improvements. I think we are better off sticking with an option that everyone has installed by default. More information about how SafeConfigParser objects work is here:

http://docs.python.org/library/configparser.html http://www.doughellmann.com/PyMOTW/ConfigParser/#accessing-configuration-settings

For now, I haven't actually added any settings. I just wanted to propose this as a framework before going ahead and adding settings everywhere.

Also, a minor point, but if for some reason the mtpy.ini file in the source folder is missing or broken for some reason, config.py will try to download the latest version from Github.

So... does that make sense? Do you like it?

geophysics commented 12 years ago

Sounds like a good plan in general, but I have 2 comments.

1. Again, I've never had these issues, because I try to define things like executables rather locally,or my codes rely on their own config files anyway - and I think this approach is not uncommon. Thus, I'd like to have the possibility to NOT use another config file - perhaps by using a function doing the if-conditions like:

if use_config: path_to_program=config.get('blah', 'blah_path') else: path_to_program=local_variable_defined_by_own_wrapping_script

2. I find the name INI confusing/not very intuitive. I just know config files, containing config parser dicts, as "config files", mainly with the matching suffix .cfg. And I am happy that configuration files *.cfg are handled by config.py. But a file named "INI" (by the way, why capitalised??) sounds more like a file/script that actively initialises something, which is not the case, if I understand you correctly..?

Cheers,

L.

ghost commented 12 years ago
  1. We need to distinguish between a script and a module. Of course values in a script may be very specific to a situation, and I don't intend this approach to be used for that, at all. Such values should indeed be hard-coded in the script and intended for the user to change them there, or by parsing sys.argv, or something. Not this approach.

I am talking about settings which are used in imported modules, such as the path to the Occam executable. At the moment for example your GUI is a real pain for me to use because every time I run it I need to change all the paths. It quickly becomes more than just an irritation. Changing the paths hard-coded in the GUI script solves nothing, because then our versions of the script differ in the repository, and it's either a pain for you or a pain for me. Or, when we have a thousand users, a thousand different pains! That is the situation this branch is intended to solve. Then every user -- if they find changing the path in the GUI irritating -- has the option of creating an INI file in their $HOME directory containing the relevant setting, and any module using the Occam executable gets the path from mtpy.config. I can think of many many examples where this is used, Git and matplotlib spring to mind.

  1. I used INI because that's what the ConfigParser documentation in the standard library uses. Many programs on Windows use INI. .cfg is fine with me. I'll wait to see what Jared thinks.

On 15 October 2012 18:27, Geophysics Adelaide notifications@github.comwrote:

Sounds like a good plan in general, but I have 2 comments.

1. Again, I've never had these issues, because I try to define things like executables rather locally,or my codes rely on their own config files anyway - and I think this approach is not uncommon. Thus, I'd like to have the possibility to NOT use another config file - perhaps by using a function doing the if-conditions like:

if use_config: path_to_program=config.get('blah', 'blah_path') else: path_to_program=local_variable_defined_by_own_wrapping_script

2. I find the name INI confusing/not very intuitive. I just know config files, containing config parser dicts, as "config files", mainly with the matching suffix .cfg. And I am happy that configuration files *.cfg are handled by config.py. But a file named "INI" (by the way, why capitalised??) sounds more like a file/script that actively initialises something, which is not the case, if I understand you correctly..?

Cheers,

L.

— Reply to this email directly or view it on GitHubhttps://github.com/geophysics/mtpy/pull/15#issuecomment-9437004.

ghost commented 12 years ago

In fact, I disagree with the first point you made. Any settings which users will want to change permanently should never be only contained in the Python script or module, because then when the user updates their system with a new version of mtpy, the changes they intended to be permanent are destroyed. Or if they're using Git, they get a merge conflict.

The point of this branch is to allow them to keep those settings on their system somewhere other than the mtpy source code.

Don't forget there are still hardcoded default settings -- they are just consolidated into one settings file. We can split configuration settings up into separate files, even a separate one per script or per module but that will make things much more complicated, essentially requiring every script or module to maintain code like that that is in config.py. I don't understand why that would be better.

geophysics commented 12 years ago

Mhhh, I think the point is that I do not see the necessity for pre-defined paths in modules at all. I just restrict it to the scripts, which use the modules, e.g. paths to executables. Exactly to avoid potential conflicts with future work/portability and so on. I've not had the case that I wanted to store parameters permanently so far. I think that is my main issue here, which confuses me.

[Ok ok, veeeery bad example, I admit... please do not use the current state of the gui as an argument - that is mainly due to the unfortunate mix of my lazyness/work in progress/trying out QT designer - in the end, there should not be any pre-defined paths in there - perhaps just a possibility to click "save current settings"]

So, I am still not sure: is it then optional to rely on the config files or will there be an explicit call looking e.g. for executables in there?

On 15/10/12 18:42, nietky wrote:

  1. We need to distinguish between a script and a module. Of course values in a script may be very specific to a situation, and I don't intend this approach to be used for that, at all. Such values should indeed be hard-coded in the script and intended for the user to change them there, or by parsing sys.argv, or something. Not this approach.

I am talking about settings which are used in imported modules, such as the path to the Occam executable. At the moment for example your GUI is a real pain for me to use because every time I run it I need to change all the paths. It quickly becomes more than just an irritation. Changing the paths hard-coded in the GUI script solves nothing, because then our versions of the script differ in the repository, and it's either a pain for you or a pain for me. Or, when we have a thousand users, a thousand different pains! That is the situation this branch is intended to solve. Then every user -- if they find changing the path in the GUI irritating -- has the option of creating an INI file in their $HOME directory containing the relevant setting, and any module using the Occam executable gets the path from mtpy.config. I can think of many many examples where this is used, Git and matplotlib spring to mind.

  1. I used INI because that's what the ConfigParser documentation in the standard library uses. Many programs on Windows use INI. .cfg is fine with me. I'll wait to see what Jared thinks.

On 15 October 2012 18:27, Geophysics Adelaide notifications@github.comwrote:

Sounds like a good plan in general, but I have 2 comments.

1. Again, I've never had these issues, because I try to define things like executables rather locally,or my codes rely on their own config files anyway - and I think this approach is not uncommon. Thus, I'd like to have the possibility to NOT use another config file - perhaps by using a function doing the if-conditions like:

if use_config: path_to_program=config.get('blah', 'blah_path') else: path_to_program=local_variable_defined_by_own_wrapping_script

2. I find the name INI confusing/not very intuitive. I just know config files, containing config parser dicts, as "config files", mainly with the matching suffix .cfg. And I am happy that configuration files *.cfg are handled by config.py. But a file named "INI" (by the way, why capitalised??) sounds more like a file/script that actively initialises something, which is not the case, if I understand you correctly..?

Cheers,

L.

— Reply to this email directly or view it on GitHubhttps://github.com/geophysics/mtpy/pull/15#issuecomment-9437004.

— Reply to this email directly or view it on GitHub https://github.com/geophysics/mtpy/pull/15#issuecomment-9437290.

Dr. Lars Krieger IMER Postdoctoral Fellow

South Australian Centre for Geothermal Energy Research (SACGER) Geophysics & Geology School of Earth & Environmental Sciences The University of Adelaide Adelaide SA 5005, Australia Tel: +61 8 8313 3260 Fax: +61 8 8313 4347 Email: lars.krieger@adelaide.edu.au

ghost commented 12 years ago

OK, sorry that we are going in circles here. I'll try to sell it one last time for tonight :-)

is it then optional to rely on the config files or will there be an explicit call looking e.g. for executables in there?

I don't think it should be compulsory to use the config framework proposed here, but it may simplify code and make it easier for users in many cases. I think we are getting tied up in abstractions like

if use_config:
    path_to_program=config.get('blah', 'blah_path')
else: 
    path_to_program=local_variable_defined_by_own_wrapping_script

so let's address a concrete example:

Every time we run the Occam GUI, the path to the Occam executable is incorrect for me. It is set to /data/software/occam/bin/Occam2D at https://github.com/geophysics/mtpy/blob/master/mtpy/utils/gui/occam2d/v1/gui4.py#L869

That module is obviously generated automatically, so I don't propose changing anything in there. But if this proposal were adopted, by adding the line

self.ui.lineEdit_browse_occam = config.get('occam2d', 'occam2d_exe')

at https://github.com/geophysics/mtpy/blob/master/mtpy/utils/gui/occam2d/v1/occamgui_v1.py#L48, and adding the following to the proposed mtpy.ini:

[occam2d]
occam2d_exe = /data/software/occam/bin/Occam2D

and writing a new file to my user directory, c:\users\nietky\mtpy.ini, containing

[occam2d]
occam2d_exe = z:\code\occam2d.exe

the behaviour is changed: now when the GUI is run, the path to Occam is correct for you, and correct for me.

For this specific case: is this a problem worth solving? Do you agree that the proposed changes would solve it? Can you propose a better way to go about solving it?

in the end, there should not be any pre-defined paths in there - perhaps just a possibility to click "save current settings"

Where would the current settings be saved to? How would they be loaded when the GUI is next run? These questions are rhetorical, and I'm not necessarily interested in the actual details of how you would do it -- not a difficult problem. But where would the code go to do this? You would write a couple of functions to save and write the files, and parse them. Call it the save-write-and-parse-settings-code. You would presumably include it in the GUI script. But what happens when we write a new GUI for a different modelling program, or a GUI for BIRRP? These would also have similar problems: paths to executables, perhaps other settings like the choice of coordinate system: lat/longs or UTM. Would we then re-write the save-write-and-parse-settings code into the new GUI? Copy it? It is generally a terrible idea to duplicate significant portions of code. That advice is everywhere. A better solution would be to take that code (to save-write-and-parse-user-settings) and put it in a common place where any future GUIs can access it. That is precisely what this module would provide.

Even worse, sometimes our separate GUIs/modules might share the same setting that the user might want to universally set. An example might be a choice between metric and imperial, although I expect we are all SI fans here.

Another example: the variables bbfile and birrploc are keyword arguments to many functions in birrptools.py, for example:

https://github.com/geophysics/mtpy/blob/master/mtpy/processing/birrptools.py#L1085

If this code were changed to:

def function(..., bbfile=config.get('birrp', 'bbfile'), ...)

or

def function(..., bbfile=None, ...)
    ...
    if bbfile is None:
        bbfile = config.get('birrp', 'bbfile')

then no functionality is lost, because anything calling function can still explicitly provide a value for bbfile. But for most cases bbfile is not expected to change or be important, and its value may be different for different users (example: a geophysics department maintains technical material for its instrument on a separate server). Then surely more functionality is provided by providing the value in a configuration file.

Currently, the function I linked to attempts to find a bbfile according to the value of birrploc. Why not do this once, as part of package initialisation triggered by setup.py? This is what many other packages do. Users can then address the issue of incorrect bbfile and birrploc values by adjusting their configuration files. Again, this is what many other packages do.

Futher examples from code of mine that I'll try to work into mtpy some day:

https://github.com/nietky/phd/blob/master/phd_modules/oc1d.py#L18 https://github.com/nietky/phd/blob/master/phd_modules/oc2d.py#L25 https://github.com/nietky/phd/blob/master/phd_modules/ps02.py#L25

These are all configurable settings that will need a configuration system. Yes, code can automatically assist with establishing them, but such code is initialisation code, and belongs in setup.py, not in some function which is devoted to a different purpose.

Further examples: configuring default plot styles. Default coordinate systems. Anything you can think that makes sense with 'default' ahead of it.

ghost commented 12 years ago

save current settings

OK I realize this branch lacks a few features, so I will add them:

That should provide all the functionality you'd need for a 'save settings' option in the Occam GUI.

kujaku11 commented 12 years ago

I agree with everything Kent has described in detail.

On Tue, Oct 16, 2012 at 9:22 AM, nietky notifications@github.com wrote:

save current settings

OK I realize this branch lacks a few features, so I will add them:

  • ability for Python code to easily add/change settings
  • ability for Python code to load .cfg files from an arbitrary location
  • ability for Python code to write .cfg files to an arbitrary location

That should provide all the functionality you'd need for a 'save settings' option in the Occam GUI.

— Reply to this email directly or view it on GitHubhttps://github.com/geophysics/mtpy/pull/15#issuecomment-9460114.