choderalab / ensembler

Automated omics-scale protein modeling and simulation setup.
http://ensembler.readthedocs.io/
GNU General Public License v2.0

Default pH for refinement stages is currently 8 - change to 7? #18

Open danielparton opened 9 years ago

danielparton commented 9 years ago

I believe we decided to use 8 because it matches the buffer we use in our experimental assays. However, I think for the general user, it would make more sense for the default value to be 7. We (the Chodera lab) would just have to remember to set the pH to 8 (via the CLI) when necessary.

jchodera commented 9 years ago

Sounds reasonable!

danielparton commented 9 years ago

While I'm doing this, should I also make timestep, temperature and collision rate controllable from the CLI? They are currently only controllable via the API. And pressure and barostat period? It is already possible to select force field, water model, and simulation length through the CLI and API.

On Tue, Mar 31, 2015 at 8:57 PM, John Chodera notifications@github.com wrote:

Sounds reasonable!


jchodera commented 9 years ago

This could be useful, especially for the Sander group if they want to use the CLI.

In YANK, I use this (unfortunately not very secure) way of processing unit-bearing arguments that may be of use: https://github.com/choderalab/yank/blob/master/Yank/commands/prepare.py#L103-L139

danielparton commented 9 years ago

Thanks - I've implemented a couple of functions to deal with this, which might also be helpful in YANK: https://gist.github.com/danielparton/562973c33cf25870e474

parse_api_params_string()

The idea behind this is to have CLI flags for commonly used options, and a single --api_params flag which allows access to less commonly used (or "advanced") parameters. This avoids the need for the programmer to specify a separate flag for each parameter for which CLI control is desired.

The params are specified on the command-line as a string representing a dict (in Python syntax). The string is safely evaluated by the parse_api_params_string function, then passed on to the appropriate API function using the **kwargs syntax. The following should hopefully make this clear.

Command-line:

ensembler refine_implicit --api_params '{"collision_rate": 10 / picoseconds, "api_arg2": "x", "api_arg3": 2.4}'

Python internals:

>>> api_params = parse_api_params_string('{"collision_rate": 10 / picoseconds, "api_arg2": "x", "api_arg3": 2.4}')
>>> print(api_params)
{'collision_rate': Quantity(value=10, unit=/picosecond), 'api_arg2': 'x', 'api_arg3': 2.4}
>>> refine_implicit_md(positional_arg, specified_kwarg=2.0, **api_params)
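For comparison, here is a minimal sketch of the same flow (CLI flag, then dict, then **kwargs) using only the standard library. Everything named here apart from --api_params is a hypothetical stand-in: refine_stub is not an ensembler function, and --timestep_fs is an invented flag. The sketch uses ast.literal_eval, which accepts plain literals only, so it rejects unit expressions such as 10 / picoseconds; that limitation is presumably why a custom parser is needed.

```python
import argparse
import ast


def refine_stub(target, timestep_fs=2.0, **kwargs):
    """Hypothetical stand-in for an API function; it just echoes its arguments."""
    return {"target": target, "timestep_fs": timestep_fs, **kwargs}


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # A dedicated flag for a commonly used option (invented for this sketch).
    parser.add_argument("--timestep_fs", type=float, default=2.0)
    # One catch-all flag for less commonly used parameters.
    parser.add_argument("--api_params", default="{}")
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    # literal_eval only accepts Python literals (numbers, strings, dicts, ...).
    # Names and division, e.g. '10 / picoseconds', raise ValueError here.
    api_params = ast.literal_eval(args.api_params)
    if not isinstance(api_params, dict):
        raise ValueError("--api_params must be a dict literal")
    # Forward the advanced parameters to the API function as keyword arguments.
    return refine_stub("target", timestep_fs=args.timestep_fs, **api_params)
```

Calling run(["--api_params", '{"api_arg2": "x", "api_arg3": 2.4}']) forwards both keys to refine_stub as keyword arguments.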

The parser function is built using the ast library, and can handle int, str, dict, simtk units, and simple mathematical expressions (+, -, *, /, etc.), and nothing else (i.e. pretty safe - much safer than using eval).
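As a rough illustration of the whitelist approach described above (not the code in the gist), a parser along these lines can walk the ast tree and refuse any node type it does not recognise. The names parse_params_sketch and safe_eval_node are hypothetical, and the unit namespace is passed in explicitly; the usage example substitutes a plain float for picoseconds so the sketch runs without simtk installed.

```python
import ast
import operator

# Operators the sketch is willing to evaluate; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval_node(node, allowed_names):
    """Recursively evaluate a whitelisted subset of expression nodes."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, str)):
        return node.value
    if isinstance(node, ast.Name) and node.id in allowed_names:
        # Only names supplied by the caller (e.g. unit objects) are resolved.
        return allowed_names[node.id]
    if isinstance(node, ast.Dict):
        return {
            safe_eval_node(key, allowed_names): safe_eval_node(value, allowed_names)
            for key, value in zip(node.keys, node.values)
        }
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = safe_eval_node(node.left, allowed_names)
        right = safe_eval_node(node.right, allowed_names)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](safe_eval_node(node.operand, allowed_names))
    # Calls, attribute access, subscripts, lambdas, unknown names, etc. end up here.
    raise ValueError("disallowed expression element: %s" % type(node).__name__)


def parse_params_sketch(text, allowed_names):
    """Parse a string holding a dict expression into a real dict."""
    tree = ast.parse(text.strip(), mode="eval")
    result = safe_eval_node(tree.body, allowed_names)
    if not isinstance(result, dict):
        raise ValueError("expected a dict expression")
    return result


# Usage: a float stands in for the simtk picoseconds unit (an assumption).
params = parse_params_sketch(
    '{"collision_rate": 10 / picoseconds, "api_arg2": "x"}',
    {"picoseconds": 1e-12},
)
```

Because the evaluator never falls back to eval, a string such as '{"a": __import__("os").getcwd()}' raises ValueError instead of being executed.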

eval_quantity_string()

This is basically just to allow the user to use either '2 picoseconds' or '2 * picoseconds' syntax when specifying a quantity via a main CLI flag, since the former would probably be more intuitive to the lay user. All of the following are valid, and evaluate as one would expect:

eval_quantity_string('2 picoseconds')
eval_quantity_string('2 * picoseconds')
eval_quantity_string('2 / picoseconds')
eval_quantity_string('2')   # evaluates as the int 2.

An obvious extension would be to allow use of 'ps' in place of 'picoseconds', etc.
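One way the alias extension could look is a normalisation step that rewrites the user's string into the explicit '2 * picoseconds' form before it is evaluated. This is only a sketch under assumptions: normalize_quantity_string and the alias table are hypothetical, and the function returns a canonical string rather than a simtk Quantity so that it runs without simtk installed.

```python
import re

# Hypothetical alias table; a real one would need to cover many more units.
UNIT_ALIASES = {
    "fs": "femtoseconds",
    "ps": "picoseconds",
    "ns": "nanoseconds",
    "K": "kelvin",
}
KNOWN_UNITS = set(UNIT_ALIASES.values())

# A number, then optionally an operator (* or /) and a unit name.
_QUANTITY_RE = re.compile(
    r"^\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)"  # numeric value
    r"\s*(?:([*/])?\s*([A-Za-z_]+))?\s*$"         # optional operator and unit
)


def normalize_quantity_string(text):
    """Rewrite '2 ps', '2 picoseconds' or '2/ps' into an explicit expression."""
    match = _QUANTITY_RE.match(text)
    if match is None:
        raise ValueError("cannot parse quantity: %r" % text)
    number, op, unit = match.groups()
    if unit is None:
        # A bare number such as '2' is passed through unchanged.
        return number
    unit = UNIT_ALIASES.get(unit, unit)
    if unit not in KNOWN_UNITS:
        raise ValueError("unknown unit: %r" % unit)
    # A missing operator means multiplication, as in '2 picoseconds'.
    return "%s %s %s" % (number, op or "*", unit)
```

With this in place, '2 ps', '2 picoseconds' and '2 * picoseconds' all normalise to the same string, which can then be handed to whichever safe evaluator is in use.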