argiopetech / base

Bayesian Analysis for Stellar Evolution
http://webfac.db.erau.edu/~vonhippt/base9/
11 stars 4 forks source link

Fixes issue where command line ignored if YAML exists #62

Closed maxvonhippel closed 7 years ago

maxvonhippel commented 7 years ago

This PR fixes the issue identified by @sishijing (https://github.com/argiopetech/base/issues/61) where if you try to use both command line args and a YAML file the command line args are ignored. It prioritizes command line args over YAML, so if you set a value in both YAML and the command line, the command line value will be the one used. This makes sense to me since if you set a value explicitly in the command line it seems likely that is the value you want and you probably just forgot to remove the other value from the YAML.

I also set the default values for prior means and starting carbonicity to 0.38 per @tedvh's request in the Settings.cpp ternaries. I am not sure how best to set these in the actual cluster definition. They appear to be right already in the included default YAML file, and reading Cluster.cpp and Cluster.hpp, I don't see a clear way to set these defaults in their original definition. That said, if you have a suggestion for where to look, let me know and I'll give it a shot.

tedvh commented 7 years ago

Comments below

On 1/6/17 7:02 AM, Max von Hippel wrote:

This PR fixes the issue identified by @sishijing https://github.com/sishijing (#61 https://github.com/argiopetech/base/issues/61) where if you try to use both command line args and a YAML file the command line args are ignored. It prioritizes command line args over YAML, so if you set a value in both YAML and the command line, the command line value will be the one used. This makes sense to me since if you set a value explicitly in the command line it seems likely that is the value you want and you probably just forgot to remove the other value from the YAML.

Excellent and thanks Max. I agree that this is the way to go.

I also set the default values for prior means and starting carbonicity to 0.38 per @tedvh https://github.com/tedvh's request in the Settings.cpp ternaries. I am not sure how best to set these in the actual cluster definition. They appear to be right already in the included default YAML file, and reading Cluster.cpp and Cluster.hpp, I don't see a clear way to set these defaults in their original definition. That said, if you have a suggestion for where to look, let me know and I'll give it a shot.

Not sure where else carbonicity might be. My guess is that is everywhere that it needs to be set. Elliot can confirm when he looks at and merges in your commits.

thanks! -Ted


    You can view, comment on, or merge this pull request online at:

https://github.com/argiopetech/base/pull/62

    Commit Summary

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argiopetech/base/pull/62, or mute the thread https://github.com/notifications/unsubscribe-auth/AEOeMuqQMzmo8rxx2nu8DOAwLhZTQ_zcks5rPebygaJpZM4Lcdfx.

argiopetech commented 7 years ago

The problem this is supposed to fix (#61) is solved by 324669b.

maxvonhippel commented 7 years ago

Thanks @argiopetech . That said, do you disagree that we should have the ternaries like in this commit? Because right now (as I said in my comments above) the command line args are ignored if the same argument appears in the YAML, and it seems like the command line should be prioritized over the YAML.

tedvh commented 7 years ago

I forgot to ask when this came up what ternaries are.

The goal was to have command-line args supercede anything in the yaml file.

On 1/7/17 4:30 PM, Max von Hippel wrote:

Thanks @argiopetech https://github.com/argiopetech . That said, do you disagree that we should have the ternaries like in this commit? Because right now (as I said in my comments above) the command line args are ignored if the same argument appears in the YAML, and it seems like the command line should be prioritized over the YAML.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argiopetech/base/pull/62#issuecomment-271093693, or mute the thread https://github.com/notifications/unsubscribe-auth/AEOeMjAm4_SCoggebcbXGGbQ2YXsqRRHks5rP72ggaJpZM4Lcdfx.

argiopetech commented 7 years ago

We actually have a much cleaner method of handling this already in the code. You'll note that every call to fromYaml happens in a code block something like the below (taken directly from singlePopMcmc/mpiMcmc.cpp):

        settings.fromCLI (argc, argv);

        if (!settings.files.config.empty())
        {
            settings.fromYaml (settings.files.config);
        }
        else
        {
            settings.fromYaml ("base9.yaml");
        }

        settings.fromCLI (argc, argv);

The first call to fromCLI loads the --config parameter which may specify a YAML file, along with doing some short-circuit exiting for things like the --version flag. If the YAML file is provided we use that, if not there's a default of base9.yaml. Running fromYAML overrides all non-CLI-only commands, so we then have to call fromCLI again to override the YAML defaults.

argiopetech commented 7 years ago

I do think that this could be encapsulated into one function

void Settings::loadSettings(int argc, char* argv[], const std::string& defaultFile="base9.yaml")
{
    fromCLI (argc, argv);

    if (!files.config.empty())
    {
        fromYaml (files.config);
    }
    else
    {
        fromYaml (defaultFile);
    }

    fromCLI (argc, argv);
}

so we could just call settings.loadSettings(argc, argv) in the various applications while maintaining the functionality to have a different default YAML files for a specific application (e.g., settings.loadSettings(argc, argv, "singlePopMcmc.yaml").

The only reason it isn't this way already is because it was implemented piece by piece historically (CLI functionality came after YAML files), and it hasn't been broken since then (so I haven't changed it).

argiopetech commented 7 years ago

I forgot to ask when this came up what ternaries are.

Ternaries are the if/then/else format formatted as such:

<conditional> ? <do if true> : <do if false>

which is equivalent to

if (<conditional>)
{
    <do if true>;
}
else
{
    <do if false>;
}

It's just a shortcut, but if overused or poorly formatted they can become very hard to read. I prefer to reserve them for conditional assignment, and I use them to this end several places in the code for brevity. As such, I have no issues with the way @maxvonhippel has used them here.

The goal was to have command-line args supercede anything in the yaml file.

This has been the case since the CLI functionality was implemented. The only reason it wasn't working for Shijing was that one of the command line flags he was using wasn't recognized because I had failed to implement it.

argiopetech commented 7 years ago

I went ahead and implemented the Settings::loadSettings() function and integrated it into all applications in 3dfadaa.

maxvonhippel commented 7 years ago

You're right that's infinitely more elegant. Definitely feel foolish now for writing a billion ternary statements when I could have just switched the order of the settings.

maxvonhippel commented 7 years ago

That said, we should change the getOrRequest behavior so that it checks YAML, checks

tedvh commented 7 years ago

gotcha on both items. thanks.

On 1/7/17 5:02 PM, Elliot Robinson wrote:

I forgot to ask when this came up what ternaries are.

Ternaries are the if/then/else format formatted as such:

? :

which is equivalent to

|if {

} else { } | It's just a shortcut, but if overused or poorly formatted they can become very hard to read. I use them several places in the code for brevity, and I have no issues with the way @maxvonhippel has used them here. The goal was to have command-line args supercede anything in the yaml file. This has been the case since the CLI functionality was implemented. The only reason it wasn't working for Shijing was that one of the command line flags he was using wasn't recognized because I had failed to implement it. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub , or mute the thread .
maxvonhippel commented 7 years ago

Sorry, on my iPhone. So it checks YAML, then command line, then asks for anything that's missing. Rather than YAML, ask for missing stuff, then command line.

argiopetech commented 7 years ago

You're right that's infinitely more elegant. Definitely feel foolish now for writing a billion ternary statements when I could have just switched the order of the settings.

No worries, a big part of that was reliant on my latency... Apologies again. Had I gotten to you before the code we could have nipped this in the bud.

That said, we should change the getOrRequest behavior so that it ... checks YAML, then command line, then asks for anything that's missing. Rather than YAML, ask for missing stuff, then command line.

Ted may be a better person to ask than me on this one. I'm of the opinion that the getOrRequest functionality is for cases where we (the code maintainers) have updated the YAML specification but the end user has not updated their local YAML file. The case where an end user has overridden a required YAML setting from the command line without updating their YAML file should never happen. Adding logic to handle this potential edge case seems excessive to me.

tedvh commented 7 years ago

On 1/7/17 7:42 PM, Elliot Robinson wrote:

That said, we should change the getOrRequest behavior so that it ... checks
YAML, then command line, then asks for anything that's missing. Rather than
YAML, ask for missing stuff, then command line.

Ted may be a better person to ask than me on this one. I'm of the opinion that the |getOrRequest| functionality is for cases where we (the code maintainers) have updated the YAML specification but the end user has not updated their local YAML file. The case where an end user has overridden a required YAML setting from the command line without updating their YAML file should never happen. Adding logic to handle this potential edge case seems excessive to me.

This is an interesting one. I hadn't thought of what you are saying Elliot, i.e. this is to fix out-of-date yaml files with (now) missing entries.

I'm not sure what you (Elliot) mean by "The case where an end user has overridden a required YAML setting from the command line without updating their YAML file should never happen". If you mean that I might specify a parameter in the YAML file, but then run BASE9 with a different value for that same parameter from the command line, that will indeed happen. I do that when I want to run the same data set through BASE9 N times with different parameters and just have a script do this with changing command line arguments.

This also raises an interesting issue. Can we easily let users know that their yaml file is out of date? In the case you mention, Elliot, where BASE9 wants a parameter that isn't even listed in the yaml file, would there be an easy way to recognize that and throw a warning? Not a stop, just a warning?

good discussions!

-Ted

argiopetech commented 7 years ago

I'm not sure what you (Elliot) mean by "The case where an end user has overridden a required YAML setting from the command line without updating their YAML file should never happen". If you mean that I might specify a parameter in the YAML file, but then run BASE9 with a different value for that same parameter from the command line, that will indeed happen. I do that when I want to run the same data set through BASE9 N times with different parameters and just have a script do this with changing command line arguments.

I mean that if I add a new parameter foo to the YAML file without telling you, and you therefore run BASE9 without updating your YAML file, it should ask you for a value for foo via Max's getOrRequest. There should very seldom be a case where you don't have foo in your YAML file but have used the theoretical --foo flag, such that Max's getOrRequest would trigger, but the value you specify to getOrRequest would then be overridden by the --foo flag you passed at the command line.

Basically, if that last thing were ever to happen, it's probably a sign that you have used an old version of your YAML file.

This also raises an interesting issue. Can we easily let users know that their yaml file is out of date? In the case you mention, Elliot, where BASE9 wants a parameter that isn't even listed in the yaml file, would there be an easy way to recognize that and throw a warning? Not a stop, just a warning?

That's pretty much what Max's code does now, in that it allows you to enter a value for that missing parameter and continue.

tedvh commented 7 years ago

Hi Elliot -

OK, it sounds like your order (yaml, prompt, command-line) does make sense, though that wasn't intuitive to me either. I'd suggest then, that if it isn't too cumbersome, Max's update could warn the user that the missing item has no entry in the yaml file. Maybe that is already obvious to Max, though it just occurred to me.

thanks gents, -Ted

On 1/7/17 8:11 PM, Elliot Robinson wrote:

I'm not sure what you (Elliot) mean by "The case where an end user has
overridden a required YAML setting from the command line without updating
their YAML file should never happen". If you mean that I might specify a
parameter in the YAML file, but then run BASE9 with a different value for
that same parameter from the command line, that will indeed happen. I do
that when I want to run the same data set through BASE9 N times with
different parameters and just have a script do this with changing command
line arguments.

I mean that if I add a new parameter |foo| to the YAML file without telling you, and you therefore run BASE9 without updating your YAML file, it should ask you for a value for |foo| via Max's |getOrRequest|. There should very seldom be a case where you don't have |foo| in your YAML file but have used the theoretical |--foo| flag, such that Max's |getOrRequest| would trigger, but the value you specify to |getOrRequest| would then be overridden by the |--foo| flag you passed at the command line.

Basically, if that last thing were ever to happen, it's probably a sign that you have used an old version of your YAML file.

This also raises an interesting issue. Can we easily let users know that
their yaml file is out of date? In the case you mention, Elliot, where BASE9
wants a parameter that isn't even listed in the yaml file, would there be an
easy way to recognize that and throw a warning? Not a stop, just a warning?

That's pretty much what Max's code does now, in that it allows you to enter a value for that missing parameter and continue.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argiopetech/base/pull/62#issuecomment-271107057, or mute the thread https://github.com/notifications/unsubscribe-auth/AEOeMs4Its8qNXTTGNsYusMeWDtNwiZmks5rP_GOgaJpZM4Lcdfx.

argiopetech commented 7 years ago

Done with 3106da6. The updated verbage for getOrRequest is:

No value was found in the YAML file for parameter 'pi'.
Please enter your desired value for this parameter: 3.1415

Notably, I've added the clarification for YAML, removes the newline after the colon (I tend to prefer same-line input unless the text line is very long), and uses cerr instead of cout to avoid potential issues with flushing.

tedvh commented 7 years ago

sounds good!

On 1/7/17 8:26 PM, Elliot Robinson wrote:

Done with 3106da6 https://github.com/argiopetech/base/commit/3106da615291dade59f250c96a5113fa57c2fed1. The updated verbage for |getOrRequest| is:

No value was found in the YAML file for parameter 'pi'. Please enter your desired value for this parameter: 3.1415

Notably, I've added the clarification for YAML, removes the newline after the colon (I tend to prefer same-line input unless the text line is very long), and uses |cerr| instead of |cout| to avoid potential issues with flushing.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argiopetech/base/pull/62#issuecomment-271107842, or mute the thread https://github.com/notifications/unsubscribe-auth/AEOeMiF_zK3S-TqUo7-vI1uRbvJD0P_mks5rP_TcgaJpZM4Lcdfx.