PaulWessel / CodingTasks

Organized issues tracker for tasks related to GMT
0 stars 0 forks source link

Basic information about the plan for long-format GMT options #1

Open PaulWessel opened 2 years ago

PaulWessel commented 2 years ago

Background

The most recent NSF proposal for GMT support discussed introducing more human-readable command-line options in GMT so that it would be possible to understand what GMT options are doing without having to consult a manual for translation. For instance, current a GMT command line might look like this:

gmt blockmean infile.txt -R0/30/0/30 -I2/1 -Wi > out.txt

where it is not clear to anyone new what this means. The anticipated long-format version of the same command would be

gmt blockmean infile.txt --region=0/30/0/30 --increment=2/1 --weights=in > out.txt

which is much clearer. GMT modules all rely on two parsers: _GMT_ParseCommon (an API function) for all common options (-R -J, etc.) and the static local functions parse (one in every module C file) for the module-specific options (e.g., -W above). In order to implement long-options the decision was made to introduce a translator from the long-format syntax to the corresponding short-format syntax that can be called just before the established parsers are called. A prototype translator called gmtinit_translate_to_short_options was coded and is tested just for the common options. The translator will depend on two arrays of structures:

  1. _gmt_commonkw: Common to all modules, it is defined in gmt_common_longoptions.h
  2. _modulekw: This is a local array of structures define at the top of each module C file and passed into _gmt_initmodule.

The _gmt_initmodule function does lots of stuff, now including the translation based on these two vocabularies. For now only a few modules have local structures, e.g., blockm*.c.

Option Template

The premise for the translation is that the two forms (long and short) are equivalent. This is not necessarily true for everything even though it certainly is true for many options. Ideally, short-format options should follow this template:

-<short_option>[<short_directives>][+<short_modifier1>[<argument1>]][+<short_modifier2>[<argument2>]][...]

while the long-format options look like

--<long_option>[=<long_directives>[:][<arg>][+<long_modifier1>[=<arg1>]][+<long_modifier2>[=<arg2>]][...]

Thus, depending if an option has directives or not, and if those take values, things can look like this (keeping modifiers off for now):

--<long_option>
--<long_option>=<arg>
--<long_option>=<long_directives>
--<long_option>=<long_directives>:<arg>

I think some options may take more than one directives or arguments, and these may be separated by commas or slashes (as specified in the structure).

Historically, some short options in GMT have changed from being extremely long and cumbersome to now still being complicated even after being split into multiple options. The most important one of those is the -B option for basemap annotation and frame selection. To not get bogged down in intractable situations I propose we simply wait with -B which I suspect will be a special case that cannot simply be translated back and forth. But I do not know yet. Below, we can explore which options satisfy or do not satisfy the round-trip translation criterion.

surface

Here is the list of non-common options available in surface:

-G<outgrid>[=<ID>][+d<divisor>][+n<invalid>][+o<offset>|a][+s<scale>|a][:<driver>[/<dataType>][+c<options>]]
-I<xinc>[+e|n][/<yinc>[+e|n]] 
[-A<aspect_ratio>|m]
[-C<convergence_limit>] [-D<breakline>[+z[<zlevel>]]]
[-Ll|u<limit>]
[-M<radius>]
[-N<n_iterations>]
[-Q[r]]
[-S<search_radius>[m|s]]
[-T[b|i]<tension>]
[-W[<logfile>]]
[-Z<over_relaxation>]

First problem is the -G output grid where we historically have used = to append a specific grid format code. Hence, that might look like a long-format directive but it is short-format syntax. However, all the rest seems straightforward to make into long-options:

--increments=<xinc>[+exact|number][/<yinc>[+exact|number]] 
[--aspect=<aspect_ratio>|middle]
[--convergence=<convergence_limit>]
[--breakline=<breakline>[+zvalue[<zlevel>]]]
[--limit=lower|upper:<limit>]
[--mask=<radius>]
[--iternations=<n_iterations>]
[--quick[=region]]
[--radius=<search_radius>[m|s]]
[--tension=[boundary|interior:]<tension>]
[--logfile=[<logfile>]]
[--relax=<over_relaxation>]

The -I is a near-common option and it is already implemented via the constant GMT_INCREMENT_KW defined in gmt_constants.h. I lists '/' as separator, meaning the arguments may be repeated with a slash between them.

Here is a surface command from one of the tests (test/surface/limits.sh) in short and then as a plausible long-format:

gmt surface @Table_5_11.txt -R0/6.3/0/6.3 -I0.1 -GLu_limits.grd -Lu901
gmt surface @Table_5_11.txt --region=0/6.3/0/6.3 --increment=0.1 --outfile=Lu_limits.grd --limit=upper:901

Selecting suitable long-format words for directives and modifiers

Because both PyGMT and especially GMT.jl have moved ahead and implemented lots of keyword/value pairs, we should strive to use their word choices unless we feel it is just the wrong phrase. Thus, it is likely that Paul will have opinions on any selection. Hence, the initial selection of words is not crucial since they are just a set of unique strings (per module). Roger can invent suitable phrases that he things makes sense and fill out the KW structure arrays. Once a module has long-options that work we can revise the actual words as we see fit.

rbdavis commented 2 years ago

Paul, this certainly helps, but many questions remain.

I do not understand how a long-option with multiple directives and/or multiple modifiers would be expressed, nor how any particular directive or modifier with multiple arguments (if that's possible) would be expressed. It seems that there must somehow be special separator characters between (i) one directive set (i.e., a directive and all its arguments together) and the next, (ii) a directive and its first argument, and (iii) each succeeding argument for the same directive after the first argument (if that's possible). And similarly for modifiers. With the latter, + is clearly the separator that indicates a modifier is to follow, but that's perhaps not a surefire guarantee as some arguments may be numeric values which may be plus-signed (either mantissa or exponent)?

Can you give me an example of a long-option specification with (i) three directives, where the first takes 0 arguments, the second takes 1 argument, and the third takes 2 arguments, and (ii) three modifiers, where again the first takes 0 arguments, the second takes 1 argument, and the third takes 2 arguments? Is such an option even possible, or are there restrictions as to how many directives or modifiers can be present and/or how many arguments any particular directive or modifier can take?

Here is my uninformed attempt at providing such an example for module longopttestmodule with long-option longopt, using a syntax format which is hopefully somewhat like what you have in mind (but almost certainly not exactly the same, I'm sure). longopt has 3 possible directives ld0, ld1, and ld2 taking 0, 1 and 2 arguments respectively. It also has 3 possible modifiers lm0, lm1 and lm2 again taking 0, 1 and 2 arguments respectively. Does the following look anything like what you have in mind?

gmt longopttestmodule \ --longopt=ld0::ld1:ld1arg::ld2:ld2arg0:ld2arg1+lm0+lm1:lm1arg+lm2:lm2arg0:lm2arg1

Here all arguments for any particular directive or modifier are separated from the preceding directive or modifier keyword and any other argument for that same directive or modifier by a single colon, while each directive set (i.e., a directive and its arguments) is separated from an adjacent directive set by a double-colon. Each modifier set (i.e., a modifier and its arguments) is separated from any adjacent modifier set and from any preceding directive set by the plus sign.

The above is not meant as a proposal (although I suppose it could be made one) but rather only an attempt by me to figure out how you are thinking that multiple directives/modifiers (each with possibly multiple arguments) will be specified. Actual use of the syntax I have imagined here could fail of course if existing GMT arguments make any use of single- or double-colons (as they may well do, I expect).

PaulWessel commented 2 years ago

I don't think there are options with three directives as you describe. Just from memory I think most options have a single directive only. A few certainly have more but they are probably comma or slash separated. So the code splits those and parses them separately; or, it is just a single argument and not a directive at all. One that comes to mind might be the -A option in blockm*n.c. In short-format mode it could look like this

-Azsl

I imagine this becoming

--fields=data,std,low

This is certainly more understandable and should have a one-to-one translation. So parsing this would not be hard: First strip off any modifiers (there is none here), then look for equal sign which means there are directives, then strtok along to parse each one and build the short version by appending the corresponding letters z, s, and l.

Let's look at another: -C in grdtrack. Here is an actual option from a test:

-C800k/50k/100k+d30+l

Here we are limited in that there are no directives setting the 2nd and 3rd argument; it is just one argument with no directives at all. If we stick to this syntax the long option would be something like

--crosslines=800k/50k/100k+deviation=30+left

Unless we made backwards-compatible changes to the original -C option we cannot do things like introducing new directives

--crosslines=length:800k,spacing:50k,inc:100k+deviation=30+left

Now that is much clearer, of course but we cannot easily do that via a table structure linking the past and the future. In fact, I do not think we have anything that sets several args individually like that in short-format. If there are many args they are typically slashed together, with modifiers sometimes doing the same thing (e.g., +o149/6p).

I think you should assume that life is simple and (for the part before any modifier) assume it really only is

--<long_option>
--<long_option>=<arg>
--<long_option>=<long_directives>
--<long_option>=<long_directives>:<arg>

where <long_directives> can be <long_directive1>,<long_directive2>,<long_directives3>,... and <arg>may be <arg1/arg2/arg3/...>. If we can that code up and let it loose on the 1000+ test we will quickly find the corner cases that may require some design changes, some backwards compatible changes to the old options to make them conform to the plan, or allow a few special cases.

Finally, the long-option syntax is not defined beyond what you have been told so far. If we find some glaring problem that prevents the translation then we can regroup, but I think it should work with how it is set up. Apart from some nightmare options like -B it should mostly be simple things like-Fs15c/0 -> --size=panel:15c/0 and -Ff20c/25c -> --size=figure:20c/25c (from subplot).

And one more "finally": It may well be that you may learn that it would have been great if we had so and so flag or count or something in those structures to separate A from B or to know about C, etc etc. That is all up to us of course.

rbdavis commented 2 years ago

Ok, I think for now it may be more productive for me to focus on working on the modules for a while rather than digging into the translator code or trying to come to a full understanding of the entire option syntax system. It was definitely useful to get some idea of what's going on in the translation, but it may be more enlightening at this point to get wider exposure to usage of the existing short-option forms so that I can better understand what's actually necessary and/or possible without an undoably major reworking of existing short-option practice. I had actually done surface before you made your original posting to this repo, but will re-work that to conform to your example (or discuss any points I don't understand or feel there may be an alternate solution to).

How were you thinking that these per-module long-option changes would be integrated into the repo? Will we modify, test and integrate the modules one at a time, or do we do all of the modules on a separate branch and not integrate until all are complete? I can see benefits to the latter approach, but that could take a long time. Also, is it sufficient for testing purposes to just run the stock build tests (by hand-translating the existing tests into long-option form and re-running them), or do we need to do more exhaustive testing of various option combinations on each module? I should think quite possibly the latter, but obviously don't feel particularly qualified to design those tests at least for now. (I could certainly get started by running hand-translated versions of the standard tests.)

PaulWessel commented 2 years ago

It is OK for you to write some of the module array structures. I suggest this kind of workflow:

  1. You start with the most recent master version
  2. Add a suitable array structure like it is done in blockm*.c and mess with it until it works as you expect (or not)
  3. Once you reach a point where you think it is good to be included, make a pull request for me to look at.
  4. I will then check out that PR, make changes if I think they are needed (e.g., change a verb or a noun...) and then approve.
  5. You then merge into master.

This option (your first choice listed) is OK since all the long-option stuff is under the separate control of the cmake settings [add_definitions(-DUSE_COMMON_LONG_OPTIONS)] which is not defined for users, hence what we do with these changes has no effect on other people's GMT usage. Thus we can keep repeating the above for as many modules as we want, but should do one at the time to keep the PR and approvals straightforward. The alternative would have been to create a branch that we would keep adding commits to, and then having to merge in from master to the branch all the time to stay current with master. Doable too, but from prior branching I think it is always tricky to work on a branch for a very long time and avoid getting out of sync with master (since there will continue to be lots of bug fixes and other things going on there).

As for seeing examples of short-option syntax the best place would be to look in test/ and the test scripts there and see what those options look like for that module. Some have just a few tests while others (e.g., pscoast) have lots (due to bug fixing).

Makes sense?

rbdavis commented 2 years ago

That sounds good, Paul. I had forgotten about the cmake flag, that should allow us to minimize merge problems as you say while simultaneously keeping most users away from long-option usage for the foreseeable future (in case we feel the need to rework things as issues arise). I suppose clever users can circumvent that hurdle, maybe I'll add a comment that in the cmake template that premature usage of this is caveat emptor?

rbdavis commented 2 years ago

Hi Paul, here's what I have come up with for surface. I changed a number of my original tokens to match your suggestions, but left others for further debate (see below). You decide and just tell me which you prefer. In the meantime I will build and start testing this module, and I will change the tokens to whatever you want after you tell me. I used a little creativity on indentation here for readability -- everything's still under 80 columns, but even if they were a bit longer it still might be worthwhile. LMK what you think on that point and I will follow your style directives.

        GMT_INCREMENT_KW,       /* Defined in gmt_constants.h since not a true GMT common option (but almost) */
        { 0, 'A', "aspect",        "m", "middle",              "", "" },
        { 0, 'C', "convergence",   "", "",                     "", "" },
        { 0, 'D', "breakline",     "", "",                     "z", "zvalue" },
        { 0, 'L', "limit",         "l,u", "lower,upper",       "", "" },
        { 0, 'M', "maskradius",    "", "",                     "", "" },
        { 0, 'N', "maxiterations", "", "",                     "", "" },
        { 0, 'Q', "quicksize",     "r", "region",              "", "" },
        { 0, 'S', "searchradius",  "", "",                     "", "" },
        { 0, 'T', "tension",       "b,i", "boundary,interior", "", "" },
        { 0, 'W', "logfile",       "", "",                     "", "" },
        { 0, 'Z', "relax",         "", "",                     "", "" },
        { 0, '\0', "", "", "", "", ""}  /* End of list marked with empty option and strings */

Here are the unresolved (at least in my mind ;-> ) differences between your tokens and mine:

(1) You suggested --increments, but GMT_INCREMENT_KW is already pre-defined as --increment. Would you like to pluralize the latter #define?

(2) You specified [--mask=<radius>] and [--radius=<search_radius>]. However, as these are both radius parameters I think it's confusing to call only one radius -- if that's specified by itself it will not be clear which radius is being set.

(3) Typo in [--iternations=<n_iterations>], obviously, but I called it maxiterations to emphasize that it is really a maximum and not a guaranteed iteration count (at least by my reading of the man page).

(4) I used --quicksize to provide some description of which aspect of the processing is being enquickened. I know it's longer, and maybe not really necessary, but whatever.

Let me know what you think whenever you get around to it, in the meantime I will search for all surface tests and try to re-run them with hand-mangled longopts. This will be a good repo exploration minitask as I have not looked into any of the testing yet other than to simply run everything.

PaulWessel commented 2 years ago

[1] stick with the previous definition. [2-4]: I did not yet check what PyGMT and GMT.jl have selected for these. Unless they disagree we should stick with their choices - I can always review and override if I feel it is just wrong (Joaquim's english is fragile at times but the PyGMT have lots of native speakers even those there are plenty of furreighners there too). [4] THis option recently got enhanced so it is possible their versions do not have the new optional directive.

rbdavis commented 2 years ago

Testing surface changes today, with two questions at the moment:

(1) Seems like the -T option was intended such that you be able to specify interior and boundary tension independently, but I get an error when I try the shortopt strings -Ti0.5 -Tb0.9 saying that I cannot use multiple -T opts. Is this a case where you need to use the comma and say -Ti0.5,b0.9, in which case the longopt translation table should specify a comma separator instead of 0?

(2) Looks like the existing --verbosity common longopt is not working properly:

> gmt surface -V --aspect=2
gmt [INFORMATION]: Reformatted options: -V -A2
surface [ERROR]: Must specify -R option
> gmt surface --verbosity --aspect=2
surface [ERROR]: Must specify -R option

(-V forces display of the reformatted options but --verbosity does not.) Will be investigating -- possible bug in the translator as the common table entry for that looks OK, unless you have other ideas?

PaulWessel commented 2 years ago

Right, the -T issue is a relatively recelty introduced bug. You will see the use of n_errors += gmt_M_repeated_module_option (API, Ctrl->T.active); which was added in almost every place to ensure people are not allowed to set an option twice (usually). The few places were it is OK (-B, and -T as here) we did not add that. But here I goofed. As for the solution, see how the -L case is handled. Perhaps you can fix -T the same way, i.e.Ctrl->T.active[2] with separate setting for interior and boundary.

[2] Good one. See line 74 in gmt.c for the explanation. As for solution, perhaps the simplest one is (for now) just have if (modified) in gmt_init.c since nobody is using long-options yet. Alternative would be to handle --verbosity in gmt.c and perhaps that is getting messy.

rbdavis commented 2 years ago

I tried the following and it gave no error messages (other than those about other missing options):

> gmt surface -Ti0.5,b0.8

Are you saying that this does not actually work the way one would expect it to?

I was wondering if verbosity was a special flag that was skimmed off the top before full arg parsing was done. Guess that's what's happening. Will implement your if (modified)solution for now.

PaulWessel commented 2 years ago

Yes, pretty sure things are silently ignored as the parser seems to just pick the first letter and if i or b then set just that else set both. Most GMT option parsers do not do fool-proof checking and check there is no "junk" after the 0.5, for instance. We could of course do that but the attitude has usually been to not spend too much time on making it fool-proof.

rbdavis commented 2 years ago

OK, finally some progress ...

(1) I think I have fixed the -T problem but am uncertain about the existing code block there which does something with gmt_M_compat_check (GMT, 4). I am assuming for now that by the time thatif (gmt_M_compat_check (GMT, 4)) ... else code block is done, modifierwill be correctly set to b,i, or the first character of a numeric string. If this is so then I think my change will work. Here's the whole case -T: code block, please let me know if this looks correct:

               case 'T':
                            k = 0;
                            if (gmt_M_compat_check (GMT, 4)) {      /* GMT4 syntax allowed for upper case */
                                    modifier = opt->arg[strlen(opt->arg)-1];
                                    if (modifier == 'B') modifier = 'b';
                                    else if (modifier == 'I') modifier = 'i';
                                    if (!(modifier == 'b' || modifier == 'i'))
                                            modifier = opt->arg[0], k = 1;
                            }
                            else {
                                    modifier = opt->arg[0];
                                    k = 1;
                            }
                            if (modifier == 'b') {
                                    n_errors += gmt_M_repeated_module_option (API, Ctrl->T.active[BOUNDARY]);
                                    Ctrl->T.b_tension = atof (&opt->arg[k]);
                            }
                            else if (modifier == 'i') {
                                    n_errors += gmt_M_repeated_module_option (API, Ctrl->T.active[INTERIOR]);
                                    Ctrl->T.i_tension = atof (&opt->arg[k]);
                            }
                            else if (modifier == '.' || (modifier >= '0' && modifier <= '9')) {
                                    /* specification of a numeric string with no b or i directive will
                                       set both tension values, meaning that we must test that neither
                                       has already been set via some preceding -T specification */
                                    n_errors += gmt_M_repeated_module_option (API, Ctrl->T.active[BOUNDARY]);
                                    n_errors += gmt_M_repeated_module_option (API, Ctrl->T.active[INTERIOR]);
                                    Ctrl->T.i_tension = Ctrl->T.b_tension = atof (opt->arg);
                            }
                            else {
                                    GMT_Report (API, GMT_MSG_ERROR, "Option -T: Unrecognized modifier %c\n", modifier);
                                    n_errors++;
                            }
                            break;

(2) Verbosity turned out to have yet one more twist as the initial verbosity level was so low that the GMT_Report() returned without printing anything. Here is the fix:

#if 0
        if (modified && gmt_M_is_verbose (API->GMT, GMT_MSG_INFORMATION)) {     /* Echo the converted options */
                char *cmd = GMT_Create_Cmd (API, *options);
                GMT_Report (API, GMT_MSG_INFORMATION, "Reformatted options: %s\n", cmd);
        /* we actually cannot yet tell if gmt_M_is_verbose(... GMT_MSG_INFORMATION ...)
           will come to pass without working a bit harder as args are not yet fully parsed,
           and also note that at this point we will always be at GMT_MSG_WARNING or whatever
           default verbosity level the program is initialized with -- fix this later (or not)! */
#endif /* 0 */
        if (modified) {
                char *cmd = GMT_Create_Cmd (API, *options);
                GMT_Report (API, GMT_MSG_WARNING, "Reformatted options: %s\n", cmd);
                GMT_Destroy_Cmd (API, &cmd);    /* Free string */
        }
}

Will resume surface testing tomorrow.

rbdavis commented 2 years ago

Any reason you did not configure the --verbosity directives {quiet, error, warnings ...} in gmt_common_longoptions.h?

PaulWessel commented 2 years ago

No, either forgot or something. Please add those directive and thanks for noticing!

rbdavis commented 2 years ago

Issue and pull request just submitted on surface long-options.

I pre-tested all (I think) new long-option translations for surface, also ran the standard GMT tests after hand-editing the tests in test/surface to use all available long-options. No errors on the latter (unsurprisingly, as the long-option translation of the commands clearly translated them back to their original short-option forms).

Will now proceed to pscoast as you suggested. If you have any additional modules of interest to prioritize in this queue, just let me know!

rbdavis commented 2 years ago

Notice there are conflicts: See if you can handle those - use the resolve conflicts and manually do it. This happens when you submit a PR without first checking if there has been any changes to master, i.e., try to merge master into the PR before posting. Let me know if you need help.

Hmmm, not sure exactly what I did wrong? My recollection from doing my first test iteration on this process (from your initial instructions when we started a few weeks back) was that I (i) pulled from the master branch just prior to making changes, (ii) made the changes, (iii) created and switched to a new branch, (iv) published the branch, and finally (v) made the pull request. Is this not the correct way to do this? (I did steps #1 and #3 via manual git commands, and step #4 via GH desktop.) I think I did exactly the same on this round, but maybe I left something out?

The changes I made on this are all pretty localized, and almost all in areas where I would not expect anyone else to be working. I'm sure I can manually combine the conflicting versions with no trouble at all, but as far as managing it within git I have no idea what's the best way. My inclination would be to make manual copies outside of git of the files I changed, wipe out the entire src directory, switch back to master branch, re-pull on the master branch, merge my changes manually, and then make another new branch and commit to that. Should I proceed in that manner or do something else?

rbdavis commented 2 years ago

Down the road we will need to consider how to document these options in our documentation...

I was wondering about man pages on this, and whether or not you wanted to hold back on the documentation until we are ready to make long-option usage public? Besides the man pages, don't we also need to be working (or not?) on the full verbose usage/help message displayed by the modules themselves?

PaulWessel commented 2 years ago

The conflict: If you look at your changes I think you also added or changed many tabs here and there? So that makes it hard for git diff to figure out where things changed. It looks like most of the differences due to tab or space? Perhaps your number of spaces for tab had a different setting than mine and things looked out of alignment?

I do most conflict fixing by going back to Git Desktop, make sure I have selected the right branch (surface long-opts) then pull down the branches menu and at the bottom it says "Choose a branch to merge into surface_long-opts). I use that and select master, then do do the merge conflict and it lets me edit the conflicted file in the selected editor.

PaulWessel commented 2 years ago

Walking home from office now but will be online after dinner.

rbdavis commented 2 years ago

Breaking for breakfast now myself, but will be back after and try your merge instructions then!

rbdavis commented 2 years ago

On Wed, 13 Jul 2022, Paul Wessel wrote:

The conflict: If you look at your changes I think you also added or changed many tabs here and there? So that makes it hard for git diff to figure out where things changed. It looks like most of the differences due to tab or space? Perhaps your number of spaces for tab had a different setting than mine and things looked out of alignment?

You're correct, I did add a bunch of spaces to gmt...longoptions.h to keep things looking tidy when I added all the --velocity directives.

I do most conflict fixing by going back to Git Desktop, make sure I have selected the right branch (surface long-opts) then pull down the branches menu and at the bottom it says "Choose a branch to merge into surface_long-opts). I use that and select master, then do do the merge conflict and it lets me edit the conflicted file in the selected editor.

I'm in GHDesktop now. When I enter that menu and select master in the popup dialog, there is a button at the bottom that says 'Create a merge commit'. Is that what I want to do?

PaulWessel commented 2 years ago

Yes, and then you get a choice to do this in the editor you have chosen or choose.

rbdavis commented 2 years ago

Arrrgh... Apparently I've made quite the mess of this.

When I was given a choice on how to edit the file, it seemed to be Xcode (which I didn't want), default (which I had no idea what that might be) or a link that said something about use the command line. I clicked the latter and it opened a Terminal, but left me no clue what to do. Do I just type 'vi filename' there? But which file? There were 3 files changed, and I have no idea if GHDesktop expects me to work on the files where they were in the repo directory, or if it's got another version somewhere else, or what?

So, I said screw this, I'll start over and first try to set up vi as a default. I killed the Terminal window that was opened, then poked around the GHDesktop Preferences to see if I could figure out how to do that but failed at discovering anything useful on that point (i.e., setting vi as a default). I then went back to the beginning and tried to repeat the whole process, but when I click the 'Create a merge commit' button I now get a popup error that reads 'There are unresolved conflicts in the working directory.) and I appear to be stuck.

Any ideas?

On Wed, 13 Jul 2022, Paul Wessel wrote:

Yes, and then you get a choice to do this in the editor you have chosen or choose.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you wereassigned.[ADEMQVZTFH54KXDS4L7VU4DVT4JWXA5CNFSM53CFSCJ2YY3PNVWWK3TUL52HS4DFV REXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOI2GDDSQ.gif] Message ID: @.***>

PaulWessel commented 2 years ago

Did you try to open the gmt_common_longoptions.h in your src dir while the branch is selected? Look for the >>>> <<<< ==== borders. If not let me know.

rbdavis commented 2 years ago

On Wed, 13 Jul 2022, Paul Wessel wrote:

Did you try to open the gmt_common_longoptions.h in your src dir while the branch is selected? Look for the >>>> <<<< ==== borders. If not let me know.

I just opened that file in vi and see the following:

[ bunch of unchanged stuff at the top ... ]

<<<<<<< HEAD

[ new version of translation table defs here including added velocity directives ... ]

=======

[ old version of translation table defs here without new velocity directives ... ]

master

Presumably I edit away all the <<<..., ===... and >>>... lines and resolve all the old and new stuff between those lines?

After I do this for all changed files (if they contain this conflict markup), what do I do then?

Roger

PaulWessel commented 2 years ago

Yes, I think HEAD is git speak for the very latest (your branch). So get rid of === and >>> and what is in between and the <<<Head line and it should look like what you want, then commit and the conflict should go away

rbdavis commented 2 years ago

OK, so I clean up the files and do a 'git commit' while I'm in the surface-longopts branch? Then what? Sorry, but I've forgotten if there's other stuff I need to do to merge things into the master branch, or if this just bounces back to you as the PR reviewer and you do some more stuff to make that happen and then all is done.

On Wed, 13 Jul 2022, Paul Wessel wrote:

Yes, I think HEAD is git speak for the very latest (your branch). So get rid of === and >>> and what is in between and the <<<Head line and it should look like what you want, then commit and the conflict should go away

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you wereassigned.[ADEMQV5QDHMDIKRXTKPUNJDVT4O7JA5CNFSM53CFSCJ2YY3PNVWWK3TUL52HS4DFV REXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOI2GMHZQ.gif] Message ID: @.***>

rbdavis commented 2 years ago

The only file with the git markup was gmt_common_longoptions.h which I fixed and then did

git add gmt_common_longoptions.h git commit

There were no errors when I did this, but I did notice a bunch of other files mentioned when I did a 'git status' first (just to verify I was in the right branch). These all looked like stuff that other people have been working on from what I can see from the notification chatter in my e-mail, e.g.:

git status On branch surface-longopts Your branch is up to date with 'origin/surface-longopts'.

You have unmerged paths. (fix conflicts and run "git commit") (use "git merge --abort" to abort the merge)

Changes to be committed: modified: ../.github/workflows/docker.yml modified: ../.github/workflows/docs.yml modified: ../INSTALL.md modified: ../README.md modified: ../ci/build-gmt.sh modified: ../ci/download-coastlines.sh etc. ...

Will wait for your response before doing anything further on this point, and will start goofing with pscoast in the meantime.

I have to run out for an errand (hand-delivering a $14,000 check for our new PV addition because I'm too paranoid to trust it to USPS) and will be back in an hour. If I don't catch you around then have a good what's-left-of-your evening and see you tomorrow!

Roger

On Wed, 13 Jul 2022, rbd wrote:

OK, so I clean up the files and do a 'git commit' while I'm in the surface-longopts branch? Then what? Sorry, but I've forgotten if there's other stuff I need to do to merge things into the master branch, or if this just bounces back to you as the PR reviewer and you do some more stuff to make that happen and then all is done.

On Wed, 13 Jul 2022, Paul Wessel wrote:

Yes, I think HEAD is git speak for the very latest (your branch). So get rid of === and >>> and what is in between and the <<<Head line and it should look like what you want, then commit and the conflict should go away

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you wereassigned.[ADEMQV5QDHMDIKRXTKPUNJDVT4O7JA5CNFSM53CFSCJ2YY3PNVWWK3TUL52HS4DFV REXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOI2GMHZQ.gif] Message ID: @.***>

PaulWessel commented 2 years ago

OK. was writing this but never hit Comment:

If you commit to the branch then I should see it in GitHub that the conflict is gone. Once we are OK then you will click merge on the github page once tests have run and you have the green buttons

Yes, not clear how long I will be up. I will look at GitHub to see status etc.

PaulWessel commented 2 years ago

I see nothing on the branch on GitHub so you have pushed your changes to the branch. After I make changes locally to a branch I usually use the desktop to commit (but CLI is fine) but the next step is tp push to origin in github desktop (or a CLI equivalent). Only then will I see anything on GitHub changing.

PaulWessel commented 2 years ago

I should be awake until 11pm-ish here but if we dont get to finish we can do so tomorrow. I am pretty sure you are missing the push to origin so that GitHub knows about the changes and can clear the conflict.

I guess the lesson is to be careful with tabs and spaces. I fight this myself. My Sublime Text apparently decides based on the first few lines in the file whether to use tabs or groups of spaces as default when I hit TAB. Drives me insane and if I am not careful to switch I will get up with commit like yours which leads to these large diffs which can lead to conflicts that are best avoided.

rbdavis commented 2 years ago

OK, understood. I tried to match tab vs. space usage in the file, i.e., where there were already a bunch of adjacent spaces and I wanted to increase spacing to keep the columns aligned I added spaces rather than tabs. Guess it still wasn't happy about that!

On Wed, 13 Jul 2022, Paul Wessel wrote:

I should be awake until 11pm-ish here but if we dont get to finish we can do so tomorrow. I am pretty sure you are missing the push to origin so that GitHub knows about the changes and can clear the conflict.

I guess the lesson is to be careful with tabs and spaces. I fight this myself. My Sublime Text apparently decides based on the first few lines in the file whether to use tabs or groups of spaces as default when I hit TAB. Drives me insane and if I am not careful to switch I will get up with commit like yours which leads to these large diffs which can lead to conflicts that are best avoided.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you wereassigned.[ADEMQV3D56V6U3Y6YX47OYTVT4XOZA5CNFSM53CFSCJ2YY3PNVWWK3TUL52HS4DFV REXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOI2G26NQ.gif] Message ID: @.***>

rbdavis commented 2 years ago

Any progress on determining the longoptions strings currently in use for Python/julia? It would probably be good to know where in the documentation and/or source these can be found before I work through too many more modules and then have to go back and correct inconsistencies.

rbdavis commented 2 years ago

Re: module verbose usage messages -- if you like I could #ifdef the extended usage message to supply longoption documentation insofar as USE_COMMON_LONG_OPTIONS or USE_MODULE_LONG_OPTIONS are defined. Shall I do this?

I'm wondering in general how much of a reveal you want to do on this stuff before all is finished. This includes both the usage messages and the man pages. The downside of making any of this publicly visible before it's complete is that we may get 3/4 of the way through and then realize we need to go back and make changes for purposes of consistency, etc. This process could take a while and if users start building scripts with documented longoptions and we pull the rug out from under them there could be a howl of protest?

rbdavis commented 2 years ago

Just looked at the PR on surface and it looks like you ok'd all, so I just did the S&M and assume this is now complete!

rbdavis commented 2 years ago

Some time after doing the surface-longopts S&M I got a github error e-mail message:

Subject: [GenericMappingTools/gmt] Run failed: Docker - master (eed9339)

Ignore this?

rbdavis commented 2 years ago

Finished grinding through pscoast options. Generally easier than working through surface with the added understanding gained from work on that module, but still lots of per-option detail questions for you. I have appended below my proposed translation table for you to look over. There are two things of note there.

First, I have experimented with a new formatting style for these mega-long-line tables. I did this at first just to make it more legible for you when shoehorned into this github page, but I think it really makes the code more generally readable period. The general form is to use three lines per option, but options with no directives or modifiers are compressed to a single line without loss of legibility, I think. Shall I permanently adopt this translation table format style for the real source code?

Second, I have posed my per-option questions for you by scattering them through the table as quasi-comments whose lines begin with ??. I did this mostly to make it easier for myself to edit the table into final form after I read your responses.

Anyway, here's the pscoast translation table with new formatting style and embedded questions:

static struct GMT_KEYWORD_DICTIONARY module_kw[] = { /* Local options for this module */
        /* separator, short_option, long_option, 
                  short_directives,    long_directives, 
                  short_modifiers,     long_modifiers */
        { 0, 'A', "area",
                  "",                  "",
                  "a,l,r,p",           "antarctica,lakes,riverlakes,percentexcl" },
        { 0, 'C', "wetfill",
                  "",                  "",
                  "l,r",               "lakes,riverlakes" },
??      /* does -C also offer all -G fill parameters? */
        { 0, 'D', "resolution",
                  "",                  "",
                  "f",                 "lowfallback" },
??      /* -E not usable because of = usage within parameters? */
        { 0, 'F', "framepen",
                  "l,t",               "scale,rose",
                  "c,g,i,p,r,s",       "clearance,fill,inner,pen,radius,shade" },
        { 0, 'G', "dryfill",
                  "p,P",               "pattern,invpattern",
                  "b,f,r",             "background,foreground,dpi" },
        { 0, 'I', "rivers",            "", "", "", "" },
        { 0, 'J', "zaxis",
                  "z,Z",               "scale,width",
                  "",                  "" },
        { 0, 'L', "mapscale",
                  "g,j,n,x",           "mapcoords,jcode,boxcoords,plotcoords",
                  "w,a,c,f,j,l,o,u,v", "length,align,loc,fancy,janchor,label,anchoroffset,units,vertical" },
        { 0, 'M', "multisegment",      "", "", "", "" },
        { 0, 'N', "boundaries",        "", "", "", "" },
        { 0, 'Q', "markclipend",       "", "", "", "" },
??      /* -S same parameters as -C or -G or what? */
??      /* -Td and -Tm not doable because they have inconsistent short modifiers? */
        { 0, 'W', "shorelines",        "", "", "", "" },
??      /* -d (i) Synopsis usage, (ii) Optional Arguments usage, and
??         (iii) general -d usage all inconsistent; also, why does
??         -d in gmt_common_longopts.h not translate +c short modifier? */
??      /* why does -t in gmt_common_longopts.h not translate +f and +s short modifiers? */
??      /* why are -^, -+ and -? not in gmt_common_longopts.h? */
        { 0, '\0', "", "", "", "", ""}  /* End of list marked with empty option and strings */
};
PaulWessel commented 2 years ago

Just looked at the PR on surface and it looks like you ok'd all, so I just did the S&M and assume this is now complete!

Yes, looks good - no complaints.

PaulWessel commented 2 years ago

Ignore this?

Yes. Merging causes all sorts of CI action and I often gets lots of emails about failed runs etc. The only ones that matters are the ones that are run automatically when we do a PR - ie., getting gree buttons. Sometimes there are failures there too and you can descend into them to see why - often due to unrelated issues on that server.

PaulWessel commented 2 years ago

Comments back on the longer one above:

  1. Yes, the layout of the code is fine with me and will please Joaquim who has a 80 char screen (kidding). I have a 40" TV so there are no long lines!
  2. Both -S and -G are identical except wet vs dry. They can both set fill or set clip.
  3. The -C does no clip but can set lake color, plus it has modifiers
  4. I will think about the trouble options -E -T -d and let you know.
  5. The -t grew some modifiers last 2 years so there may have been some lag. Some modifiers are not available in all modules. There are also at least one non-common option that is used a lot (-Tmin/max/inc for generating arrays) that is similar. I will look into this and advice. For now ignore.
  6. THe -^ and -? was probably just ignored as not important, but they should get long versions too. Maybe --synopsis and --help. There is also -+ which is help without the common options. Most of us simply use - and + and (no arg) for the full help. Maybe --short-help for -+.

As for the long-option words: -A and --area is fine. I see pyGMT use area_thresh which I dont like while GMT.jl uses area. -C needs to be --lakes -G and -S: Since they either set clip or pain I think we should not have "fill" in the option, just --land and --water. PyGMT and GMT.disagree both use water and Julia adds ocean as an alias. -D. Good, I think +fallback is shorter and retains the f starting letter Fill arg: No way to know for you, but the p|Ppattern stuff is not considered a directive. I know - not consistent. We will for now treat that as just a string argument to be parsed by the _gmtgetfill function. There are so many ways to set a color (##hex is one) so we dont want to go there. So no directives and modifiers related to any color (C -G -S) -M. multisegment is just a format for our tablles. Joaquim picked dump which is not so nice. Perhaps --list ?

As you can see, messy with all the names we need to come up with. We will go voer all these again when we get to full testing stage and get input from the pyGMT and GMT.jl peopl.

rbdavis commented 2 years ago

Maybe --short-help for -+.

Given the profusion of non-alphanumerics in GMT option strings I would avoid the use of any such (including hyphens) in the longopt strings as it makes them harder to visually parse when there's any following material. How about --shortHelp, i.e., any multi-word longopts string will capitalize the first letter of every non-initial word in the string? I know that hyphens are commonly used in many GNU, etc., longopts, but given GMT's option complexity I would opt for minimizing syntactic confusion over trying to look like GNU (which I think GMT options will never do anyway). Let me know what you want to do.

So no directives and modifiers related to any color (C -G -S)

So, you just want

{ 0, 'C', "lakes", "", "", "", "" },
{ 0, 'G', "land", "", "", "", "" },
{ 0, 'S', "water", "", "", "", "" },

Any thoughts on my documentation questions:

Re: module verbose usage messages -- if you like I could #ifdef the extended usage message to supply longoption documentation insofar as USE_COMMON_LONG_OPTIONS or USE_MODULE_LONG_OPTIONS are defined. Shall I do this?

I'm wondering in general how much of a reveal you want to do on this stuff before all is finished. This includes both the usage messages and the man pages. The downside of making any of this publicly visible before it's complete is that we may get 3/4 of the way through and then realize we need to go back and make changes for purposes of consistency, etc. This process could take a while and if users start building scripts with documented longoptions and we pull the rug out from under them there could be a howl of protest?

PaulWessel commented 2 years ago

First a post on --shortHelp etc:

I wrote too quickly. In Julia, short-help would be seen as a subtraction between two variables so to prevent Joaquim from using any longoption name we come up with it cannot have hyphens anyway. I see GMT.jl tends to use underscores. I remember now a discussion where the team felt we should rack out brains a bit and try to come up with synonyms whenever we feel the urge to use some_name. There usually is another word without hyphen/underscore.

I am not sure about using capitals. I suggest we do --shorthelp for now.

WHile the repo is available to anyone who wants and they can mess around with cmake configures, none of what we do in master is official until we issue 6.5.x etc. So I have no concerns about that. The only concern we have is making sure we do not introduce breaking changes into master. As for the docs it is a bit different since the CI builds the latest dev doc each night. So if we add --longoptions to the RST docs then the next day someone will click on the dev docs link (it is first ahead of 6.4) and they will get confused. Hence we hold off with the docs for now.

As for --lakes. That one does have two modifiers +l and +r (basically you can give -C twice, e.g., -Cred+l -Cblue+r), with +lake and +river. Joaquim called -C --river which is not accurate. It is a lake option which can be tuned to set the color or river-lakes (lakes part of rivers that are too big to be represented by a line (like in -I) and must be a polygon).

rbdavis commented 2 years ago

OK, so:

(1) no non-alphabetics or capitals in longopts for now;

(2) no documentation (either man page or in-module usage message) for now; and

(3) -C (--lakes) gets +l and +r modifiers, while -G and -S get none.

PaulWessel commented 2 years ago

Yep, looks good. FYI, I will work on something for -Ggridoutput since it is used in tons of modules so it should be defined once like the GMT_INCREMENT_KW in gmt_constant.h. There are others like the -T I mentioned.

rbdavis commented 2 years ago

Thanks, I should finish up pscoast pretty soon, if you have any priorities re: the rest of the modules just let me know.

PaulWessel commented 2 years ago

I dont have a good system for this but we should mix things up between tools that do processing and plot. Perhaps

pswiggle grdimage grdvolume gmtselect

My thought on how to get through 150 modules: You and I do maybe 10%. By that time we have learned a lot and can then ask for volunteers to help by following a script and the existing examples. As you can imagine, it is going to get boring to do 150.

rbdavis commented 2 years ago

Will do in that order.

I was wondering how long it would take to do all of these, and definitely would not mind not doing all of them. However, it's also useful for me to wade through the GMT option universe for a while longer, I think, so I'm definitely not complaining yet. I'll let you know when I feel like I've absorbed enough.

rbdavis commented 2 years ago

OK, what's up with coast vs. pscoast? I see this has something to do with classic and modern modes?

Until now I have been using the Modules link on the doc page to get to module command syntax descriptions (i.e., man pages), which I now presume is actually a reference to modern modules usage, whereas the Modules (Classic Mode) link goes to the classic pages. I just finished editing and compiling pscoast.c, but when I tried

>  gmt coast --verbosity
gmt [ERROR]: Shared GMT module not found: coast

I got the bizarre error you see above. (This seems to be triggered by the --verbosity longopt, as without it the module invocation seems to work.)

I eventually figured out that I need to do

>  gmt pscoast --verbosity
gmt [WARNING]: Reformatted options: -V

which behaves better (at least in this one case).

I don't see any coast.c, so what source is the coast module built from? Is that also built from pscoast.c but somehow invoked within a different environment or something? If I have edited pscoast.c to support longopts, should they now be working in both pscoast and coast?

I'm seeing other strange behavior. I added -^ to the common longopts, and although pscoast seems to accept that argument in short form

>  gmt pscoast -^
gmt pscoast [core] 6.5.0_c723991-dirty_2022.07.05 - Plot continents, countries, shorelines, rivers, and borders

it acts very strangely with the long form:

>  gmt pscoast --synopsis
gmt [WARNING]: Reformatted options: -
pscoast [ERROR]: Unrecognized option -^

Here, (i) the Reformatted options output looks mangled (missing ^ ), but (ii) although pscoast itself apparently sees the complete -^ argument, it claims that it does not recognize it as a valid option and never prints the desired usage message. I will be running this in the debugger now, but if you have any insight into any special behavior of the -^ option I'd love to hear.

Looking back at surface, when I look at the module documentation for that it seems that both the modern and classic documentation links reference the exact same man page. The coast and pscoast man pages are different although I have not yet looked closely to figure out how.

rbdavis commented 2 years ago

Testing other aspects of the new pscoast longopts right now. Just realized I overlooked directives for -D. Shouldn't this translation be

        { 0, 'D', "resolution",
                  "f,h,i,l,c,a",       "full,high,intermediate,low,crude,auto",
                  "f",                 "lowfallback" },

Also, just encountered a new class of translation bug to watch for: you can never match a module-specific longopt string if any consecutive subset of its leading characters (starting from the first character) matches a common longopt string. For instance, a module-specific longopt string --mybadself will never be found in the translation search if there is a common longopt string --mybad. Just discovered this with the new --framepen from pscoast (which I have now renamed --frmpen, please advise if you want something else) which is hidden by the --frame common longopt.

rbdavis commented 2 years ago

Figured out what's up with -^ and --synopsis, etc. I imagine you already know (or at some point knew) this, but -^ , -+ and -? are special shortops that seem to trigger their intended behavior only when they are the first argument supplied, e.g.,

gmt pscoast -^

prints the expected usage message but

gmt -pscoast -V -^

does not. Also, these special args are processed (and the module exits) before any other args are parsed. This happens in gmt_report_usage(). I can certainly add special code there to look for the longopts you suggested (probably immediately prior to line #19924 in gmt_init.c). Shall I do that?

In the meantime I'll strip those longopts out of the common.h file as they serve no purpose there.