Better standardization of usage options before long-option decisions

PaulWessel commented 3 years ago

After the common options, the next tier is the "near-common" options, e.g., -Ggrdfile, -Ccpt, and many others. However, in researching what is used and sorting by occurrence, I get a typical list like this:

   2 [-N<trktable>]
   2 [-N<table>]
   2 [-N<nodefile>]

In all cases these are options that take the name of a table. I think all of these should become -Ntable, and the usage message can then provide any context needed. This would clean up the short usage messages. However, I note this does not translate easily to long options. There, --trackfile=table, --new-data=table, and --nodefile=table may be more natural since they provide context directly in the synopsis. I think this means the non-common long-format options may in many or most cases be worded uniquely for the module, hence there will be less reuse of keyword/value pairs than I initially imagined.

gd-a commented 3 years ago

What about --input=table? (And separate handling only for --colormap=cpt.)

PaulWessel commented 3 years ago

Remember these (in the case of -Ntable) are extra inputs to modules that are already reading their main input, so they cannot use --input, I think. I guess the problem is that context is often conveyed in the longer names, so things like --secondary-input=file are not terribly more informative than -N. But --node-file=table clearly tells you something about that option, though it only applies to one of the modules, etc.

gd-a commented 3 years ago

I guess there's a balance between being very specific and not enough. --secondary-input could be the generic name the same way --main-input (just input please) is. Then you rely on the documentation to highlight if it is a node, a grid, a table or whatever.

This way would be consistent with a more "standardized/generic" way of calling functions IMO.

Edit: it could also be given as a comma-separated list, --input=main,table,...

PaulWessel commented 3 years ago

"I guess there's a balance between being very specific and not enough." Yes, I agree. However, --input=main,table etc. would not work since GMT modules reading tables can take any number of "main" inputs. But I will consider --secondary-input or similar. I am just doing a survey of all non-common options to look for patterns.

joa-quim commented 3 years ago

I call attention to the fact that long names separated by a minus, e.g. node-file, will not be usable by any wrapper, where that - is interpreted as a subtraction.
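To illustrate the point for a Python wrapper, here is a minimal sketch. The keyword names are taken from this thread, and `to_wrapper_keyword` is a hypothetical helper, not part of PyGMT or GMT.jl. A hyphenated name is not a valid identifier, so it cannot be passed as a keyword argument, whereas an underscore or unseparated variant can.

```python
# Hyphenated long-option names are not valid Python identifiers, so a wrapper
# cannot expose them directly as keyword arguments.
candidates = ["node-file", "node_file", "nodefile"]

usable = {name: name.isidentifier() for name in candidates}
print(usable)


def to_wrapper_keyword(long_name: str) -> str:
    """Hypothetical helper: map a CLI long option to a wrapper-friendly keyword."""
    return long_name.replace("-", "_")


print(to_wrapper_keyword("node-file"))
```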

gd-a commented 3 years ago

I am just doing a survey of all non-common options to look for patterns.

In that case, I guess you're right about having fewer similar keywords across modules as each would be very specific. It could be interesting to know, for each flag, how many modules use it. That could probably help for the survey.

PaulWessel commented 3 years ago

Here are the non-common options that, across all modules, were used exactly as is in 4 or more places (number is how many times found):

  34 -G<outgrid>
  13 -C<cpt>
  11 -G<fill>
   9 -W<pen>
   8 -T<TAG>
   7 -F[+c<clearance(s)>][+g<fill>][+i[[<gap>/]<pen>]][+p[<pen>]][+r[<radius>]][+s[<dx>/<dy>/][<fill>]]
   7 -E<rottable>|<ID1>-<ID2>|<plon>/<plat>/<prot>[+i]
   6 -N<upper_age>
   5 -S<radius>
   5 -I[<intens>]
   5 -H[<scale>]
   5 -G<outfile>
   4 -W[<pen>][<attr>]
   4 -W[+s|w]
   4 -T<time>
   4 -Qe|i
   4 -Ia|c|m|t
   4 -F<flags>
   4 -E<fill>
   4 -D<resolution>[+f]
   4 -A<min_area>[/<min_level>/<max_level>][+a[g|i][s|S]][+r|l][+p<percent>]

As you can see, some high entries have to do with supplements (-T in x2sys, -E in spotter). Many did not make the list because they have slight variations on the theme throughout (-T[<min>/<max>/]<inc>[+i|n] vs -T[<min>/<max>/]<inc>|<file>|<list>[+a][+e|i|n], -C[<cpt>] vs -C<cpt>, etc.). So I will be refining the list to count those variants as the same option even though not all modifiers are valid in all modules. I think, as @joa-quim found and has largely implemented, it is hard to make common keywords beyond the common options, but some are clear (and already treated that way in GMT.jl and presumably PyGMT). The chosen keyword will not always match what @joa-quim's list has, but (a) he can add aliases and (b) it may not matter. The goal is to get the keywords we think best explain the purpose of the options. Here are a few suggestions based on the above list:

--outgrid (and possibly --ingrid). We use -G mostly for output grids but there are places where we need to specify secondary input grids via an option.
--colormap. This is of broad use even though there are a few modules that can take more than one CPT. I spell out colormap rather than Julia's cmap as I want it to be clear (the whole point of long options, no?).
--pen. For generic -W pens, but many modules set several pens and then it gets local (e.g., --trackpen, --wigglepen).
--fill. This is good for modules setting a single item fill, but many set fills for specific items, so then more local names are needed.
--panel. I think this is a good word for the background panels we set in, say, legend via -F. I don't think --box is as descriptive (the Julia choice).
--position. This is one of the Julia aliases, and can be used for placing legends, scales, etc.
--array. I think this will be a good one for defining a 1-D array, such as with -Tmin/max/inc|file|list etc.

Anyway, there are more that can be considered "near-common" options and get a consistent definition across GMT. I will do some more analysis as alluded to above to better count the variants.
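The counting described above, including the refinement of treating slight variants as the same option, could be sketched roughly as follows. This is only an illustration: the `synopses` sample is made up from strings quoted in this thread, and the `normalize` rules are assumptions, not the procedure actually used for the survey.

```python
import re
from collections import Counter

# Hypothetical sample of synopsis strings as they might be harvested from
# module usage messages; real input would come from all GMT modules.
synopses = [
    "-G<outgrid>", "-G<outgrid>", "-C<cpt>", "-C[<cpt>]",
    "-N<trktable>", "-N<table>", "-N<nodefile>",
]


def normalize(synopsis: str) -> str:
    """Collapse trivial variants so they are counted as one option."""
    # Crude rule: ignore whether the argument is optional or required.
    s = synopsis.replace("[", "").replace("]", "")
    # Assumed synonyms: all of these name a table file.
    s = re.sub(r"<(trktable|nodefile)>", "<table>", s)
    return s


exact = Counter(synopses)                           # counted exactly as written
merged = Counter(normalize(s) for s in synopses)    # variants merged

for option, count in merged.most_common():
    print(f"{count:4d} {option}")
```

With exact matching the three -N forms are counted separately; after normalization they collapse into a single -N<table> entry, which is the effect the refined list is after.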

maxrjones commented 3 years ago

--panel for -F: this is already used for the -c common option: https://docs.generic-mapping-tools.org/dev/devdocs/long_options.html. PyGMT uses box, similar to GMT.jl.

--array for -T: PyGMT uses series here while GMT.jl most commonly uses range. While GMT.jl will support multiple aliases, PyGMT does not, so there would need to be a deprecation if there were to be agreement in the names between PyGMT and GMT. Would series be suitable, or do you think array is more descriptive?

PaulWessel commented 3 years ago

I guess we are stuck with box; I don't have to like it.
I can live with --series instead of --array, that will work OK.

joa-quim commented 3 years ago

I don't particularly like box either but for example basemap has -Fbox and I probably got it from there.
PyGMT cannot use range because it's a protected word in Python, but range is a much better word for the purpose than series.
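As a side note on the Python constraint, a quick check with the standard library suggests that range is a built-in name rather than a reserved keyword. A parameter called range would therefore be legal but would shadow the built-in inside the function, which may be the practical reason a wrapper avoids it. The `demo` function below is purely illustrative and is not PyGMT code.

```python
import builtins
import keyword

# "range" is not a reserved keyword, so it is syntactically legal as a
# parameter name ...
is_keyword = keyword.iskeyword("range")
# ... but it is a built-in, so a parameter called "range" shadows it
# inside the function body.
is_builtin = hasattr(builtins, "range")


def demo(range=None):  # legal, but the built-in range() is hidden in here
    return range


print(is_keyword, is_builtin)
```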

gd-a commented 3 years ago

subframe ?

joa-quim commented 3 years ago

border, borderline

PaulWessel commented 3 years ago

Updated list of sorted non-common options after some standardization [I have removed options only common to a supplement]:

  34 -G<outgrid>
  13 -C<cpt>
  12 -T<array>
  11 -G<fill>
   9 -W<pen>
   8 -H[<scale>]
   7 -F[+c<clearance(s)>][+g<fill>][+i[[<gap>/]<pen>]][+p[<pen>]][+r[<radius>]][+s[<dx>/<dy>/][<fill>]]
   6 -W[<pen>]
   5 -S<radius>
   5 -I[<intens>]
   5 -G<outfile>
   4 -W[+s|w]
   4 -F<flags>
   4 -E<fill>
   4 -D<resolution>[+f]
   4 -A<min_area>[/<min_level>/<max_level>][+a[g|i][s|S]][+r|l][+p<percent>]
   3 -W<weight>
   3 -T<time>
   3 -S<header>
   3 -N[c|r]
   3 -N[a|f|m|r|s|<n_columns>/<n_rows>][+a|d|h|l][+e|m|n][+t<width>][+v][+w<suffix>][+z[p]]
   3 -I[<intensgrid>|<value>|<modifiers>]
   3 -G<zlo>/<zhi>
   3 -E<empty>
   3 -D[+x<xname>][+y<yname>][+d<dname>][+s<scale>][+o<offset>][+n<invalid>][+t<title>][+r<remark>][+v<name>]
   3 -D<template>

PaulWessel commented 3 years ago

We need to validate that all GMT short-format options follow this template:

-<short_option>[<short_directives>][+<short_modifiers>[<argument>]]

which is what the translation of

--<long_option>[=[<long_directives>:]<arg>][+<long_modifier1>[=<arg1>]][+<long_modifier2>[=<arg2>]]

will produce. Off the top of my head, the -SE- (degenerate ellipse) option comes to mind as not being compliant and may need a modifier, e.g. -SE+d for degenerate. I am sure there are still others even though I have changed lots of these over the last few years.
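To make the two templates concrete, here is a minimal sketch of such a translation. Everything in `TABLE` is hypothetical: the long names (box, symbol, clearance, degenerate, ...) are placeholders chosen for illustration and are not GMT's actual long-option tables. The sketch also assumes that arguments contain neither "+" nor ":".

```python
# Hypothetical translation table: long option -> short option letter, plus
# per-option directive and modifier translations.
TABLE = {
    "colormap": {"short": "C", "directives": {}, "modifiers": {}},
    "box": {
        "short": "F",
        "directives": {},
        "modifiers": {"clearance": "c", "fill": "g", "pen": "p"},
    },
    "symbol": {
        "short": "S",
        "directives": {"ellipse": "E"},
        "modifiers": {"degenerate": "d"},
    },
}


def long_to_short(option: str) -> str:
    """Translate --long[=[directive:]arg][+modifier[=arg]]... to short form."""
    if not option.startswith("--"):
        raise ValueError("expected a long option starting with '--'")
    head, *mods = option[2:].split("+")      # modifiers follow each '+'
    name, _, value = head.partition("=")     # option name and its value
    spec = TABLE[name]
    out = "-" + spec["short"]
    if ":" in value:                         # directive precedes the argument
        directive, _, value = value.partition(":")
        out += spec["directives"][directive]
    out += value
    for mod in mods:
        key, _, arg = mod.partition("=")     # modifier name and optional arg
        out += "+" + spec["modifiers"][key] + arg
    return out


print(long_to_short("--colormap=viridis"))
print(long_to_short("--box+clearance=2p+fill=white"))
print(long_to_short("--symbol=ellipse:0.5c+degenerate"))
```

An option such as -SE- has no place in this scheme because the trailing "-" is neither a directive, an argument, nor a +modifier, which is why a form like -SE+d would fit the template better.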

maxrjones commented 3 years ago

grdhisteq -N already seems to be long-format: https://docs.generic-mapping-tools.org/dev/grdhisteq.html#n (referring to appending norm rather than a directive or modifier)

PaulWessel commented 3 years ago

Hm, no, bad docs I think. Norm is a value that will be used in the normalization, so the -1/+1 is probably not correct. It may be -norm/+norm but not sure.

GenericMappingTools / gmt

Better standardization of usage options before long-option decisions #5563