Naming Conventions, which one?

szaghi commented 8 years ago

This is probably one of the most personal guidelines.

I have to admit that I have not yet found the one that I love, so I often change my mind, resulting in a somewhat disordered style (that really hurts my soul).

I have read about many nice ideas. I will try to report them here (I need some time, but do not wait for me, start suggesting your conventions) in order to promote our discussion.

From Richard Maine (stated here)

Use verb forms for subroutine names and noun forms for functions, e.g. a function to return some useful thing might be named useful_thing while a subroutine would be better named something like get_useful_thing.
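A minimal sketch of this rule, reusing the names from the quote; the module name, the program name, and the arithmetic are only placeholders for illustration:

```fortran
module maine_naming_sketch
  implicit none
  private
  public :: useful_thing, get_useful_thing
contains
  ! Function: named after the value it returns (noun form).
  pure function useful_thing(x) result(thing)
    real, intent(in) :: x
    real :: thing
    thing = 2.0 * x
  end function useful_thing

  ! Subroutine: named after the action it performs (verb form).
  subroutine get_useful_thing(x, thing)
    real, intent(in)  :: x
    real, intent(out) :: thing
    thing = useful_thing(x)
  end subroutine get_useful_thing
end module maine_naming_sketch

program demo_maine_naming
  use maine_naming_sketch, only: useful_thing, get_useful_thing
  implicit none
  real :: thing
  call get_useful_thing(1.5, thing)
  print *, thing, useful_thing(1.5)
end program demo_maine_naming
```

At the call site the function reads like a value inside an expression, while the subroutine reads like a command.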

From one of the @cmacmackin guides

Names should be descriptive, especially for publicly visible entities. Underscores should be used to separate words if a name is more than two words long. If a name is only two words long then an underscore may be used at your discretion. Sometimes they make the name more readable, sometimes they just make it longer. As with anything else in the code, use lowercase characters only.

For identifiers, avoid using case to indicate different words within them: Fortran is case-insensitive, so the language doesn't recognize the distinction.

submodules could be useful for type/module name disambiguation (and for avoiding recompilation cascades when the implementation changes but the interface does not), but they add verbosity: they are not recommended for small projects.

From @certik best practices

Use lowercase for all Fortran constructs (do, subroutine, module, ...).

as a consequence of adopting lowercase, use underscores rather than camelCase.

the exception is to follow short mathematical notation for mathematical variables/functions (Ylm, Gamma, gamma, Enl, Rnl, ...).

For other names use all lowercase: try to keep names to one or two syllables; if more are required, use underscores to clarify (sortpair, whitechar, meshexp, numstrings, linspace, meshgrid, argsort, spline, spline_interp, spline_interpolate, stoperr, stop_error, meshexp_der).

camelCase for variables comes from Java and I pretty strongly do not recommend it.

From @milancurcic best practices

reserved Fortran words in all uppercase - this is what I have mostly been doing in the past, however I recommend all lowercase for reserved words just as well.

variables lower-case

underscores for multi-word variables

camelCase for procedure names

TYPE :: datetime

  INTEGER :: year        = 1 ! Year                   [1-HUGE(year)]
  INTEGER :: month       = 1 ! Month in year          [1-12]
  INTEGER :: day         = 1 ! Day in month           [1-31]
  INTEGER :: hour        = 0 ! Hour in day            [0-23]
  INTEGER :: minute      = 0 ! Minute in hour         [0-59]
  INTEGER :: second      = 0 ! Second in minute       [0-59]
  INTEGER :: millisecond = 0 ! Milliseconds in second [0-999]

  REAL(KIND=real_dp) :: tz = 0 ! Timezone offset from UTC [hours]

  CONTAINS

  PROCEDURE :: addMilliseconds
  PROCEDURE :: addSeconds
  PROCEDURE :: addMinutes
  PROCEDURE :: addHours
  PROCEDURE :: addDays
  PROCEDURE :: isocalendar
  PROCEDURE :: isoformat
  PROCEDURE :: isValid
  PROCEDURE :: now
  PROCEDURE :: secondsSinceEpoch
  PROCEDURE :: strftime
  PROCEDURE :: tm
  PROCEDURE :: tzOffset
  PROCEDURE :: utc
  PROCEDURE :: weekday
  PROCEDURE :: weekdayLong
  PROCEDURE :: weekdayShort
  PROCEDURE :: yearday

ENDTYPE datetime

From @rouson best practices

prefer all lowercase

prefer underscores over camelCase => add_milliseconds over addMilliseconds

use mixed case in situations where underscores are not allowed, e.g., in defined operators and in the name= field of bind(C) procedures when applicable:
type foo
contains
   procedure :: negate_foo
   generic :: operator(.negateFoo.) => negate_foo
end type
procedure names must have an action verb in them.

logical variables must be named to reflect the state they are used to convey, most with the verb to be, e.g. :

lib_is_initialized vs lib_init;

obj_has_parent vs obj_parent;

do not strive too much for local shortening of variable names; prefer meaningful names: local shortening is often possible with the associate construct.

avoid single character suffix for types and modules, e.g. _t and _m, because:

a single character has the lowest information entropy possible in a language. Probably for this reason, conveying meaning with a single character is bound to confuse anyone who is not already familiar with the convention.

I have no problem with borrowing from other languages, but there just aren’t enough Fortran programmers doing this for it to be a widely recognizable convention.

on the contrary, prefer suffixes like _interface for module names and _implementation for submodule names.

From @victorsndvg best practices

exploit camelCase over underscores;

prefer a complete meaningful action-descriptive name for procedures, even if the resulting name is very very long, e.g. PerformAnActionWithThisVariableOfTypeXXXWhileThisContext

prefix type-bound procedures (TBP) with the type name to allow easy refactoring (it becomes very easy to move a class from one module into a new one without changing the procedure names), e.g.
type :: my_object
contains
  procedure :: action => my_object_PerformAnActionWithThisVariableOfTypeXXXWhileThisContext
end type
From @zbeekman best practices

exploit camelCase for procedures to be able to immediately identify an entity as a procedure.

names must have meaning.

procedure names must have an action verb in them.

named constants (parameters?) are in UPPERCASE.

logical variables must be named to reflect the state they are used to convey, most with the verb to be, e.g. :

lib_is_initialized vs lib_init;

obj_has_parent vs obj_parent;

append _t to derived type names and _m to module names to help namespace: you can have modules with effectively the same name as the class they implement, and one could even declare an object with the same name as the derived type/class if one wanted to do so; also the _t is a common convention in some other languages like C.

From @LadaF best practices

prefer underscores over camelCase, for consistency with the naming convention of the Fortran standard and because names are case-insensitive in Fortran.

structure constructors and other similar functions should not have any verb in them, their name is just what they return, for consistency with the naming of the Fortran standard default and overloaded structure constructors.

avoid module suffixes, except (maybe) for type name disambiguation, e.g.
type(Sphere) :: sphere ! won't work
type(sphere_t) :: sphere ! works
From @tclune best practices

prefer underscores over camelCase, for consistency with the naming convention of the Fortran standard and because names are case-insensitive in Fortran.

however, camelCase could be admissible for module names and type names; in fact, type names are proper nouns in the language, and as such, I at least want to capitalize them; moreover, camelCase helps to accentuate this when the noun is multiple words; but more importantly it often solves type name disambiguation, e.g.
type (SphereShape) :: sphere_shape ! or just sphere
appending _t to derived type names for disambiguation could help, but it could be considered superfluous; my usual pattern is to define one “class” per Fortran module and to use the prefix pkg_ (in addition to the suffix _mod) for all modules in the package, e.g.
module pkg_SphereShape_mod
...
type :: SphereShape

milancurcic commented 8 years ago

For quite a few years I have personally preferred the style used in the Fortran standards - reserved Fortran words in all uppercase, variables lower-case, underscores for multi-word variables, camelCase for procedure names, for example:

TYPE :: datetime

  INTEGER :: year        = 1 ! Year                   [1-HUGE(year)]
  INTEGER :: month       = 1 ! Month in year          [1-12]
  INTEGER :: day         = 1 ! Day in month           [1-31]
  INTEGER :: hour        = 0 ! Hour in day            [0-23]
  INTEGER :: minute      = 0 ! Minute in hour         [0-59]
  INTEGER :: second      = 0 ! Second in minute       [0-59]
  INTEGER :: millisecond = 0 ! Milliseconds in second [0-999]

  REAL(KIND=real_dp) :: tz = 0 ! Timezone offset from UTC [hours]

  CONTAINS

  PROCEDURE :: addMilliseconds
  PROCEDURE :: addSeconds
  PROCEDURE :: addMinutes
  PROCEDURE :: addHours
  PROCEDURE :: addDays
  PROCEDURE :: isocalendar
  PROCEDURE :: isoformat
  PROCEDURE :: isValid
  PROCEDURE :: now
  PROCEDURE :: secondsSinceEpoch
  PROCEDURE :: strftime
  PROCEDURE :: tm
  PROCEDURE :: tzOffset
  PROCEDURE :: utc
  PROCEDURE :: weekday
  PROCEDURE :: weekdayLong
  PROCEDURE :: weekdayShort
  PROCEDURE :: yearday

ENDTYPE datetime

I thought this one was quite nice because it is easy to distinguish Fortran constructs from variables, making them overall more visible to the eye. However, over the past year or two during which I've been reading a lot of other people's code written mostly in lower-case, my eyes and brain got re-trained to read lowercase more easily, and I have found myself writing mostly lowercase for Fortran constructs and variables, while still employing camelCase where appropriate.

For the same reason I do not prefer all uppercase code, I also do not prefer exclusively lowercase code. Fortran syntax is not case-sensitive, and by intelligently employing uppercase characters we can improve the readability of the code.

rouson commented 8 years ago

I started out with upper case for keywords, but eventually concluded that

(1) It’s unnecessary if one is using an editor or IDE with syntax-aware coloring. I get great coloring of keywords in vim and different types of keywords have different colors so there is more information than in the binary choice of lower versus upper case. And all the colors are pleasing to my eyes. :)

(2) All caps (even just for keywords) makes Fortran look very old-fashioned and the language already has such a reputation so I see no reason to reinforce it.

I think many people will find underscores more quickly readable than mixed case. I therefore prefer add_milliseconds over addMilliseconds.

I use mixed case in situations where underscores are not allowed, e.g., in defined operators:

type foo
contains
   procedure :: negate_foo
   generic :: operator(.negateFoo.) => negate_foo
end type

and of course I use mixed case in the “name=“ field of bind(C) procedures when applicable. Otherwise, lower case + underscores mostly serves my needs and has some readability benefits.
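A complete, compilable variant of the two cases above might look like the following sketch. The component `value`, the module and program names, and the `add_milliseconds` function with its `addMilliseconds` binding label are assumptions made for illustration only:

```fortran
module foo_module
  use, intrinsic :: iso_c_binding, only: c_int
  implicit none
  private
  public :: foo, add_milliseconds

  type :: foo
    integer :: value = 0
  contains
    procedure :: negate_foo
    ! Defined-operator names may contain letters only, so mixed case
    ! stands in for the underscore that is not allowed here.
    generic :: operator(.negateFoo.) => negate_foo
  end type foo

contains

  pure function negate_foo(self) result(negated)
    class(foo), intent(in) :: self
    type(foo) :: negated
    negated%value = -self%value
  end function negate_foo

  ! The binding label is what C code sees, and C is case-sensitive,
  ! so mixed case in name= is meaningful there.
  function add_milliseconds(ms, delta) result(total) bind(C, name="addMilliseconds")
    integer(c_int), value :: ms, delta
    integer(c_int) :: total
    total = ms + delta
  end function add_milliseconds

end module foo_module

program demo_mixed_case
  use foo_module, only: foo, add_milliseconds
  implicit none
  type(foo) :: a, b
  a = foo(3)
  b = .negateFoo. a
  print *, b%value, add_milliseconds(1, 2)
end program demo_mixed_case
```

On the Fortran side the procedure keeps its lowercase, underscored name; only the operator name and the C binding label use mixed case.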

Damian Rouson, Ph.D., P.E. President, Sourcery Institute http://www.sourceryinstitute.org +1-510-600-2992 (mobile)


rouson commented 8 years ago

On Jan 11, 2016, at 9:19 PM, Stefano Zaghi notifications@github.com wrote:

From one of the @cmacmackin (https://github.com/cmacmackin) guides (https://github.com/Fortran-FOSS-Programmers/FIAT/wiki/Style-Guidelines#naming-conventions):

Names should be descriptive, especially for publicly visible entities. Underscores should be used to separate words if a name is more than two words long. If a name is only two words long then an underscore may be used at your discretion. Sometimes they make the name more readable, sometimes they just make it longer.

I don’t think longer is inherently a bad thing and in fact I’d say longer is usually good if it adds meaning. Local shortening is often possible with the ASSOCIATE construct. (The one place where I often use all caps is in emails because I don’t know of an email client that does syntax-aware coloring.)
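As a small sketch of local shortening with the associate construct, long descriptive names can be declared once and abbreviated only where a formula needs it. The variable names and values below are made up for illustration:

```fortran
program demo_associate
  implicit none
  real :: free_stream_velocity, kinematic_viscosity, plate_length
  real :: reynolds_number

  free_stream_velocity = 10.0
  kinematic_viscosity  = 1.5e-5
  plate_length         = 0.5

  ! Short aliases exist only inside the construct; the descriptive
  ! names remain the ones visible everywhere else.
  associate (u => free_stream_velocity, nu => kinematic_viscosity, l => plate_length)
    reynolds_number = u * l / nu
  end associate

  print *, reynolds_number
end program demo_associate
```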

Damian

milancurcic commented 8 years ago

Thank you Damian for all the great input. An example in which I believe a different style for procedures and variables may be helpful is when a reader does not know whether something like:

lb = field % lower_bound(n)

is a function call or a reference to an array element, due to these two operations having the same syntax.

Of course, we could encapsulate all the derived-type data and then it would have to be a function call! :)
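To make the ambiguity concrete, here is a sketch with two hypothetical types (`field_with_data` and `field_with_getter` are invented names) in which the same expression is an array element reference in one case and a function call in the other:

```fortran
module field_sketch
  implicit none
  private
  public :: field_with_data, field_with_getter

  ! Variant 1: lower_bound is a public array component.
  type :: field_with_data
    integer :: lower_bound(3) = [1, 1, 1]
  end type field_with_data

  ! Variant 2: lower_bound is a type-bound function over private data.
  type :: field_with_getter
    private
    integer :: lb(3) = [1, 1, 1]
  contains
    procedure :: lower_bound
  end type field_with_getter

contains

  pure function lower_bound(self, n) result(bound)
    class(field_with_getter), intent(in) :: self
    integer, intent(in) :: n
    integer :: bound
    bound = self%lb(n)
  end function lower_bound

end module field_sketch

program demo_same_syntax
  use field_sketch, only: field_with_data, field_with_getter
  implicit none
  type(field_with_data)   :: field_a
  type(field_with_getter) :: field_b
  integer :: lb

  lb = field_a % lower_bound(2)  ! array element reference
  print *, lb
  lb = field_b % lower_bound(2)  ! function call, identical syntax
  print *, lb
end program demo_same_syntax
```

Nothing at the call site tells the two apart, which is the case where a distinct naming style for procedures would carry information.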

szaghi commented 8 years ago

@rouson

I use mixed case in situations where underscores are not allowed, e.g., in defined operators:
type foo
contains
   procedure :: negate_foo
   generic :: operator(.negateFoo.) => negate_foo
end type
WoW! I was not aware of such limitation! Thank you very much!

@milancurcic I agree with Damian about the uppercase convention: ever since I have had an editor with syntax highlighting (obviously VIM, what else? :smile: ) I have preferred to avoid uppercase. Admittedly, my motivations are different from Damian's: I am quite lazy, so if I can avoid pressing shift or caps lock I follow the energy-saving way.

Anyhow, I am going to update the first post of this issue with your suggestion.

szaghi commented 8 years ago

@milancurcic

An example in which I believe a different style for procedures and variables may be helpful is when a reader does not know whether something like:
lb = field % lower_bound(n)
is a function call or a reference to an array element, due to these two operations having the same syntax.

Wonderful example! There should be some suggestions on that in the comp.lang.fortran (CLF) Google group; I will try to dig into it.

cmacmackin commented 8 years ago

My two cents (knowing that I would appear to have been over-ruled in at least one case, which is fine): I don't like using uppercase letters at all in Fortran. In language keywords this is simply because, like Stefano, I'm too lazy to bother holding down the shift key (plus it makes my little finger sore if I do it for too long). For identifiers, there is something I really dislike about using case to indicate different words within them when Fortran is case-insensitive. It just strikes me as untidy to rely on something which the language doesn't recognize. Just a little quirk of mine.


Chris MacMackin cmacmackin.github.io http://cmacmackin.github.io

victorsndvg commented 8 years ago

Hi all,

I was reading the comments of this issue and I'm going to write my opinion in the following lines. Hope to be helpful.

(Sorry in advance for my English)

I agree with the previous post of @milancurcic: for me, camelCase for procedures is preferable. Procedure names must be meaningful actions (Robert C. Martin, Clean Code :+1: ), and sometimes I end up with very long names to express the intention and context of a procedure.

If I use underscores I end up with procedure names like:

subroutine perform_an_action_with_this_variable_of_type_XXX_while_this_context
end subroutine

Instead of:

subroutine PerformAnActionWithThisVariableOfTypeXXXWhileThisContext
end subroutine

Sometimes saving a single character is enough to fit a more expressive procedure name and avoid the following error:

Error: Name at (1) is too long
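For context, Fortran 2003 and later limit names to 63 characters. The sketch below simply measures the two unprefixed spellings from above against that limit; the program and constant names are hypothetical:

```fortran
program demo_name_length
  implicit none
  ! Maximum name length in Fortran 2003 and later.
  integer, parameter :: max_name_length = 63
  character(len=*), parameter :: underscored = &
    "perform_an_action_with_this_variable_of_type_XXX_while_this_context"
  character(len=*), parameter :: camel_cased = &
    "PerformAnActionWithThisVariableOfTypeXXXWhileThisContext"

  ! Print each length and whether it fits within the limit.
  print *, "underscores:", len(underscored), len(underscored) <= max_name_length
  print *, "camelCase:  ", len(camel_cased), len(camel_cased) <= max_name_length
end program demo_name_length
```

The same check can be applied to any candidate name before it reaches the compiler, including names that carry a type-name prefix.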

I also think that a good practice when using TBPs is to put the name of the derived data type as a prefix of the procedure name: it makes it easy to move code from one file to another without changing the procedure names, to refactor/rename, to copy/paste, etc. If I use underscores for TBPs I end up with procedure names like:

type :: my_object
contains
  procedure :: action => my_object_perform_an_action_with_this_variable_of_Type_XXX_while_this_context
end type
subroutine my_object_perform_an_action_with_this_variable_of_Type_XXX_while_this_context
end subroutine

Instead of:

type :: my_object
contains
  procedure :: action => my_object_PerformAnActionWithThisVariableOfTypeXXXWhileThisContext
end type
subroutine my_object_PerformAnActionWithThisVariableOfTypeXXXWhileThisContext
end subroutine

I think that both ways are readable, but underscores don't add any meaning to procedure names and ... I need every character! :laughing:

zbeekman commented 8 years ago

I agree with @victorsndvg that it is really important to ensure that the procedures that TBPs resolve to have very descriptive and informative names, although I think there is some movement towards smarter IDEs/editors knowing about TBPs. I think @rosenbrockc has done some work in this area, at least for Emacs.

I will just weigh in with my personal preference, from what I can remember/think of off the top of my head

I like camelCase for procedures. It is nice, especially for functions, to be able to immediately identify an entity as a procedure. I know it is a little bit crazy to use capitalization to convey info in the source when the language itself is blind to capitalization differences (as @cmacmackin and others have pointed out), but I've seen this convention for functions used in other languages, and like it.
Names must have meaning--I think we're all on the same page here. No sense shortening a name to save typing if it no longer carries meaning.
Procedure names MUST have an action verb in them
Named constants are in UPPERCASE. Again, I know this is somewhat bananas, when the language itself cannot differentiate between lower and upper case, but I find it nice to convey information that these variables are defined where they are declared, and are read only.
Logical variables must be named to reflect the state they are used to convey, most with the verb "to be":
- lib_is_initialized vs something like lib_init
- obj_has_parent vs obj_parent
Append _t to derived type names and _m to module names. This helps "namespace" these items so that you can have modules with effectively the same name as the class they implement, and one could even declare an object with the same name as the derived type/class if one wanted to do so. Also the _t is a common convention in some other languages like C.
I'm sure there are others, but this is all that comes to mind now.
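Put together, these conventions might look like the following sketch. The module `sphere_m`, the type `sphere_t`, and the function `computeVolume` are invented names chosen to match the rules listed above:

```fortran
module sphere_m
  implicit none
  private
  public :: sphere_t, PI, computeVolume

  ! Named constant in UPPERCASE: defined where declared, read-only.
  real, parameter :: PI = 3.1415927

  ! The _t suffix frees the bare name "sphere" for variables.
  type :: sphere_t
    real :: radius = 1.0
    logical :: has_parent = .false.   ! logical named after the state it conveys
  end type sphere_t

contains

  ! camelCase with an action verb marks this as a procedure.
  pure function computeVolume(sphere) result(volume)
    type(sphere_t), intent(in) :: sphere
    real :: volume
    volume = 4.0 / 3.0 * PI * sphere%radius**3
  end function computeVolume

end module sphere_m

program demo_suffixes
  use sphere_m, only: sphere_t, computeVolume
  implicit none
  type(sphere_t) :: sphere
  sphere%radius = 2.0
  print *, computeVolume(sphere)
end program demo_suffixes
```

Since the compiler ignores case, `PI` and `computeVolume` are conventions for the reader only; `pi` and `computevolume` would refer to the same entities.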

Also, I just want to say, that I am not locked into these conventions, and would be more than happy to adapt to a sane alternative. While I have strong feelings (currently) that naming things following the convention I outlined above is doing it "the right way"^:tm: I am not so stubborn as to not be willing to change my mind or adopt a common practice for the sake of collaboration.

LadaF commented 8 years ago

Hi, I don't believe there can be any definitive true naming convention for Fortran. Consistency within a project is more important.

My 2 cents:

Basic scheme: underscores and not camelCase. Reason: consistency with the Fortran standard naming convention and the case insensitivity of names in Fortran.
Structure constructors and other similar function should not have any verb in them. Their name is just what they return. Reason: Consistency with the Fortran standard default and overloaded structure constructors naming. This is similar to conventions in other languages (see "If the function returns the property of its first argument, omit the verb." in Cocoa guidelines).
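A sketch of what a verb-free constructor name can look like: a generic interface that shares the type's name and overloads the default structure constructor. The type `point` and the specific function `point_from_array` are hypothetical, and the array argument is chosen so that the overload cannot be confused with the default constructor:

```fortran
module point_module
  implicit none
  private
  public :: point

  type :: point
    real :: x = 0.0
    real :: y = 0.0
  end type point

  ! The generic shares the type's name, so callers write point(...)
  ! for both the default and the overloaded constructor.
  interface point
    module procedure point_from_array
  end interface point

contains

  pure function point_from_array(coords) result(new_point)
    real, intent(in) :: coords(2)
    type(point) :: new_point
    new_point%x = coords(1)
    new_point%y = coords(2)
  end function point_from_array

end module point_module

program demo_constructor_names
  use point_module, only: point
  implicit none
  type(point) :: p, q
  p = point(1.0, 2.0)      ! default structure constructor
  q = point([3.0, 4.0])    ! overloaded constructor, same noun-like name
  print *, p%x, p%y, q%x, q%y
end program demo_constructor_names
```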

I do not like any suffixes for module names. One could argue about their usefulness for derived type names. The reason is again the case insensitivity.

While in Java and C++ the correct way is:

  Sphere sphere;

in Fortran

  type(Sphere) sphere

won't work. For this reason I can imagine using the _t suffix although I don't use it myself and rather come up with a more descriptive name for the variable.

tclune commented 8 years ago

On Jan 12, 2016, at 9:40 AM, LadaF notifications@github.com wrote:

Hi, I don't believe there can be any definitive true naming convention for Fortran. Consistency within a project is more important.

My 2 cents:

Basic scheme: underscores and not camelCase. Reason: consistency with the Fortran standard naming convention and the case insensitivity of names in Fortran.

I would like to emphasize this reasoning. About 10 years ago, I fell under the sway of a well-meaning software engineer that helped my group develop coding standards. He pushed hard for CamelCase, and I accepted it and even learned to like it. (My open source projects even get the occasional thumbs up for this choice.) But I’ve started to reverse this in my recent coding. It cannot be enforced in Fortran, and others have cited studies that show underscore improves legibility. (I don’t have the references, sorry.)

I’ve decided to keep CamelCase for module names and type names. At the very least this limited the amount of file renaming that I had to do to go with the style change, but it also has a certain logic. Type names are “proper nouns” in the language, and as such, I at least want to capitalize them. I find that CamelCase helps to accentuate this when the noun is multiple words. But more importantly it often solves another issue that is mentioned down below.

Structure constructors and other similar function should not have any verb in them. Their name is just what they return. Reason: Consistency with the Fortran standard default and overloaded structure constructors naming. This is similar to conventions in other languages (see "If the function returns the property of its first argument, omit the verb." in Cocoa guidelines https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/CodingGuidelines/Articles/NamingFunctions.html#//apple_ref/doc/uid/20001283-BAJGGCAD).

I do not like any suffixes for module names. One could argue about their usefulness for derived type names. The reason is again the case insensitivity.

While in Java and C++ the correct way is:

  Sphere sphere;

in Fortran

  type(Sphere) sphere

won't work. For this reason I can imagine using the _t suffix although I don't use it myself and rather come up with a more descriptive name for the variable.

Yes. This is particularly annoying. If my type is a single word, I’m forced to use an abbreviation (sphr) or something like “a_sphere” for the variable name. But with my camel case convention for type names and most types having multiple words, much of my code looks like:

type (SphereShape) :: sphere_shape ! or just sphere

Of course I often find that if I want my variable name to have the same name as the type (class), then I may not have given enough thought to how one or the other ought to actually be named. They are different categories of entities after all. But it seems largely unavoidable in short examples and such. Cheers,

Tom

— Reply to this email directly or view it on GitHub https://github.com/Fortran-FOSS-Programmers/Best_Practices/issues/5#issuecomment-170932003.

Thomas Clune, Ph. D. Thomas.L.Clune@nasa.gov Software Infrastructure Team Lead Global Modeling and Assimilation Office, Code 610.1 NASA GSFC
MS 610.1 B33-C128 Greenbelt, MD 20771 301-286-4635

certik commented 8 years ago

The guidelines I wrote up that @szaghi quoted above seem to be the common denominator.

As @LadaF, @tclune and others stressed, one should use lower case. That implies underscores, except for short names of perhaps 2 words. The exceptions are mathematical symbols, perhaps an array variable A or spherical harmonics Ylm. Besides Fortran I also code a lot in C++ and Python, and they both essentially follow the same convention, except for classes, where SphereShape is usually recommended (definitely in Python, though in C++ people sometimes use lower case), and so I think that would be fine for Fortran classes/types.

The addMilliseconds style for variables comes from Java and I pretty strongly do not recommend it.

rouson commented 8 years ago

On Jan 12, 2016, at 4:12 AM, Izaak Beekman notifications@github.com wrote:

Names must have meaning--I think we're all on the same page here. No sense shortening a name to save typing if it no longer carries meaning.
Procedure names MUST have an action verb in them

I like this a lot. I just looked back through the codes I’ve written most recently (bash installation scripts) and see that I must have intuitively applied this rule without realizing it.

I would add that it seems natural for result names to have a noun in them, as in the following:

function construct_foo_from_nothing() result(new_foo)
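Expanded into a compilable sketch, with a hypothetical type `foo`, module name, and placeholder value, the verb/noun pairing reads like this:

```fortran
module foo_construction
  implicit none
  private
  public :: foo, construct_foo_from_nothing

  type :: foo
    integer :: value = 0
  end type foo

contains

  ! Verb in the procedure name, noun in the result name.
  function construct_foo_from_nothing() result(new_foo)
    type(foo) :: new_foo
    new_foo%value = 42   ! placeholder initial state
  end function construct_foo_from_nothing

end module foo_construction

program demo_result_names
  use foo_construction, only: foo, construct_foo_from_nothing
  implicit none
  type(foo) :: my_foo
  my_foo = construct_foo_from_nothing()
  print *, my_foo%value
end program demo_result_names
```

Inside the function body every assignment targets `new_foo`, so the code reads as building a thing rather than assigning to an action.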

Named constants are in UPPERCASE. Again, I know this is somewhat bananas, when the language itself cannot differentiate between lower and upper case, but I find it nice to convey information that these variables are defined where they are declared, and are read only.

I tend not to do this myself, but I don’t mind when others do it so I’m neutral on this one.

Logical variables must be named to reflect the state they are used to convey, most with the verb "to be":
- lib_is_initialized vs something like lib_init
- obj_has_parent vs obj_parent

I love this and again I think (hope) I’ve applied this intuitively without codifying it in a general rule.

Append _t to derived type names and _m to module names. This helps "namespace" these items so that you can have modules with effectively the same name as the class they implement, and one could even declare an object with the same name as the derived type/class if one wanted to do so. Also the _t is a common convention in some other languages like C.

I’m not a huge fan of this approach for several reasons:

A single character has the lowest information entropy possible in a language. Probably for this reason, conveying meaning with a single character is bound to confuse anyone who is not already familiar with the convention.

I have no problem with borrowing from other languages, but there just aren’t enough Fortran programmers doing this for it to be a widely recognizable convention.

Damian

tclune commented 8 years ago

I meant to add earlier in this thread, that I too used to use the “t” suffix. Also flirted a bit with prefix “t”. I’ve ultimately found them superfluous, though I admit that for short type names it does help disambiguate from a variable name. (So avoid short names …)

I do however want my module to have the same name as the derived type that is defined in the module. My usual pattern is to define one “class” per Fortran module. But the language does not allow the module to have the same name as the type it provides. Here is the solution I currently use:

module pkg_SphereShape_mod
…
type :: SphereShape
…

The prefix “pkg_” is used for all modules in the package. This helps to prevent namespace collisions when the package is brought into another project. I find it unnecessary to burden the type name itself with the pkg.
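Filled in as a compilable sketch, the pattern could look as follows. The `radius` component and the demo program are assumptions; only the module and type names come from the post:

```fortran
module pkg_SphereShape_mod
  implicit none
  private
  public :: SphereShape

  ! One "class" per module; the type keeps the unprefixed name.
  type :: SphereShape
    real :: radius = 1.0
  end type SphereShape
end module pkg_SphereShape_mod

program demo_pkg_prefix
  use pkg_SphereShape_mod, only: SphereShape
  implicit none
  ! The underscore, not the capitalization, is what makes the variable
  ! name differ from the type name for the compiler.
  type(SphereShape) :: sphere_shape
  sphere_shape%radius = 2.0
  print *, sphere_shape%radius
end program demo_pkg_prefix
```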

Unfortunately I’ve not yet back propagated this approach to pFUnit where it would probably help there more than most packages. But it is on my todo list.

I could probably drop the “_mod” suffix now that I’ve adopted the pkg prefix convention. But I’ve used “_mod” for so long that I just don’t feel much pressure to make the change.

Tom


Thomas Clune, Ph. D. Thomas.L.Clune@nasa.gov Software Infrastructure Team Lead Global Modeling and Assimilation Office, Code 610.1 NASA GSFC
MS 610.1 B33-C128 Greenbelt, MD 20771 301-286-4635

rouson commented 8 years ago

On the issue of needing to distinguish module names from the names of other entities (notably types), wider access to submodule support recently inspired me to adopt a convention of appending _interface to module names and _implementation to submodule names. So far, I've only had one submodule per module and haven't thought through what I'll do when multiple submodules support a module.
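A sketch of the _interface/_implementation pairing, assuming a hypothetical `sphere` type with a single type-bound function; it needs a compiler with Fortran 2008 submodule support:

```fortran
module sphere_interface
  implicit none
  private
  public :: sphere

  type :: sphere
    real :: radius = 1.0
  contains
    procedure :: volume
  end type sphere

  ! Only the interface lives in the module; code that uses the module
  ! depends on this part alone.
  interface
    module function volume(self) result(sphere_volume)
      class(sphere), intent(in) :: self
      real :: sphere_volume
    end function volume
  end interface

end module sphere_interface

submodule (sphere_interface) sphere_implementation
  implicit none
contains
  ! The body can change without forcing users of the module to recompile.
  module procedure volume
    real, parameter :: pi = 3.1415927
    sphere_volume = 4.0 / 3.0 * pi * self%radius**3
  end procedure volume
end submodule sphere_implementation

program demo_submodule_names
  use sphere_interface, only: sphere
  implicit none
  type(sphere) :: ball
  ball%radius = 2.0
  print *, ball%volume()
end program demo_submodule_names
```

With this suffix scheme the bare name `sphere` stays free for the type, which is the disambiguation the convention is after.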

FYI, for anyone who hasn't tried them yet, I highly recommend working submodules into your coding. The Fortran world is long overdue in addressing compilation cascades.

Because mentioning bleeding edge features invariably leads to discussions of availability and because I believe most people are unaware of how widely available some of the newest features are, I'll summarize the state of compilers here from memory. Please feel free to ignore the rest of this post if the status of compilers is not of interest.

I'm pretty certain that submodules are supported by recent versions of the following compilers:

Cray (a Fortran 2008 compiler),
Intel (so close to Fortran 2008 compliance that most people wouldn't notice what's missing)
IBM (essentially a Fortran 2008 compiler minus coarrays)
GNU (essentially a Fortran 2008 compiler minus PDT* and DTIO**)

And just for completeness, each of the above compilers supports the Fortran 2015 features for further interoperability with C, and the Cray and GNU compilers support several additional Fortran 2015 parallelism features.

If anyone is interested in free access to submodules, but doesn't want to deal with the hassle of building gfortran 6.0.0 from source, it should in general be possible to download the OpenCoarrays build script, make it executable, and simply type

./build gcc trunk

or

./build gcc trunk /your/desired/installation/path

If you go with the latter approach, you will later be prompted for a password if the chosen path requires administrator privileges.

Damian

*parameterized derived types
**derived type input/output

szaghi commented 8 years ago

Dear all,

I would like just to say thank you for your great help, it is very appreciated.

Maybe it was not clear, but I am trying to summarize the best of your contributions in my leading post: as you add your suggestions/opinions, I update my original summary. Thus, please check my summary and, in the (probable) case I have made mistakes, advise me.

Cheers.

P.S. @rouson your post on compilers & co. is very appreciated, it will be grabbed and inserted into a dedicated issue as the other on CAF.

rouson commented 8 years ago

As a follow up to my last post, I'm going to go way out on a limb here and make a statement that is intentionally provocative:

I feel sufficiently strongly about submodules that I would include it as a requirement for any new Fortran development project to be considered "modern". Of course, I'm not suggesting that big, existing projects must be refactored, but everyone knows I'm big on design patterns and one of the two central mantras in the book that launched the field of object-oriented design patterns was "Program to an interface, not an implementation." There are several ways to follow that advice, but one way is to put only public interface information in the module and then put all private implementation details in submodules.

I can't think of many strong reasons not to make the above practice universal in new projects except possibly compiler support, but I would venture a wild guess that one or more of the four aforementioned compilers is the production compiler for 80% of Fortran programmers. Most of the remaining 20% are probably using the Portland Group compiler for its support of CUDA Fortran, especially on platforms such as Titan. If that's the case, I say speak loudly to Portland Group about working on Fortran 2008 support. They haven't been hearing much demand from users even though I'm certain there are many people who want Fortran 2008 features.

With that said, we could use a few compiler folks in this discussion.

szaghi commented 8 years ago

@rouson

I feel sufficiently strongly about submodules that I would include it as a requirement for any new Fortran development project to be considered "modern".

I agree that it is a good guideline. However, I would like to dedicate a separate thread to it; here I prefer to discuss naming conventions. By the way, feel free to open any new issue you (meaning all of you, this English ambiguity puzzles me) would like to discuss. But, just because you mentioned it :smile: , I would like to say that the reasons why I have not yet adopted it as a mantra are

my ignorance of how they work (especially compilation cascades);
worries about compiler support.

You (meaning Damian...) solved the latter; I have to study to solve the former.

tclune commented 8 years ago

Damian,

Could you post a pedagogical submodule example for those of us that would rather learn that way than by reading a book?

Tom



rouson commented 8 years ago

@szaghi Possibly I should have made more clear how this relates to the naming convention. I believe that many of us have grappled with how to name our modules to distinguish them from entities inside the module. Submodules inspired a naming convention that eliminates this dilemma. I'm no longer tempted to write

module foo
   type foo ! disallowed due to name conflict
   end type
end module

because there is no longer a one-to-one relationship between the type and the scope that encapsulates the type. Now the type definition (which is part of the type's public interface) is in one scope (the module), whereas the type implementation is in another so there no longer remains a temptation to name the module something that the language doesn't allow. I hope this makes the connection to this thread more clear.

szaghi commented 8 years ago

@rouson :+1: wow, I had not considered this point of view!

szaghi commented 8 years ago

@rouson sorry, now you confuse me... type definition/implementation where? Please consider adding at least a very small example :pray:

rouson commented 8 years ago

The following works with gfortran 6.0.0:

module speaker_interface
  implicit none

  type speaker
  contains
    procedure, nopass :: speak
  end type

  interface
    module subroutine speak 
    end subroutine
  end interface
end module

submodule (speaker_interface) speaker_implementation
contains

  ! The "module" prefix ties this body to the interface declared in speaker_interface
  module subroutine speak
    print *,"Hello, world!"
  end subroutine

end submodule

program main
  use speaker_interface
  implicit none
  type(speaker) :: greeter

  call greeter%speak
end program

The naming convention makes clear that it's possible to change the implementation without having to recompile the main program because the main program only depends on the interface. This is what prevents compilation cascades where changing the internals of a procedure forces the recompilation of everything that depends on that procedure even if such recompilation is unnecessary because the only thing the dependent procedure really cares about is the interface information (for linking purposes), which might not have changed.

For many moons, I have been meaning to write a blog. I think I should make this the first one, given that I'm proposing that this become a ubiquitous practice.

szaghi commented 8 years ago

@rouson thanks a lot!

Two things:

in the case where subroutine speak has a passed-object argument, should the submodule use the module? I guess it should not, because it is declared in the submodule of the module; if I guess right, the compilation cascade is submodule => module => program from lowest to highest, right?
make a blog, make a blog, make a blog! :pray:

szaghi commented 8 years ago

Ok I think I am wrong about compilation cascade... it should be

module => submodule
module => program

Right?

rouson commented 8 years ago

I'm not sure I understand your notation. Let's say "=>" means "explicitly references". Then

program => module
submodule => module

The program explicitly references the module and needs to be recompiled only if the module changes. Let's take an example. I believe it communicates more to the reader to explicitly state the unit number rather than an asterisk, so now I'll edit the subroutine implementation to read the following:

  module subroutine speak
    use iso_fortran_env, only : output_unit
    write(output_unit,*) "Hello, world!"
  end subroutine

Because I have not changed the corresponding interface body (which is inside the interface block inside the module), I don't need to recompile the main program because it doesn't explicitly reference the submodule. The main program only references the module.

I should say there is also an implicit assumption that the module and its submodule(s) are in separate files. At least if one is using the make utility to build code, there would be no way of avoiding passing both the module and submodule to the compiler if they are in the same file. And I doubt a compiler exists that would figure out that the module doesn't need to be recompiled (e.g., by checking the corresponding .mod file) if the module and submodule are in the same file. The module and submodule will need to be in separate files to avoid unnecessary recompilation.
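On the earlier question about a passed-object argument, here is a minimal sketch. The counter type and every name in it are hypothetical and not from this thread. It assumes the module and the submodule would live in separate files as described above; they appear together here only for brevity. The submodule has no use statement for its parent because it reaches the parent's entities, including the type and its private component, through host association.

```fortran
! Would live in its own file, e.g. a file holding only the interface
module counter_interface
  implicit none
  private
  public :: counter

  type counter
    private
    integer :: count_ = 0
  contains
    procedure :: increment
    procedure :: current
  end type

  interface
    module subroutine increment(self)
      class(counter), intent(inout) :: self   ! passed-object dummy argument
    end subroutine

    module function current(self) result(n)
      class(counter), intent(in) :: self
      integer :: n
    end function
  end interface
end module

! Would live in a second file; editing it leaves the module untouched
submodule (counter_interface) counter_implementation
  implicit none
contains

  module subroutine increment(self)
    class(counter), intent(inout) :: self
    self%count_ = self%count_ + 1
  end subroutine

  module function current(self) result(n)
    class(counter), intent(in) :: self
    integer :: n
    n = self%count_
  end function

end submodule

program main
  use counter_interface, only : counter
  implicit none
  type(counter) :: hits

  call hits%increment()
  call hits%increment()
  print *, "hits so far:", hits%current()
end program
```

In a build, the program would depend only on the file holding counter_interface, so a change confined to counter_implementation should require recompiling the submodule and relinking, nothing more.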

muellermichel commented 8 years ago

@rouson I'd just like to mention that PGI is today the dedicated Fortran compiler with Nvidia GPU support - both for CUDA Fortran and OpenACC there is AFAIK still no viable alternative. It could become even more widespread now that it is free to use on one node and can be downloaded directly from Nvidia. My point is, I'd advocate against using language features for now that are not supported by relatively widespread compilers that cannot be replaced for their use case. IMO we should work toward Fortran code being as portable as possible on today's common HPC architectures, and GPUs are one of them. I fully agree that Fortran 2008 features should be demanded more from PGI and I will address this in their support forum so it gets on their radar.

rouson commented 8 years ago

@muellermichel Thanks for your input and especially thanks for agreeing to address this in their support forum. It would be great if other PGI users would do the same. The Fortran community is a very quiet community in some respects -- compiler vendors frequently comment to me that they aren't hearing requests for the new standard features and yet I know from the classes that I teach that people are very interested in the new features (although I'm sure there's a self-selection process wherein those who are interested in the new features take my classes).

muellermichel commented 8 years ago

@rouson Would it make sense to update the following table: http://fortranwiki.org/fortran/show/Fortran+2008+status ? It shows submodule support only for Cray.

muellermichel commented 8 years ago

I just opened this thread on the PGI forum.

rouson commented 8 years ago

@muellermichel Yes. In fact, it would be very helpful to the community if this could be updated. In a proposal written last year, my collaborators and I encountered a reviewer who seemed unaware of the progress gfortran has made and it negatively impacted the review because it seemed that we were talking about features that weren't yet available even though they actually were.

The listed gfortran version 4.8 is now roughly 2.5 years old and lots of great things have happened since then. I won't have time to scrutinize the list but I recommend checking out the gfortran Fortran 2003 status and Fortran 2008 status pages. I think these pages are pretty up-to-date, but you might also email the gfortran developers mailing list at fortran@gcc.gnu.org.

For other compilers, I recommend consulting the aforementioned survey that Ian Chivers and Jane Sleightholme publish in nearly every issue of ACM Fortran Forum. The latest version was just published last month. If you don't have access to ACM Fortran Forum, I recommend contacting Ian and Jane. I think they can be reached via the email address listed on the FortranPlus web site.

rouson commented 8 years ago

@muellermichel, you rock!

muellermichel commented 8 years ago

@rouson Thanks, please also see the response by Mat Colgrove and my reply. I think it would make sense if you and maybe others could chime in with your opinions on this, i.e. why this feature is important to you.

nncarlson commented 8 years ago

This discussion is moving way too fast for me to keep up with it, let alone contribute in a timely manner. But there is one thing that caught my eye that I feel compelled to push back on, and that is the suggestion that gfortran is pretty close to being a full 2008 compiler. It may look like that if you read their 2003/2008 status pages, but the reality is that some of the newer features are plagued with significant bugs. Top of my list, and the thing that keeps me from using gfortran at all, is deferred length allocatable character variables -- arguably one of the most useful features of 2003 (maybe that's a bit of an overstatement :-). Support for that feature is claimed starting in 4.9 I think, but in fact it has never worked at all except in the most trivial uses. That's still the case with the current 6.0 trunk as of a couple weeks ago. Bugs with finalization are another serious issue for me. I don't mean to dis the people working on gfortran; I applaud their efforts and would love to see a solid free Fortran 2008 compiler. I just think we need to be realistic and honest about what its current state really is.

rouson commented 8 years ago

@nncarlson, These are excellent points and a welcome perspective. As @tclune pointed out to me, one's view of compiler is often heavily influenced by the frequency with which one uses it and (inevitably if it's a Fortran compiler) the frequency with which one reports bugs. My experience of gfortran (including my experience of its deferred-length character support) is heavily influenced by the fact that I've contributed a lot of bug reports and have been fortunate to come up with some funding to cover some of my most important bug fixes and feature requests. Probably for that reason, much of what I want is supported. As an aside, the first time I funded gfortran development was when I was at Sandia National Laboratories and @tclune has authorized some NASA funding for targeted bug fixes on occasion. Where there is a will, there is a way. It's unfortunate, however, that most organizations will cover the costs of a commercial compiler license but don't have a mechanism to cover any of the maintenance and development cost of an open-source compiler. I think most gfortran developers prefer to work as volunteers, but I'm certain that some of them would also accept funding and would target the features and bug fixes that the funding supports.

With all that said, the good news is that gfortran has made very recent strides on deferred-length character support, including fixes for a bevy of bugs just three days ago. I encourage you to build the current gfortran development trunk (see my earlier post for a script that automates the checkout, build, and installation process) and report any bugs you find.

As much as I love deferred-length characters, I personally wouldn't describe them as a major feature of Fortran 2008, but this is just my own definition of "major" and I have no hard, objective, or quantitative definition of the word so I'd still say that the only two major Fortran 2008 features that gfortran is missing are PDTs and DTIO.

rouson commented 8 years ago

@muellermichel, I don't see a post by Mat Colgrove.

muellermichel commented 8 years ago

@rouson I mean on the PGI thread. Screenname mkcolg. He's the main supporter for that compiler.

nncarlson commented 8 years ago

@rouson, I'll have to check out the development trunk again with fingers crossed that this long-standing issue has finally been fixed. In my defense I will say that my difficulties with gfortran are not for lack of reporting bugs. I've reported many bugs over the years with substantial effort invested in providing tiny minimal reproducers. (This is something that I encourage all my developers to do with any compiler -- things won't improve if problems aren't reported.) But my experience with gfortran has been different than yours (though I've never included $). Reports either sit ignored, or are immediately marked as duplicates of other reports with no serious examination and the reproducers effectively ignored.

Why do I find deferred length characters so damn useful? (I didn't say "major") I have a character component of a derived type, or a character variable to pass to a subroutine to receive a message. How long should I declare them to be? 8, 16, 100, 1000? How long is long enough? Now I have data to stuff into such a variable. What do I do if it's not long enough? Silently truncate, raise an error? This has been a thorn in my side for as long as I've been coding in Fortran (a long time). The answer to "how long" is deferred-length -- they are exactly as long as they need to be. All kinds of issues with fixed length characters immediately vanish. Having experienced the joy of this feature, I'm not giving it up :-)
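To make the contrast concrete, here is a minimal sketch with hypothetical names and message text. It assumes a compiler whose deferred-length character support works for these simple cases, which, per the discussion above, has not always been a given.

```fortran
program deferred_length_demo
  implicit none
  character(len=8) :: fixed_message            ! length guessed up front
  character(len=:), allocatable :: message     ! length deferred until assignment

  fixed_message = "file not found: input.dat"  ! silently truncated to 8 characters
  call describe_failure(message)               ! comes back exactly as long as needed

  print *, "fixed:    [", fixed_message, "] len =", len(fixed_message)
  print *, "deferred: [", message, "] len =", len(message)

  message = message // " (retrying)"           ! reallocated to the new length on assignment
  print *, "deferred: [", message, "] len =", len(message)

contains

  ! The caller does not have to guess a buffer size for the message
  subroutine describe_failure(text)
    character(len=:), allocatable, intent(out) :: text
    text = "file not found: input.dat"
  end subroutine

end program
```

The same declaration works for a character component of a derived type, which covers the other case mentioned above.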

rouson commented 8 years ago

@muellermichel, I see. I have a PGI license but don't have a login for the support forum and might not have time to create one in the short term, plus all the compiler teams are used to hearing from me so adding my voice wouldn't matter much. I hope you'll mention to them that four other compilers support submodules, so waiting "several years" will put them extremely far behind the pack, and their slowness hurts the whole Fortran community for the reasons you state. I think Cray has been Fortran 2008 compliant since at least 2014 and even gfortran's (pre-release) submodule support is approaching six months old (the first commit was in early August). It can take months to add a significant new feature so if PGI hasn't even started on preliminary support for the feature, then the time it will take them to finally catch up is substantial.

I hope others will chime in, but it's a chicken-and-egg problem. Most people won't adopt a feature until there is widespread support, but the vendors don't see a market demand that justifies support until more people force their hand -- hence Mat's question to you regarding how important the feature is to you.

It's so frustrating and depressing...

muellermichel commented 8 years ago

Please note: AFAIK anyone can create a PGI Forum user, no license needed.

rouson commented 8 years ago

@nncarlson, I hear you and feel your pain. I'm really glad to hear you've reported bugs. I started the Ad Hoc project to catalog the bugs that matter most to me and to those with whom I collaborate. I'd definitely put deferred-length characters high on my wish list so a contribution of a bug reproducer in the form of a pull request for Ad Hoc would be welcome.

I've experienced both ends of the spectrum: some of my gfortran bug reports have languished for years and many others have been fixed in 24-48 hours. Averaging it out across all bug reports I've submitted on six different compilers, NAG and gfortran have had the fastest average turnaround time, but the average is probably skewed by the funding in a fraction of the cases for gfortran.

szaghi commented 8 years ago

@rouson @nncarlson I agree with both of you (is it possible? :smile: ). I have also experienced frustration with GNU gfortran's support for deferred-length characters, but currently I have success, e.g. in FLAP. I do not know the exact reason why FLAP started working with GNU gfortran, but @victorsndvg should know: he was the first to commit a workaround on a do loop that solved the FLAP issue with deferred-length strings under GNU gfortran (maybe I am wrong, but I guess it was something related to concatenation inside a loop).

FYI I am going to report my first possible bug in GNU gfortran. I already have a minimal working example; may I put it under your eyes before submitting (it is highly probable that it is not a bug, but my mistake)?

LadaF commented 8 years ago

I also agree with the usefulness of submodules, but there is still a huge sector of computers where it is tricky to install new versions of compilers on your own. Especially the middle class of supercomputers is tricky. The largest ones have very good support and not the latest, but recent enough compilers (no way gcc 6, but at least Cray and Intel 15 or 16). But those medium ones in smaller institutions often have less professional support and old compilers. Now you can install your own, but that is not the end of the story. You also need to compile the MPI library for the new compiler. For that you need the drivers and headers for the local Infiniband or other interconnect and the headers for the queue system integration. Because of the not so professional support, it is sometimes pretty difficult to get these. Sometimes you can get away with just somehow recompiling the Fortran MPI library modules and use the old version otherwise, but it is risky.

szaghi commented 8 years ago

@LadaF we are in the same boat!

cmacmackin commented 8 years ago

The other issue I see around submodules is that they dramatically increase the verbosity of the code. One of the things I always considered an advantage of Fortran over C or C++ is that you don't need to mess around with header files. Given that Fortran is already quite a verbose language, I see this as a major downside. Not to deny that there are situations in which submodules are appropriate and I do concede the usefulness of separating interface and implementation, but I would be hesitant to make their use a general rule, at least not for smaller projects where recompilation cascades are not a big issue.

On 13/01/16 12:36, Stefano Zaghi wrote:

@LadaF https://github.com/LadaF we are in the same boat!


Chris MacMackin cmacmackin.github.io http://cmacmackin.github.io

szaghi commented 8 years ago

@cmacmackin Good point. I am going to try to make a compromise summary of all the above opinions: I will place many of them into a consider also that... subsection, while the main naming convention guideline should be very neutral, merging all our maximum common denominators. I hope the resulting guideline will be neutral, but comprehensive (via subsection considerations) and still concise: I am an incurable optimist.

milancurcic commented 8 years ago

How about this convention: Never use variable name l for indexing an array or as a procedure argument? :)

szaghi commented 8 years ago

@milancurcic oh, yes wonderful! Feel free (all of you) to modify my incomplete and ugly draft!

cmacmackin commented 8 years ago

Just a thought which occurred to me: while in general I agree that procedures should contain action verbs, I can think of two exceptions where this might not be necessary:

1) Accessor methods. I know that often the get_ prefix is provided, but I'm not sure if this is really necessary, at least not in all cases. Consider a linked list, which has a method to return the number of elements in the list. I think in that case size would be a perfectly acceptable name and don't really see the point of calling it get_size. If we want to access a private variable, then I'd be tempted to give the private variable's name a trailing underscore or something rather than have to add get to the accessor. For example:

type :: example
  integer, private :: component_
contains
  procedure :: component => get_component
end type

2) Methods to return a logical, representing the state of a derived type. Just follow the same naming rules as for a logical variable. For example, if you have a method on an iterator to check whether there is another element, just call it has_next, not something like check_has_next.
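Both exceptions can be seen together in a small, self-contained sketch. The countdown type and all of its names are hypothetical; only has_next and the trailing-underscore idea come from the points above.

```fortran
module countdown_module
  implicit none
  private
  public :: countdown, new_countdown

  type :: countdown
    private
    integer :: remaining_ = 0                ! trailing underscore marks the private component
  contains
    procedure :: remaining => get_remaining  ! noun-style accessor, no get_ in the public name
    procedure :: has_next                    ! state query, named like a logical variable
    procedure :: advance                     ! performs an action, so it keeps a verb
  end type

contains

  function new_countdown(start) result(iterator)
    integer, intent(in) :: start
    type(countdown) :: iterator
    iterator%remaining_ = start
  end function

  pure function get_remaining(self) result(n)
    class(countdown), intent(in) :: self
    integer :: n
    n = self%remaining_
  end function

  pure function has_next(self) result(answer)
    class(countdown), intent(in) :: self
    logical :: answer
    answer = self%remaining_ > 0
  end function

  subroutine advance(self)
    class(countdown), intent(inout) :: self
    if (self%has_next()) self%remaining_ = self%remaining_ - 1
  end subroutine

end module

program main
  use countdown_module, only : countdown, new_countdown
  implicit none
  type(countdown) :: steps

  steps = new_countdown(3)
  do while (steps%has_next())
    print *, "remaining:", steps%remaining()
    call steps%advance()
  end do
end program
```

At the call site the loop condition reads almost like a sentence, which is the argument for dropping prefixes such as check_ or get_ from these two kinds of methods.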

LadaF commented 8 years ago

@milancurcic Why that one? The sequence is quite natural i,j,k,l and the risk of confusion is not larger than between i and j. I use it quite often.
