j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

“Physical” or “Engineering” Units of Measure #50

Open certik opened 4 years ago

certik commented 4 years ago

Extracted from #49. Author: Van Snyder

Introduction

The proposal for which authorization to proceed as an ISO Technical Specification was requested in June 2016 was N2113.

Incorrect use of physical units is a common error in scientific or engineering software. Other common errors are mismatching the types of actual and dummy arguments, and subscript bound violations. Explicit interfaces largely solve the latter problems, but do nothing directly for the former. (One can use derived types to provide a physical units system, at the expense of redefining intrinsic functions, operations, and assignment, for all combinations of units – a tremendous job for mechanics, saying nothing about thermodynamics, electronics, ... – and then you hope for inlining. If done using type parameters or integer components, it can distinguish length from time, but not kilograms from pounds.) A particularly expensive and embarrassing example was the loss of the Mars Climate Orbiter. The loss resulted because the NASA contract required small forces, e.g. from attitude-control maneuvers, to be reported in Newton-Seconds, but Lockheed nonetheless reported them in Pound-Seconds. (This was quite inscrutable, as Lockheed had had NASA contracts for over thirty years, and they always specified SI units.)

Proposal

Define a new UNIT or MEASURE attribute or type parameter (call it what you will) that can be specified for any numeric variable or named constant. Literal constants are unitless.

Define multiplication and division operations on units. Exponentiation by an integer constant could be defined to be equivalent to multiplication or division. Square root, or maybe even exponentiation of a unit by a rational number, would be useful. In the context of a unit definition, the integer literal constant 1 is considered to be the unitless unit.

Each unit declaration creates one or more generic units conversion functions having the same name as the unit, that takes an argument with any related unit, and converts it to have units specified by the name of the function. It also creates a function that casts unitless values to have the specified unit. There is an intrinsic UNITLESS conversion function.

Quantities can be added, subtracted, assigned, compared by relational operators, or argument associated only if they have equivalent units. Atomic units, i.e. units that are not defined in terms of other units, are equivalent by name. Other units are equivalent by structure.

When quantities are added or subtracted, the units of the result are the same as the units of the operands. When quantities are multiplied or divided, the units of the result are the units that result from applying the operation to the operands’ units. Multiplication or division by a unitless operand produces a result having the same units as the other operand.
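As an illustration only (this is not text from N2113), the equivalence and arithmetic rules above can be modelled in standard Fortran by carrying one integer exponent per atomic unit. The module name `unit_algebra_sketch`, the type `dim_t`, and the function `same_units` are hypothetical, and the sketch tracks exponents only, not unit names or scale factors.

```fortran
! Sketch only: dim_t and its procedures are hypothetical names,
! not part of the proposal or of any existing library.
module unit_algebra_sketch
  implicit none
  private
  public :: dim_t, same_units, operator(*), operator(/)

  ! Exponents of two atomic units; a fuller model would carry
  ! one exponent per atomic unit that the program declares.
  type :: dim_t
    integer :: inch = 0
    integer :: second = 0
  end type dim_t

  interface operator(*)
    module procedure dim_mul
  end interface

  interface operator(/)
    module procedure dim_div
  end interface

contains

  ! Multiplying quantities adds the exponents of their units.
  elemental function dim_mul(a, b) result(c)
    type(dim_t), intent(in) :: a, b
    type(dim_t) :: c
    c = dim_t(a%inch + b%inch, a%second + b%second)
  end function dim_mul

  ! Dividing quantities subtracts the exponents of their units.
  elemental function dim_div(a, b) result(c)
    type(dim_t), intent(in) :: a, b
    type(dim_t) :: c
    c = dim_t(a%inch - b%inch, a%second - b%second)
  end function dim_div

  ! Addition, subtraction, assignment, and comparison are allowed
  ! only when both operands have the same exponents.
  elemental function same_units(a, b) result(ok)
    type(dim_t), intent(in) :: a, b
    logical :: ok
    ok = (a%inch == b%inch) .and. (a%second == b%second)
  end function same_units

end module unit_algebra_sketch
```

With `inch = dim_t(1, 0)` and `sec = dim_t(0, 1)`, the expression `(inch / sec) / inch` has the same exponents as `dim_t(0, -1)`, which mirrors the statement `F = V / L` in the examples: the units of the right-hand side are 1/SECOND.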

Units participate in generic resolution.

Procedure arguments and function results can have abstract units. This allows enforcing a particular relationship between the units, without requiring particular units. For example, the SQRT intrinsic function result has abstract units A, and its argument has abstract units A*A. Abstract units do not participate in generic resolution.
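One way to picture the SQRT rule, again as a sketch rather than anything taken from the proposal: if units are represented by integer exponents of the atomic units, an argument has units of the form A*A exactly when every exponent is even, and the result units A are obtained by halving each exponent. The names `sqrt_units_sketch` and `sqrt_units` are made up for this illustration.

```fortran
! Sketch only: models the abstract-units rule for SQRT on a vector
! of integer exponents. All names here are hypothetical.
module sqrt_units_sketch
  implicit none
  private
  public :: sqrt_units

contains

  ! exps holds the exponent of each atomic unit in the argument's units.
  ! If every exponent is even, the argument has units A*A and the result
  ! units A carry half of each exponent. Otherwise ok is .false. and
  ! half is set to zero.
  subroutine sqrt_units(exps, half, ok)
    integer, intent(in)  :: exps(:)
    integer, intent(out) :: half(size(exps))
    logical, intent(out) :: ok
    ok = all(modulo(exps, 2) == 0)
    if (ok) then
      half = exps / 2
    else
      half = 0
    end if
  end subroutine sqrt_units

end module sqrt_units_sketch
```

For exponents ordered as (INCH, SECOND), SQINCH is `[2, 0]` and its square root is `[1, 0]`, i.e. INCH, whereas IPS is `[1, -1]` and is rejected. A compiler implementing the proposal would make this determination at compile time; the subroutine only shows the arithmetic involved.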

Define an intrinsic RADIAN unit, and a parallel set of generic intrinsic trigonometric functions that take RADIAN arguments and produce unitless results. All of the remaining intrinsic procedures have arguments with abstract units and results that are unitless (e.g. SELECTED_INT_KIND) or have the same units as their argument (e.g. TINY). Because function results do not participate in generic resolution, it is not possible to have a parallel set of generic intrinsic inverse trigonometric functions that return RADIAN results. It may be useful to provide an intrinsic module that has some public units and procedures, e.g. units TICK and SECOND and a SYSTEM_CLOCK module procedure that has arguments with units TICK, TICK/SECOND and SECOND.

Variables are declared to have units by specifying UNIT(unit-name) as an attribute in their declarations, or alternatively by separate UNIT declaration statements.

Examples:

UNIT :: INCH, SECOND
UNIT :: CM, INCH = 2.54 * CM
UNIT :: CM_PER_INCH = CM / INCH
REAL, PARAMETER, UNIT(CM_PER_INCH) :: CONVERT = 2.54
UNIT :: SQINCH = INCH * INCH ! or INCH ** 2
UNIT :: IPS = INCH / SECOND, FREQUENCY = 1 / SECOND ! or SECOND ** (-1)
REAL, UNIT(SQINCH) :: A
REAL, UNIT(FREQUENCY) :: F
REAL, UNIT(INCH) :: L, L2
REAL, UNIT(CM) :: C
REAL, UNIT(SECOND) :: T
REAL, UNIT(IPS) :: V

V = A + L              ! INVALID -- SQINCH cannot be added to INCH,
                       ! and neither one can be assigned to IPS
V = IPS(A + SQINCH(L)) ! VALID -- I’m screwing this up intentionally
V = (A / L + L2) / T   ! VALID -- IPS is compatible with INCH / SECOND
A = L * L2             ! VALID -- SQINCH is compatible with INCH * INCH
F = V / L              ! VALID -- units of RHS are 1/SECOND
C = CONVERT * L        ! VALID -- CM / INCH * INCH = CM
C = CM(L)              ! VALID -- Clearer than the previous statement
L = SQRT(A) * 5.0e-3   ! VALID -- exercise for reader

Full Proposal

A full proposal, cast as a TS, by Van Snyder: Units-TR-19.pdf

jacobwilliams commented 4 years ago

I haven't studied the details of this proposal, but I know that built-in units would be an amazing feature for engineering applications.

As an aside, I think the only software I've ever used that had this was Mathcad, which in school was incredible for doing engineering homework problems. I do think the committee should certainly be thinking about how to get students to want to use Fortran, rather than only focusing on HPC people. This could certainly be one feature that could do that.

sblionel commented 4 years ago

Van's proposal has been repeatedly and thoroughly considered by WG5 and found wanting. It didn't help that "units" would not have avoided the problem that Van kept offering as an example of why it would be useful.

I would have to say that this proposal has been poisoned, and is unlikely to gain any traction. At the very least, WG5 (and not just J3) wanted to see a trial implementation using derived types and defined operators to see how it would work in real life. That never happened despite multiple requests over years.

jacobwilliams commented 4 years ago

See also from @szaghi: https://github.com/szaghi/FURY

certik commented 4 years ago

@sblionel this proposal is a prime example why we need to discuss things publicly (for example here). The committee has rejected this repeatedly before my time on the committee, and I can't find any technical discussion why it was rejected. That is very inefficient. Let's use this opportunity to document the arguments.

It didn't help that "units" would not have avoided the problem that Van kept offering as an example of why it would be useful.

It looks like based on the article you sent that a lot more went wrong, but it did start with the wrong units. Given that this example is given in the motivation of this proposal, I would like to know the answer to these questions:

Was the navigation system written in Fortran?

If Fortran had units support and units were used in the navigation system, would it actually catch this particular error?

(I can easily imagine that it wouldn't, if the two modules communicated in some way that the units, as proposed in this proposal, would not actually catch it.)

At the very least, WG5 (and not just J3) wanted to see a trial implementation using derived types and defined operators to see how it would work in real life.

It looks like that's been done: https://github.com/szaghi/FURY, as @jacobwilliams mentioned.

However, I would prefer this to be part of the language, determined at compile time (with no runtime overhead), not implemented using derived types, checked at runtime.

certik commented 4 years ago

I would have to say that this proposal has been poisoned, and is unlikely to gain any traction.

I believe very strongly that no Fortran feature should be so "poisoned" (by a single person?) that we cannot even discuss it. I strongly suggest we discuss features on their technical merits and document arguments for and against. Being "poisoned" is not a technical argument.

However, it is useful to know that the committee repeatedly discussed this and that they might be "allergic" to seeing this again. And so if we are going to bring this up again, we better have solid arguments and address every single objection that the committee had. (It does not help that the committee did not write down the objections, but let's fix that from now on by documenting all objections to a given proposal.)

everythingfunctional commented 4 years ago

I actually have an implementation of this here: https://gitlab.com/everythingfunctional/quaff

acferrad commented 4 years ago

Looks good. A few things I noticed in the conversion_factors_m file:

  1. The exact SI - Imperial mass conversion factor is 0.45359237 kg/lb, which should be the base value, not g/oz. The latter is exactly 28.349523125.
  2. The energy conversion between joules and calories is inconsistent with the one between BTU and joules. There are two main definitions of the BTU and the calorie: IT (International Table, the standard) and Th (thermochemical). The IT values are 4.1868 J/cal(IT) and 1055.056 J/BTU(IT). The code uses Th for one and IT for the other. (ref: https://www.nist.gov/pml/special-publication-811/nist-guide-si-appendix-b-conversion-factors/nist-guide-si-appendix-b8 )
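To make the first point concrete, here is a small sketch (not code from the library under review) that keeps kg/lb as the single base value and derives g/oz from it, so the two factors cannot drift apart. The module and constant names are hypothetical.

```fortran
! Sketch only: names are hypothetical, not those used in quaff.
module mass_factors_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: dp, kg_per_lb, oz_per_lb, g_per_kg, g_per_oz

  ! Base value: the avoirdupois pound is defined as exactly 0.45359237 kg.
  real(dp), parameter :: kg_per_lb = 0.45359237_dp
  real(dp), parameter :: oz_per_lb = 16.0_dp
  real(dp), parameter :: g_per_kg  = 1000.0_dp

  ! Derived from the base value instead of being typed in separately:
  ! 0.45359237 * 1000 / 16 = 28.349523125
  real(dp), parameter :: g_per_oz = kg_per_lb * g_per_kg / oz_per_lb

end module mass_factors_sketch
```

The same pattern would presumably apply to the energy factors: pick one definition (IT or Th) as the base and derive the others from it.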

everythingfunctional commented 4 years ago

Thanks @acferrad . I had trouble choosing what to use as the base for mass, because it seemed when I looked that both conversion factors were published by NIST and were inconsistent. Similar thing with calories and BTU, multiple, inconsistent conversion factors.

klausler commented 4 years ago

I think that the important aspect of this feature is the language support for having a concept of units as an attribute of types (just REAL?), having the compiler combine units correctly on multiplication and division and (some) powers, and check them on addition, subtraction, assignment, argument association, &c. The actual names of the units and their conversion factors could come from intrinsic or library modules, and in the latter case, the Fortran language would not be responsible for determining their values.

nncarlson commented 4 years ago

The actual names of the units and their conversion factors could come from intrinsic or library modules, and in the latter case, the Fortran language would not be responsible for determining their values.

Definitely should be the latter. The standard has no business mandating a set of names or conversion factors. Simply providing an abstract unit capability should be the goal.

acferrad commented 4 years ago

Definitely should be the latter. The standard has no business mandating a set of names or conversion factors. Simply providing an abstract unit capability should be the goal.

Totally agree. Therefore, in this implementation of what would become a Fortran standard, the names and constants should be removed from the project, and stored in an external file or project. The Fortran standard would then need to have some way of accessing it.

everythingfunctional commented 4 years ago

I don't think this is something that should be part of the language. I agree that it is something that should be done and used in new code wherever possible, just not as part of the language. I've seen several libraries trying to deal with units, and they all take different approaches to dealing with what are not entirely orthogonal or entirely compatible problems. Those being:

I've taken what I think is a pretty good shot at creating a library, as I mentioned above, but I think even that reveals some bugs, inconsistencies or missing features that would likely show up in trying to put something like this in to the standard. Not the least of which being that conversion factors between units aren't always simple. Think Celsius to Fahrenheit.
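The Celsius/Fahrenheit case is awkward because the conversion is affine rather than a pure scale factor: an absolute temperature needs both the slope and the offset, while a temperature difference needs the slope only. A minimal sketch of that distinction, with hypothetical names throughout:

```fortran
! Sketch only: illustrates why a single multiplicative conversion
! factor is not enough for temperature. Names are hypothetical.
module temperature_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: dp, celsius_to_fahrenheit, celsius_diff_to_fahrenheit_diff

  real(dp), parameter :: slope = 9.0_dp / 5.0_dp
  real(dp), parameter :: shift = 32.0_dp

contains

  ! Absolute temperature: F = (9/5) * C + 32.
  elemental function celsius_to_fahrenheit(c) result(f)
    real(dp), intent(in) :: c
    real(dp) :: f
    f = slope * c + shift
  end function celsius_to_fahrenheit

  ! Temperature difference: the offset cancels, only the slope remains.
  elemental function celsius_diff_to_fahrenheit_diff(dc) result(df)
    real(dp), intent(in) :: dc
    real(dp) :: df
    df = slope * dc
  end function celsius_diff_to_fahrenheit_diff

end module temperature_sketch
```

So 100 degrees Celsius converts to 212 degrees Fahrenheit, but a rise of 10 Celsius degrees is a rise of 18 Fahrenheit degrees, not 50. A units system has to know which of the two kinds of quantity it is converting.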

I don't think I've seen a sufficiently self consistent, bug free, feature complete way of dealing with units in any language. And so I would not recommend trying to make it part of the Fortran standard.

klausler commented 4 years ago

I don't think I've seen a sufficiently self consistent, bug free, feature complete way of dealing with units in any language. And so I would not recommend trying to make it part of the Fortran standard.

The implementation of units on the HP-48 programmable calculator is worth studying.

certik commented 4 years ago

@everythingfunctional I agree that something like this requires a complete implementation first, and real usage before even thinking of putting this into the standard. I looked at your implementation and it seems that creating values with units is as easy as 1.0d0.unit.METERS. But how do you actually use it, say an array of quantities? Can you add some documentation and post here? Does it require to define the array of derived types? Or just reals. I read through your tests, but it is still not clear. Looking at one such test: https://gitlab.com/everythingfunctional/quaff/blob/b515ae79fa0d68d44d25aefdc103234a760a5d54/tests/speed_test.f90#L31, it seems it requires to define density and density_?

The advantage of figuring out how to put this into the language itself is that one might define just regular arrays of real numbers (i.e., no derived types) and assign units to it with some syntax, and the compiler will check the units at compile time with no runtime overhead, and the natural already familiar syntax.

I agree this would require a prior compiler implementation and usage in real codes to see if this is worth putting into the language.

everythingfunctional commented 4 years ago

@certik , each quantity is its own type. So the expression 1.0d0.unit.METERS is of type Length_t. And thus, if you wanted an array of densities, you would define one like type(Density_t) :: densities(5). It's possible to add and subtract quantities of the same type, multiply and divide a quantity by a scalar, and dividing quantities of the same type returns a double precision. Quantities of the same type can also be compared with the normal operators. Additionally, each of the functions to divide or multiply different quantities is implemented specifically to return the correct resulting quantity. I.e. (1.0d0.unit.METERS) / (1.0d0.unit.SECONDS) == (1.0d0.unit.METERS_PER_SECOND) is valid and true. Additionally, all of the procedures are elemental, so you can do matrix math as well.
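For readers who have not seen this style, the general shape of a "one type per quantity" library can be sketched as follows. This is a stripped-down illustration of the technique, not quaff's actual API: the type names, component names, and module name are all invented here, and only three of the many needed operators are shown.

```fortran
! Sketch only: a minimal "one derived type per quantity" design.
! None of these names are taken from quaff.
module quantity_types_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: dp, length_t, time_t, speed_t, operator(+), operator(/)

  ! Values are stored internally in SI units.
  type :: length_t
    real(dp) :: meters = 0.0_dp
  end type length_t

  type :: time_t
    real(dp) :: seconds = 0.0_dp
  end type time_t

  type :: speed_t
    real(dp) :: meters_per_second = 0.0_dp
  end type speed_t

  interface operator(+)
    module procedure add_lengths
  end interface

  interface operator(/)
    module procedure length_over_time, length_over_length
  end interface

contains

  ! Same-type addition returns the same quantity type.
  elemental function add_lengths(a, b) result(c)
    type(length_t), intent(in) :: a, b
    type(length_t) :: c
    c%meters = a%meters + b%meters
  end function add_lengths

  ! Each cross-quantity operation needs its own specific procedure.
  elemental function length_over_time(l, t) result(v)
    type(length_t), intent(in) :: l
    type(time_t), intent(in) :: t
    type(speed_t) :: v
    v%meters_per_second = l%meters / t%seconds
  end function length_over_time

  ! Dividing like quantities yields a plain, unitless real.
  elemental function length_over_length(a, b) result(ratio)
    type(length_t), intent(in) :: a, b
    real(dp) :: ratio
    ratio = a%meters / b%meters
  end function length_over_length

end module quantity_types_sketch
```

Because the procedures are elemental, an array declared as `type(length_t) :: track(3)` can be divided by a scalar `time_t` to give an array of `speed_t`. An expression such as a length plus a time has no matching specific procedure, so the compiler rejects it. The cost is that every meaningful combination of quantities has to be written out by hand.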

There is clearly some run time cost associated with this approach (about 3x cost to multiply 2 numbers as shown by my tests). But I find that to be acceptable in most instances given the added safety and convenience it brings. And if you're doing really performance sensitive stuff, the internal values are stored in SI units and are public so you can mitigate the slow down even further if need be.

With full compile time checking I'm not sure you could make this valid and extensible with new units.

lengths(1:3) = [1.0d0.units.METERS, 1.0d0.units.FEET, 1.0d0.units.MICROMETERS]

But it works perfectly with my library. And it's possible to add new units and even quantities without actually needing to make any changes to the original library. If you want to use furlongs per fortnight for speed in your code, you just need to define a conversion factor from meters and a symbol somewhere in your code and it will work just fine.

magicmouse commented 3 years ago

Mr. Snyder's specification has proved very useful. As of 2021 I am aware of various languages that fully support physical units of measure. With compile-time checking only, there are F# and Cambridge FORTRAN. For a full implementation, including user-defined units of measure and carrying the exponents of the fundamental quantities at runtime, you have the Frink language (www.frinklang.org) and Beads (www.beadslang.org). Frink is the supreme king of units. Beads is a general-purpose graphical interactive language that emits web apps, etc., and includes a layout model and event management in the language.

urbanjost commented 3 years ago

Pre-F90, working with a large number of projects scattered around the world, we found that almost all the unit issues were bound up with input and output of values (including graphics). We basically made the Unix units(1) command into a subroutine and created a standard, portable, self-describing file format that required a unit code, along with a routine called relate(3f) that would do y=mx+b conversions and supplied labels for each defined unit, which were used as defaults for labeling, particularly on plots. So a user could specify the units he wanted a plot in and everything would be automatically converted, for example. There was a built-in relate default table of a few thousand entries, and a "blessed" file with several thousand more relationships. We found in our case that most unit-related issues were not in the codes doing computation so much as in the routines delivering the data between programs from different departments and labs. We often wished there was something built into the language (almost all Fortran, some Ada, PL/1, Pascal, ...). What we found in our particular case to be more of an issue at the coding level was uncertainty and accumulated floating-point error; I remember the consensus was that this would be solved by hardware support at the processor level -- whatever happened to that? I think it would be great if Fortran supported units, whether by types or casting or ..., but that does not necessarily solve the primary issues we saw, which involved input and output. Would these solutions provide some kind of labeling of data? I am glad XML was not around in the scientific world at the time or we might have tried that.

vansnyder commented 3 years ago

I disagree with Steve (@sblionel) that the units proposal has been "repeatedly and thoroughly considered." I gave a brief presentation at the 2005 Delft meeting. Malcolm and Lawrie made some comments that resulted in changes. The consensus was that this was a reasonable idea but couldn't be fit into the development schedule. I tried to bring it up for discussion at later meetings but there was no discussion. I tried to elicit discussion in e-mail but that also did not happen. I proposed to develop a non-binding TS, but WG5 voted against submitting an essentially-completed document for publication, and refused to discuss reasons for that rejection. No one has provided concrete technical objections to the proposal. No one has pointed to a section or paragraph and remarked "this cannot work" or "this cannot be done." The objections have been the difficulty to fit it into the development schedule, or the effort required by vendors to implement it. I believe the effort, on either score, would be significantly less than coarrays or interop.

At meeting 167, where proposals for 2008 were initially discussed, it was proposal 04-122. In 04-265r1.xls, the "hate dislike like love" score was 0-3-7-1. There was no technical discussion in plenary or subgroup. Nobody who cast a "dislike" vote offered a reason.

Another objection was "nobody has asked us for it." Of course, this was in response to me asking for it. I wasn't just asking for myself. I was representing more than 600 Fortran users at JPL, many of whom had admitted to having had mistakes in their codes that the proposed units system would have caught. The average cost estimate was two to three work weeks per year, but only one catastrophically expensive loss.

To answer some of Ondrej's questions:

The navigation software was in Fortran -- roughly six million lines spanning about 65 programs, being maintained at a cost of 6.5 work-years per year. People who didn't know anything about Fortran insisted in about 1996 that the entire suite of navigation and trajectory planning software had to be re-written in C++ and Python, which was done at great expense -- more than 240 work years. As late as the Mars Phoenix landing in 2008, it still didn't work.

For Mars Climate Orbiter, small-force data from Lockheed was ingested from text files. Contrary to Steve's assertion that the proposal would not have caught the problem that resulted in that catastrophic $300 million loss, that is precisely the reason that the proposal includes units in formatted I/O. It's possible that if units had been supported in the language, and used in the software, Lockheed software engineers could have "lied" in their output, claiming the units were Newton seconds when they were in fact pound seconds. But they would have had to make a conscious (or boneheadedly stupid) effort to do so. It's true that a compiler could not have caught the inconsistency because it wouldn't see both codes. But within a single code, the proposal is designed so that a processor can verify units consistency, including across procedure interfaces and between modules, and convert between related measurements based upon the same fundamental units.
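To illustrate the general idea of unit-labelled text input (this is not the formatted I/O syntax of the TS, which is not reproduced in this thread), here is a sketch of a reader that refuses a value unless the record carries a recognised unit label, and converts pound-force seconds to newton-seconds when it sees them. The module name, the subroutine `read_impulse`, the status codes, and the label spellings `N.s` and `lbf.s` are all assumptions made for the example.

```fortran
! Sketch only: a run-time check of a unit label on text input.
! The record layout, labels, and names are hypothetical.
module labelled_input_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: dp, read_impulse

  ! 1 lbf is defined as 0.45359237 kg * 9.80665 m/s**2.
  real(dp), parameter :: newtons_per_lbf = 0.45359237_dp * 9.80665_dp

contains

  ! Parses a record such as "12.5 N.s" and returns newton-seconds.
  ! stat = 0: success; 1: record could not be parsed (e.g. no label);
  ! stat = 2: the unit label is not one this reader knows.
  subroutine read_impulse(record, newton_seconds, stat)
    character(len=*), intent(in) :: record
    real(dp), intent(out) :: newton_seconds
    integer, intent(out) :: stat
    character(len=16) :: label
    real(dp) :: raw
    integer :: ios

    newton_seconds = 0.0_dp
    read(record, *, iostat=ios) raw, label
    if (ios /= 0) then
      stat = 1
      return
    end if

    select case (trim(label))
    case ('N.s')
      newton_seconds = raw
      stat = 0
    case ('lbf.s')
      newton_seconds = raw * newtons_per_lbf
      stat = 0
    case default
      stat = 2
    end select
  end subroutine read_impulse

end module labelled_input_sketch
```

With a reader like this, a bare number with no label is an error rather than a silently accepted value, and a producer would have to write the wrong label explicitly to get wrong data through.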

A scheme using derived types, such as Grant Petty's scheme, has high runtime overhead, and requires significantly more labor than the method I proposed.

Fortran's niche for decades has been engineering and scientific applications, with a significant financial community thrown in. All these domains would benefit from units support. I don't see cell-phone apps or video games or OS kernels or MySQL or web pages getting any benefit from it. So the argument "other languages don't offer it" doesn't carry much weight.

certik commented 3 years ago

@vansnyder thanks for this excellent comment.

The big problem as I saw was that the committee and community was not archiving any kind of feedback and discussion of the proposals. As such, for somebody like me joining the committee in 2019, I can't see any discussion about your units proposal.

But, I think we have greatly improved the process since then. Your "units" proposal is getting all kinds of discussion now, in this very issue and also in this Discourse thread:

and at least @klausler and I are both interested in implementing this in a compiler, so that people can test this out. Yes, it should be in the language and checked by the compiler at compile time. (The IO would have to be checked at runtime, but I think it could be a quick check, depending on how it is read in, and presumably the IO is usually not the performance bottleneck -- we can brainstorm this later.)

So far I like the idea a lot and I think it should be pursued by prototyping in a compiler and the community should play with that. And actually discuss the pros and cons of including this.

I reserve the right to change my mind as more details are developed. As we should all --- after prototyping and user experience, if we find this is not a good idea after all, that the pros do not outweigh the cost of this feature, then let's not do that.

Here are the costs that every feature to Fortran (including this one) should be considered against:

magicmouse commented 3 years ago

One of the problems with established languages is that it is almost impossible to get agreement on sensible extensions to the language, and vast amounts of time pass by while it is laboriously discussed. We now have 2 languages (Frink and Beads) with physical units of measure both at compile time and run-time. The advantage of runtime units is that you can store a quantity on the hard disk, read it back in, and still know the exponents. One can also verify both at compile time and run time that you are mixing proper units and not combining exponents inappropriately.

Interestingly, when I announced the feature of units of measure on Reddit in the scientific subgroup, it was met with almost sadistic levels of scorn, by scientists who snarled that they don't make units mistakes. The engineers were similarly displeased. I concluded from this that there is a fair amount of professional snobbery and arrogance about dumb mistakes that they don't want to admit making. We all know about the errors people make in spreadsheets costing millions. As far as I can tell human beings get up in the morning and make a lot of mistakes.

My point is that there are human emotional reasons for blocking this very sensible proposal to add Units of Measure.

Even in mobile apps, in which I have quite a bit of experience (over 100 apps), there is fairly regular use of time, which has units of months, days, weeks, hours, etc., and in commerce one sees "dozen" and "gross" a fair amount. And angular measurements are also super common (degrees / radians). Avoiding units errors is just one of many protections against human error, and people should be more open to reducing the total volume of the envelope of possible errors.

And by the way, Mr. Snyder, you can run Beads under Linux by using Wine, the freeware windows EXE runner.

sblionel commented 3 years ago

My observations on the history of Van's proposal... J3 and WG5 repeatedly discussed the UNITS proposal over the years. The most recent was the 2013 Delft meeting where we spent an hour or more reviewing the proposal. My notes say: "Straw vote on whether to ask SC22 to authorize work item for units TS. 3/7/1 abstain US abstain, UK No, JP No, NL abstain, Canada No, Germany abstain  Not going to do it."

A big problem with a TS is the presumption that it will be part of the next standard revision. I and others did not like the specific proposal, which to me was very "F77-like". I suggested to Van that he should create a trial implementation using derived types and then see how it works when used in an application. He didn't want to do that. Another suggestion was to fork gfortran and add the feature, but that has a steep learning barrier even for an experienced compiler developer, which Van is not.

Personally, I don't have an objection to the concept, but as a former compiler developer I can see how this would be a lot of work for something I expect few programmers would use. You can do almost everything desired with derived types and defined operators, and I would really like to see someone model this as such before asking WG5 to add it to the language. If one of the OSS compilers wants to try it, great!

fluidnumerics-joe commented 3 years ago

I'm curious to understand what the objection is to implementing units handling and file IO in a library rather than implementing at the compiler level. Why doesn't a cross-organization group form to make something viable that doesn't require a change in the Fortran standard? IMO, this seems to be the path of least resistance, unless there are other barriers to getting something like this done and well supported (e.g. funding).

From this conversation, the "units as a standard" request is coming from and supported by NASA and DoD contractors but doesn't appear to have support beyond these kinds of groups. While I recognize tracking units correctly in code and in file IO is important and currently error prone, perhaps it's time we rethink our software development strategy, rather than modifying a standard that impacts a broader community.

certik commented 3 years ago

and I would really like to see someone model this as such before asking WG5 to add it to the language. If one of the OSS compilers wants to try it, great!

Yes, we are in agreement on that. I think this should be prototyped first before bringing it to WG5 or J3. (Note that I would go beyond that and I think every significant feature should be prototyped first.)

vansnyder commented 3 years ago

On Wed, 2021-07-07 at 07:37 -0700, Steve Lionel wrote:

My observations on the history of Van's proposal... J3 and WG5 repeatedly discussed the UNITS proposal over the years. The most recent was the 2013 Delft meeting where we spent an hour or more reviewing the proposal.

The most recent discussion was at Boulder in 2016, not at Delft in 2013. From Delft 2013 minutes (N1977):

Van Snyder gave a presentation "Units of Measure in Fortran" which proposed a new TS which would add the ability to check physical dimensions and measurements in Fortran. The slides are in N1970 and a draft TS is in N1969. It was emphasized that these were the result of many years of work and discussion with engineers at JPL. A vote on the subject was postponed to allow further informal discussion.

There were votes on whether WG5 should apply for a new project to develop a TS on units of measure, following the presentation on Wednesday. The individual vote was: yes 3 - no 7 - abstain 1. The country vote was: yes 0 - no 3 (CA, JP, GB) - abstain 3 (DE, NL, US).

From the 2015 London meeting minutes (N2068):

In 2013 Van Snyder had introduced a draft TS on Units of Measure in Fortran (ref N1969, N1970, N1977). It would be useful for his sponsors to know why WG5 had not adopted this project. The convenor agreed to provide a document with the reasons.

and resolutions (N2067):

L10. Units of Measure in Fortran

WG5 directs its convenor to provide a document describing the reasons why WG5 did not apply for a new project to develop a TS on Units of Measure in Fortran when requested to do so at its 2013 meeting.

From the 2016 Boulder minutes (N2109):

The action in resolution L10, that the convenor should provide a document describing the reasons why WG5 did not apply for a new project to develop a TS on Units of Measure in Fortran, was still to be completed.

That report has still not been produced.

We most emphatically did NOT spend an HOUR reviewing and discussing the proposal at the 2013 Delft meeting, the 2015 London meeting, or the 2016 Boulder meeting. I gave a brief presentation (about 15 minutes) at Delft (N1970), but discussion was more like FIVE MINUTES. There was no discussion of the technical merits or members' technical objections, at either meeting, or any other meeting, at least not any in which I was included. That there was no such discussion at or after Delft 2013 was the reason for L10.

I was present in every plenary session in every WG5 meeting but one from February 1997 until last year. I had proposed it informally before Delft. Was it discussed at Markham in 2012, the only WG5 meeting I did not attend in 23 years? If so, why was I not informed of the results of that discussion? There is no mention of it in the minutes of the Markham meeting (N1926).

There has been very little off-line discussion. I received one minor remark in e-mail from Dick Hendrickson and one from Malcolm Cohen. These were proposals for improvement, not technical objections.

My notes say:

"Straw vote on whether to ask SC22 to authorize work item for units TS. 3/7/1 abstain

US abstain, UK No, JP No, NL abstain, Canada No, Germany abstain Not going to do it." A big problem with a TS is the presumption that it will be part of the next standard revision.

That is precisely the reason that I revised the paragraph in N1969 that included the same promise as in the allocatable, IEEE, and submodule TR's:

It is the intention of ISO/IEC JTC1/SC22/WG5 that the semantics and syntax specified by this technical report be included in the next revision of the Fortran standard without change unless experience in the implementation and use of this feature identifies errors that need to be corrected, or changes are needed to achieve proper integration, in which case every reasonable effort will be made to minimize the impact of such changes on existing implementations.

Off-line discussion had suggested that such a promise was required by ISO rules. But a TS, for additions to C, included a different discussion of conformance, and in fact that promise is not required by ISO rules. From C PDTR 18037 (WG14 n3574):

As this is a technical report, there are no conformance requirements and implementers are free to select those specifications they need. However, if functionality is implemented from one of these sections, implementers are strongly encouraged to implement that section in full, not just a part of it.

If, at a later stage, a decision is taken to incorporate some or all of the text of this Technical Report into the C standard, then at that moment conformance issues with respect to (parts of) this text need to be addressed.

In the final proposal to publish N2113 as a TS, during the 2016 WG5 meeting at Boulder, I pointed out that the "promise" in N1969 had been replaced:

This technical specification is non-normative. Some of the functionality described by this Technical Specification may be considered for standardization in a future revision of ISO/IEC 1539, but it is not currently part of any Fortran standard. Some of the functionality in this Technical Specification may never be standardized, and other functionality may be standardized in a substantially changed form.

Addressing this procedural objection using this revision was met with silence. No technical objection has ever been raised, at least not in my presence. No one has commented, in any way, on N2113.

I and others did not like the specific proposal, which to me was very "F77-like".

There was not, and still has not been, any discussion of technical objections to the proposal (see L10 and the Boulder 2016 minutes). I was present in every plenary session at every WG5 meeting other than 2012 at Markham, and I did not hear anyone say "F77-like." I have no idea what that means. Fixed form? Insignificant blanks? No dynamic memory? No derived types? No modules? No array operations?

I suggested to Van that he should create a trial implementation using derived types and then see how it works when used in an application. He didn't want to do that.

And I pointed out that Grant Petty already did it -- several times during the preceding decade. It has substantial runtime cost and substantial labor cost, and doesn't address the problem as fully as the TS proposal. In particular, it does not distinguish between different scales of the same unit, which was precisely the problem that doomed the Mars Climate Orbiter (different measures of momentum). Addressing scale would increase both run-time and labor costs. Units' exponents are not checked at compile time. There is no provision for abstract units. There is no provision for base units other than the ISO base units. If exponents of base units are represented using kind type parameters, instead of components, the exponents (but not the scales) of derived units are checked at compile time. One must replicate all arithmetic operations, all related intrinsic functions, and many of the functions of the user's program, not just for one type, but for all expected combinations of kind type parameter values. There is an exponential explosion in the bulk of code, the cost to develop it, and the cost to maintain it (this would be somewhat addressed by parameterized modules). If functions become type-bound, enormous labor cost is required to revise their syntax of reference (that is a different problem for which I offered solutions -- which were rejected -- in 1986 and 1999). The purpose of a high-level language is to reduce labor cost and increase reliability, while retaining efficiency. Proposals that increase reliability with minor increases in labor cost, and that entail no run-time penalty, are acceptable. Proposals to increase reliability by increasing labor cost substantially and reducing efficiency substantially are not to be taken seriously, especially if they do not and cannot address the problems well.
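The two limitations named above (exponents checked only when the code runs, and no distinction between scales of the same unit) can be illustrated with a minimal sketch. It is written in Python rather than Fortran purely to keep it short and runnable. The names `Quantity`, `MOMENTUM`, and `mismatch_detected` are hypothetical and are not taken from Grant Petty's module or from N2113.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value tagged with exponents of (mass, length, time).

    Hypothetical three-unit basis, chosen only for illustration.
    """
    value: float
    dims: tuple  # e.g. (1, 1, -1) for momentum

    def __add__(self, other):
        # The exponent check runs on every addition, at run time.
        if self.dims != other.dims:
            raise TypeError(f"unit mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)


MOMENTUM = (1, 1, -1)

impulse_si = Quantity(10.0, MOMENTUM)        # intended as newton-seconds
impulse_imperial = Quantity(10.0, MOMENTUM)  # intended as pound-seconds

# Same exponents, so the sum is accepted even though the scales differ.
total = impulse_si + impulse_imperial


def mismatch_detected():
    """Return True if adding a length to a time raises at run time."""
    try:
        Quantity(1.0, (0, 1, 0)) + Quantity(1.0, (0, 0, 1))
    except TypeError:
        return True
    return False
```

In this sketch the length-plus-time error surfaces only when the offending statement executes, and the newton-second and pound-second values are added without complaint because only exponents are tracked.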

Another suggestion was to fork gfortran and add the feature, but that has a steep learning barrier even for an experienced compiler developer, which Van is not.

I don't pretend to be a Fortran developer. I participated in development of three Fortran-related preprocessors. But I have written processors for several smaller interpreted languages. Units checking, including automatic scale conversion, was included in several of those, although not as comprehensively as proposed in N2113.

Personally, I don't have an objection to the concept, but as a former compiler developer I can see how this would be a lot of work for something I expect few programmers would use.

I do not expect units to be used in a compiler, MySQL, an OS kernel, or a cell-phone app. Maybe in a video game. When I worked in the Applied Mathematics Group at JPL, we provided libraries of mathematical software, and consultancy on mathematical and software methods to address scientific and engineering problems. My experience during that quarter century, and the next quarter century during which I developed scientific and engineering software, was that I and about 600 of my colleagues frequently made mistakes in the use of units. It was that experience, and the urging of our colleagues, that led our group to propose to Colonel Whittaker in 1976 that the requirements for the language that became Ada should include units facilities. Colonel Whittaker couldn't understand why one would want software libraries, let alone checking something as important as correct usage of units, in a language explicitly designed for reliability. "I used a software library once, and it got the wrong answer."

You can do almost everything desired with derived types and defined operators, and I would really like to see someone model this as such before asking WG5 to add it to the language.

As I remarked above, and several times during the preceding decade, Grant Petty already did this. If someone has an idea for something other than Grant Petty's project, I would like to see someone explain it using a bit more than this vague handwaving. Grant Petty's module has significant runtime cost and significant labor cost, and doesn't address the problem well. I have not seen a discussion of or proposal for a method using "derived types and defined operators" that circumvents these problems. And N2113 didn't promise to add it to the language. I believe the fear was that if N2113 were published as a non-normative non-binding TS, more people who use Fortran than just those who participate in standards committees (as opposed to developers) might see the value of it, and ask WG5 to incorporate it into a future standard.

If one of the OSS compilers wants to try it, great!

Publishing a TS is the traditional way to specify how to "try it."

vansnyder commented 3 years ago

On Wed, 2021-07-07 at 08:19 -0700, Joseph Schoonover wrote:

I'm curious to understand what the objection is to implementing units handling and file IO in a library rather than implementing at the compiler level.

  1. Errors that could have been detected at compile time are raised at run time. Errors raised at run time can have catastrophic impact, and occasionally cannot be corrected. For example, the developer has died or retired, the company, division, or group has closed, or the source code has been lost.
  2. Labor cost is substantially greater.
  3. Run-time penalties are significant.

    Why doesn't a cross-organization group form to make something viable that doesn't require a change in the Fortran standard? IMO, this seems to be the path of least resistance, unless there are other barriers to getting something like this done and well supported (e.g. funding).

    From this conversation, the "units as a standard" request is coming from and supported by NASA and DoD contractors but doesn't appear to have support beyond these kinds of groups. While I recognize tracking units correctly in code and in file IO is important and currently error prone, perhaps it's time we rethink our software development strategy, rather than modifying a standard that impacts a broader community.

If units checking were added to the Fortran standard as proposed in WG5 N2113, and implemented in processors, it would have no impact on existing codes, and no impact on codes that choose not to exploit it.

certik commented 3 years ago

@vansnyder my understanding is that the main feedback from WG5 is to implement this in a Fortran compiler as a prototype (which has never been done yet).

What has been done is to prototype this using derived types and operators, e.g., by Grant Petty. I have found the following references to his work:

vansnyder commented 3 years ago

On Wed, 2021-07-07 at 14:47 -0700, Ondřej Čertík wrote:

@vansnyder my understanding is that the main feedback from WG5 is to implement this in a Fortran compiler as a prototype.

What has been done is to prototype this using derived types and operators, e.g., by Grant Petty. I have found the following references to his work:

https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.401 (pdf) The paper provides a link: http://meso.aos.wisc.edu/~gpetty/physunits.tar.gz to the module and sample programs (the link does not work anymore, but I linked it via the wayback machine which seems to work)

We already have experience with Grant Petty's package. It doesn't check exponents of base units at compile time. It can't do anything about different scalings of the same derived unit, such as momentum. It requires more labor. It imposes run-time penalties.

I realize that the proposal in N2113 would require more development work within the compiler. I suspect that at least some of it would be handled by an extension of the work that was required to allow more than one kind type parameter for a variable. In many ways, the UNIT specification acts like a kind type parameter. Tests for structural compatibility would be entirely new. Calculating units by combining factors within a term would be entirely new. Creating conversion functions would be entirely new. I/O would be entirely new. Even with all this entirely new work, I suspect the effort would be substantially less than the work required for coarrays or C interop.

certik commented 3 years ago

I realize that the proposal in N2113 would require more development work within the compiler. I suspect that at least some of it would be handled by an extension of the work that was required to allow more than one kind type parameter for a variable. In many ways, the UNIT specification acts like a kind type parameter. Tests for structural compatibility would be entirely new. Calculating units by combining factors within a term would be entirely new. Creating conversion functions would be entirely new. I/O would be entirely new. Even with all this entirely new work, I suspect the effort would be substantially less than the work required for coarrays or C interop.

Yes. It is my understanding that WG5 recommends implementing it in the compiler as the next step.

Yes, I think it would not be that difficult, and I know at least @klausler and myself would be interested in attempting it. Hopefully more people would help.

magicmouse commented 3 years ago

Using Van Snyder's proposal as a starting point, i added Units of Measure to my Beads language. The syntax allows a new datatype, meas, which has the normal engineering unit families of angle, mass, length, area, pressure, energy, etc. In each unit family there are various units, each with either a constant scaling factor that relates a unit like kilometer to the meter, or a function that converts to and from the base unit. The user can create their own unit families, with a unique set of fundamental primitives like mass, length, etc., and exponents are limited to ratios of integers. During compile time and runtime the measurement internally consists of a record containing a magnitude, the unit family it belongs to, and the current array of exponents in a canonical form. As you multiply quantities the exponents add. The user can thus conveniently write 3 kg + 2 lb + 3 g, and it will automatically do the units conversions. 3 kg * 4 m / 2 sec would remember the exponents. As there are only a few operators, it is not a big effort to carry these units around, and one can always use scalars in large matrices so as to not burden the system.
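The automatic conversion in an expression such as 3 kg + 2 lb + 3 g can be sketched as follows. This is Python, not Beads syntax. The `Meas` class, the `UNITS` table, and the method names are assumptions made for illustration, and the exponent array described above is omitted to keep the example focused on scale factors.

```python
from dataclasses import dataclass

# Assumed table: unit name -> (family, scale factor to the family's base unit).
# 1 lb = 0.45359237 kg exactly (international avoirdupois pound).
UNITS = {
    "kg": ("mass", 1.0),
    "g": ("mass", 0.001),
    "lb": ("mass", 0.45359237),
    "m": ("length", 1.0),
    "km": ("length", 1000.0),
}


@dataclass(frozen=True)
class Meas:
    magnitude: float  # always stored in the family's base unit
    family: str

    @classmethod
    def of(cls, value, unit):
        """Build a measurement from a value expressed in `unit`."""
        family, scale = UNITS[unit]
        return cls(value * scale, family)

    def __add__(self, other):
        # Only measurements from the same family may be added.
        if self.family != other.family:
            raise TypeError(f"cannot add {self.family} and {other.family}")
        return Meas(self.magnitude + other.magnitude, self.family)

    def to(self, unit):
        """Return the magnitude expressed in `unit`."""
        family, scale = UNITS[unit]
        if family != self.family:
            raise TypeError(f"{unit} is not a unit of {self.family}")
        return self.magnitude / scale


total = Meas.of(3, "kg") + Meas.of(2, "lb") + Meas.of(3, "g")
total_kg = total.to("kg")  # about 3.91018474
```

Storing every magnitude in the base unit means the conversion cost is paid once when a value is created and once when it is read back out, rather than on every addition.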

Anyone claiming that we can't tolerate any extra overhead in calculations to ensure they are correct has not been paying attention to the over 1 million to 1 reduction in the cost of computation, since i was using the Univac 1108 at JPL in 1971. And let's not forget that stupid BitCoin used terawatt-hours of electricity doing useless calculations in a proof-of-waste scheme that is the greatest peacetime waste of resources in modern times.

The complexity of implementing physical units is fairly small. You have to handle operations between two meas values and between a scalar and a meas, and one needs a few conversion functions so you can get a value to and from the preferred unit. The compiler has to track quite a bit more information (the array of exponents, the unit family, etc.) as it does its type calculations, but it isn't that hard, and compared to the fancy code optimizations that exist in the back-ends of compilers, this front end work is not a difficult task.

That the FORTRAN committee ignores the brilliant work of Van Snyder, who had a long tenure at one of the crown jewels of the US high tech industry, and ignores the pioneering work of Fourier, who introduced the technique of checking units to proofread formulae in the 1800's, is inexcusable inertia.

It is faster to write a whole new language (it took me 4 years so far) than to get a committee off its rear ends. C. Northcote Parkinson wrote extensively about how committees function, and perhaps the committee is too large; i think he set the limit at 8 or 12 i can't remember.

magicmouse commented 3 years ago

And with regard to proposals such as adding refinement types: by the time you have added sufficient abstraction power to have families of related units, and tracking exponents, you have built a more complex type system than Haskell or any other Category-theory laden language. The point is not to turn FORTRAN into a super-Haskell, but to gracefully and minimally extend FORTRAN to support this vital function.

Addition of this new data type is slightly more work than adding COMPLEX numbers was; you convert a single scalar value into 2 values in complex numbers, for a unit of measurement it is a record of a few fields. Since there are only 5 operations: addition, subtraction, multiplication, division, and exponentiation to fractional powers, that is only 5 runtime functions to invoke. It's just not that hard.

certik commented 3 years ago

@magicmouse thank you for your comments. Just a reminder that we have a Code of Conduct: let's criticize the committee if you think they have not done a good job sometimes, but let's not be demeaning about it.

Regarding performance:

Anyone claiming that we can't tolerate any extra overhead in calculations to ensure they are correct has not been paying attention to the over 1 million to 1 reduction in the cost of computation, since i was using the Univac 1108 at JPL in 1971.

It is my understanding that units would not have any runtime overhead, except perhaps a very small overhead in IO (if you use them). Is that not the case?

Runtime overhead would be unacceptable for me. Yes, computers got faster, but that's beside the point: if Fortran suddenly intrinsically becomes slower for array operations, then somebody else will write a language that is faster. The way Fortran is designed is that it allows the compiler to (in theory) extract a high percentage of the available performance; there are no intrinsic features that prevent performance. Well, there is the new reallocate LHS feature that possibly slows things down (and I am against it for the same reason, but that's another discussion). The point is: just the fact that hardware is fast compared to the past does not mean that we can waste cycles: if that was true, then everybody would be using Python for HPC. I love Python, and use it precisely because it is easy. It's often at least 200 times slower than Fortran, and yet computers have got 200 times faster since the first time I used Python: one can definitely use Python now and be faster than any Fortran code 30 years ago --- and yet people including myself still want to use Fortran today, not Python, for high performance computing. As possibly the single core performance will stop getting faster in the next 10 years or so, I think performance of compilers will become even more important.

vansnyder commented 3 years ago

On Thu, 2021-07-08 at 06:26 -0700, Ondřej Čertík wrote:

@magicmouse thank you for your comments. Just a reminder that we have a Code of Conduct: let's criticize the committee if you think they have not done a good job sometimes, but let's not be demeaning about it.

Regarding performance:

Anyone claiming that we can't tolerate any extra overhead in calculations to ensure they are correct has not been paying attention to the over 1 million to 1 reduction in the cost of computation, since i was using the Univac 1108 at JPL in 1971.

It is my understanding that units would not have any runtime overhead, except perhaps a very small overhead in IO (if you use them). Is that not the case?

That was the intent of my design. The only runtime overhead is where there is an explicit conversion between scales of the same base (or derived) unit. This is done by a function reference. The function is created automatically by the unit declaration. If your program deals with different scales -- Newton-seconds as opposed to pound-seconds, for example -- you would need to write and invoke these functions anyway.
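The kind of explicit scale-conversion function described above can be sketched as a pair of generated converters. This is an illustration in Python, not the N2113 mechanism or its syntax. The helper `make_converters` and the function names are hypothetical. The only fact relied on is that one pound-force is 0.45359237 kg times 9.80665 m/s², so one pound-second of impulse is about 4.4482216152605 newton-seconds.

```python
# One pound-force second expressed in newton-seconds
# (pound mass in kg times standard gravity in m/s^2).
LBF_S_IN_N_S = 0.45359237 * 9.80665


def make_converters(scale):
    """Return (to_base, from_base) for a unit equal to `scale` base units.

    Stands in for the functions a unit declaration would create
    automatically; here they are built by hand.
    """
    def to_base(x):
        return x * scale

    def from_base(x):
        return x / scale

    return to_base, from_base


to_newton_seconds, to_pound_seconds = make_converters(LBF_S_IN_N_S)

# The conversion is visible at the call site, and it is the only
# place where any run-time work is done.
impulse_n_s = to_newton_seconds(2.5)  # 2.5 lbf*s expressed in N*s
```

A program that mixes the two scales has to perform this multiplication somewhere in any case, so generating the function adds no cost beyond what hand-written conversion code would incur.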

Runtime overhead would be unacceptable for me.

For me, at least in the last project before I retired, it would be unacceptable too. The project has a small (by today's standards) computer, consisting of 384 cores. It is analyzing data from an Earth-observing instrument that returns 500 million measurements of microwave thermal emission from the atmosphere every day. By inverting the radiative-transfer equation using a Newton iteration, it produces about five million measurements of temperature, humidity, and about fifteen minor constituents of the atmosphere -- such as ozone -- on 3500 profiles at 72 pressure levels between 8 and 80 kilometers altitude, every day. The instrument scans only in the orbit plane, so we get a view of a 2-D slice of the atmosphere (http://mls.jpl.nasa.gov). The next proposed instrument would scan like a TV, providing a 3-D view of the atmosphere. It would return 400 times as much data. Making the program 3-10 times slower by computing things at runtime that could be computed at compile time would not be acceptable.
