Open everythingfunctional opened 2 years ago
@everythingfunctional ,
SIMPLE procedures, as developed in Fortran 202X, are not limited by constant expression considerations.
Please see #214, which instead proposes CONSTEXPR procedures for user-defined functions in constant expressions. There is a reason for this: there are certain difficult complications with Fortran standard semantics when it comes to allowing executable statements and constructs in constant expressions.
Consider the string%str_ = str statement in your example code: I will spare commentary, but you know there is a lot "underneath" this statement with allocation-upon-assignment that many an implementation handles only at run time. For these, what you propose is effectively a no-no.
I have surmised that improved compile-time computing in Fortran is only feasible with a rather limited set of instructions. The ability to define CONSTEXPR procedures that essentially include only constant expression statements themselves, which would enable code reuse and yield all the attendant benefits, would be a good goal for enhanced compile-time computing in Fortran 202Y.
Hence it will be helpful if you and others who see value in this proposal would also give a thumbs up for #214.
Thank you,
@FortranFan , you honed right in on the complications and saw through my attempt not to specifically mention them. However, your assumption that constant expressions must be known at compile time is not quite true. In fact, the standard says nothing about compile time. It would be perfectly reasonable for constant expressions to be computed immediately prior to program execution (i.e. at program startup). In fact there are languages which do this. I believe initializing a constant value of a type with pointer members in C++ (i.e. its components must be allocated on the heap), or even just one with a user-defined constructor, is done exactly this way.
Part of my reason for asking for this feature is specifically because I want to be able to define constants of types with allocatable components, let alone be able to have those types keep their components private. I'll admit there might be some complications involved with this that I'm not quite seeing, but as far as being able to implement such a feature, I think it's absolutely feasible within the current standard.
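To make the limitation concrete, here is a minimal sketch of how such a type has to be used today. All names (string_sketch_m, string_t, make_string, to_chars) are hypothetical and not taken from the original example, and the constructor is marked pure rather than simple only because compiler support for simple may vary. The commented-out declaration is the kind of named constant this request is about.

```fortran
module string_sketch_m
    implicit none
    private
    public :: string_t, make_string, to_chars

    ! Hypothetical string type with a private, allocatable component.
    type :: string_t
        private
        character(len=:), allocatable :: chars
    end type string_t

contains

    pure function make_string(chars) result(string)
        character(len=*), intent(in) :: chars
        type(string_t) :: string
        ! Allocation on assignment happens here.
        string%chars = chars
    end function make_string

    pure function to_chars(string) result(chars)
        type(string_t), intent(in) :: string
        character(len=:), allocatable :: chars
        chars = string%chars
    end function to_chars

end module string_sketch_m

program use_string_sketch
    use string_sketch_m, only: string_t, make_string, to_chars
    implicit none
    ! Not allowed today: make_string is a user-defined function,
    ! so this initializer is not a constant expression.
    ! type(string_t), parameter :: GREETING = make_string("Hello")
    type(string_t) :: greeting

    ! The value has to be built by an executable statement instead.
    greeting = make_string("Hello")
    print *, to_chars(greeting)
end program use_string_sketch
```

The variable greeting behaves like a constant only by convention; nothing stops later code from reassigning it.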
@everythingfunctional wrote Mar. 16, 2022 1:15 PM EDT:
your assumption that constant expressions must be known at compile time is not quite true. In fact, the standard says nothing about compile time. It would be perfectly reasonable for constant expressions to be computed immediately prior to program execution (i.e. at program startup). In fact there are languages which do this.
Note
I really think an attempt to get the standard to allow SIMPLE functions in constant expressions will not gain traction with the committee.
Poor persevering programmers preparing processors for practitioners can, perhaps, put some run-time "constant" expressions into play (initializer expressions), but perusing your published PDF produces a plethora of prohibited possibilities. A plurality of points where "scalar-int-constant-expr" appears must positively produce proper constants, per se, post parsing. Ponder real(kind=simplefunc()) :: x patiently.
real(kind=simplefunc()) :: x
I see. This actually does pose a potential problem if the expression cannot be fully evaluated at compile time. But I still contend that CONSTEXPR would restrict one to what is effectively already doable via named constants (i.e. the parameter attribute).
Perhaps I'm trying to apply this idea too broadly by utilizing an already existing aspect of the language. Perhaps what I'm really after is a new designation, something like "initialization expression". These could include the use of simple functions, and many places that require constant expressions now could be relaxed to allow for initialization expressions. The kind value for an intrinsic type would then be a place that still requires a constant expression.
real(kind=simplefunc()) :: x
In LFortran we do two passes over the AST (Abstract Syntax Tree, the result of parsing). In the first pass we figure out the types of all functions and variables (symbols), and we evaluate an expression like kind = simplefunc(). Currently such expressions can only use constants, arithmetic operations and intrinsic function calls, so they can be straightforwardly evaluated in the compiler itself, without executing any Fortran code (for example, if you call sin(x), then we call our own implementation in C++). Then in the second pass we finish compiling the bodies of all functions and resolve all variables to (now existing) symbols in the symbol table.
This approach would obviously break if simplefunc() is a user-defined function, as it would require compiling this function first and executing it while still figuring out the types of symbols.
However, I think that can be done. One can allow executing any function at compile time: as long as the function(s) can actually be compiled (all the types are known), we can execute them while still compiling other code. Perhaps we can restrict it to only allow calling user-defined functions from other modules, so we can still use the above two-pass approach for a given module, but if another module is already compiled, we can execute functions from it. For example, the Jai language allows executing any code (including the whole program!) at compile time.
There is also a security and a performance implication: right now no user code is being run at compile time, only code that is already implemented in the compiler, so as long as there are no bugs in the compiler, it is currently safe to compile the whole code. Once we allow executing any random code at compile time, all kinds of things can happen:
I may be an idiot, but how does using SIMPLE functions in initialization expressions work in a world of separate compilation? Only the interface may be known at compile time, not the implementation.
how does using SIMPLE functions in initialization expressions work in a world of separate compilation?
The expression isn't computed at compile time, it's computed at run time, prior to beginning execution of the main program.
In fact, it's a bit odd to me that compilers felt it acceptable that
real, parameter :: y = sin(1.0)
print *, y
and
real :: x, y
x = 1.0
y = sin(x)
print *, y
might produce different outputs, since one math library would be used at compile time, but a different one used at run time.
I did think of a case that a proposal will need to prevent: a named constant defined in terms of a simple function that references it. I.e.
module foo
implicit none
integer, parameter :: UH_OH = bar()
contains
simple function bar()
integer :: bar
bar = UH_OH
end function
end module
Yeah, that was the sort of thing I was thinking.
might produce different outputs, since one math library would be used at compile time, but a different one used at run time.
With optimizations enabled, as a user I would expect both to produce a compile time value, in this case identical value computed using the compile time (slow but more accurate) math library.
As a user, I'd expect the following program to always output "T", no matter what. But what are the actual chances of that? You can easily link to a math library that is different from the one used by the compiler. The choice of optimizations and link time options should not affect whether this program outputs "T" or "F". x = sin(1.0) should always be computed at program startup time, not compile time. If that's true, why can't user-defined simple functions also be used there?
program huh
implicit none
real, parameter :: x = sin(1.0)
real :: y
y = 1.0
print *, x == sin(y)
end program
Regarding your last example, one approach could be that the compiler only uses the "more accurate" math library when optimizations are enabled, thus this should still return T in most simple cases. In the standard "Debug" mode, the compiler can call the exact same math library for both compile time and runtime, thus also returning T.
On a broader issue, I never compare floating point numbers directly like this, but always with abs(x - sin(y)) < eps, in which case this will return T no matter what.
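As a small illustration of the tolerance-based comparison, the program above could be rewritten roughly as follows. The tolerance of 10 * epsilon(x) is an assumption chosen for this sketch; an appropriate eps depends on the application and on how far the two math libraries may disagree.

```fortran
program compare_with_tolerance
    implicit none
    real, parameter :: x = sin(1.0)
    ! Assumed tolerance for this sketch: a few units of machine epsilon.
    real, parameter :: eps = 10 * epsilon(x)
    real :: y

    y = 1.0
    ! Compare within a tolerance instead of testing exact equality.
    print *, abs(x - sin(y)) < eps
end program compare_with_tolerance
```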
the compiler can call the exact same math library for both compile time and runtime
The compiler doesn't know the runtime math library. I.e.
$ gfortran -c huh.f90 -o huh.o
$ ld huh.o -lmkl -lgfortran -o huh # I don't know exactly what all the necessary libraries are, but you get the idea
As a user, I'd expect the following program to always output "T", no matter what.
That expectation turns out to be impossible in the face of separate compilation and link steps.
That expectation turns out to be impossible in the face of separate compilation and link steps.
I think that's an artifact of current implementations, not a required implication of the standard. Nothing says that sin(1.0) must be evaluated at compile time, even in a constant expression. That calculation could be deferred to program startup and still be in conformance with the standard.
As @klausler said, if you want separate compilation and link steps, and as you, @everythingfunctional, showed, the runtime library can be chosen later, then it is hard to guarantee matching results other than by deferring the evaluation of sin to runtime. The problem is that, as @klausler's example above (real(kind=simplefunc()) :: x) shows, sometimes you need to evaluate the expression at compile time and cannot defer it to runtime. I do think that in most practical cases, such as your floating point example, things can be done at runtime, just not in all cases.
real(kind=merge(kind(0.d0), kind(0.), sin(1.0) < cos(1.0))) :: x
real(kind=merge(kind(0.d0), kind(0.), sin(1.0) < cos(1.0))) :: x
That one is at least definitely standards conforming. How about this one though?
real(kind=real_kinds(int(sin(1.5707963)))) :: x
Whether or not that is standards conforming depends on the quality of the implementation of the sin function used at compile time. IMHO the allowance of implementation-dependent functions in constant expressions was a mistake. To allow that mistake to get in the way of adding a feature to the language that implies deferring calculations of some constants to program startup time is a further mistake in my mind.
selected_real_kind is implementation-dependent, but I never see it used outside a constant expression that's necessary for typing at compilation time.
It is a strength of Fortran that nearly every intrinsic function, including most functions in the IEEE intrinsic modules, is available for use in constant expressions. It takes a huge amount of effort to implement them fully -- see f18's here for an example that's nearly complete -- but it gives Fortran one of its few advantages over C++.
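For readers less familiar with this idiom, the typical use of selected_real_kind looks roughly like the sketch below. The name wp and the requested precision and range are assumptions for illustration. The result is implementation-dependent, yet the compiler must know it before it can type x.

```fortran
program kind_selection_sketch
    implicit none
    ! wp is a hypothetical name; 15 digits and exponent range 307 are example requests.
    integer, parameter :: wp = selected_real_kind(15, 307)
    ! The kind must be known at compile time to give x its type.
    real(kind=wp) :: x

    x = 1.0_wp
    print *, kind(x), precision(x)
end program kind_selection_sketch
```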
selected_real_kind is implementation-dependent, but I never see it used outside a constant expression that's necessary for typing at compilation time.
It seems I keep making broad generalizations where I should be more reserved and nuanced. I do appreciate all the insights this discussion has elicited.
It seems defining what I'm after will require more effort than I initially anticipated, but I'm still convinced much of it should be technically feasible and desirable. Being able to write what I have in my initial example is still something worth working towards I think.
@everythingfunctional if you are ok with manually inlining your functions, it looks like you can already do quite a bit at compile time: https://fortran-lang.discourse.group/t/computing-at-compile-time/3044
That is a cool demonstration, but it still lacks two features I'd like.
- to be able to reuse somebody's existing library without having to manually inline all their calculations
Allowing statement functions in constant expressions and modules would cover most of the use cases for that need, and would be way easier to implement than a constexpr function. Yes, I know about statement functions being obsolescent, but they've been in the language since literally day 1 (longer than functions and subroutines) and aren't going anywhere.
- be able to define constants of derived types with private and/or allocatable components
What's wrong with using a function from the type's definition module to construct and return these, other than being unable to reference that function from an initialization expression?
other than being unable to reference that function from an initialization expression
that's exactly what I'm after
Allowing statement functions in constant expressions and modules
That's an interesting idea, but are statement functions pure or simple? If you could designate them simple, I think that could be workable.
Statement functions can be considered pure if they reference only pure functions. They're just wrappers around expressions, and contain no statements per se. They can similarly also be considered simple if they reference no variables other than their arguments. Either way, it's a trivial derived attribute.
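For reference, a statement function is just a named expression declared among the specification statements. The sketch below uses hypothetical names (circle_area, r, pi). It references only its dummy argument and a named constant, so under the reasoning above it could be considered simple. Compilers may flag the construct as obsolescent.

```fortran
program statement_function_sketch
    implicit none
    real, parameter :: pi = 3.14159265
    real :: circle_area, r

    ! Statement function: a one-line wrapper around an expression.
    ! It uses only its dummy argument r and the named constant pi.
    circle_area(r) = pi * r**2

    print *, circle_area(2.0)
end program statement_function_sketch
```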
other than being unable to reference that function from an initialization expression
that's exactly what I'm after
Initializers of variables, default initializers of components, or both? They're not exactly the same problem, and may have distinct solution options.
Both. But I think allowing it in default initialization of components requires at least thinking about the context of initialization of variables. For example
type :: foo
integer :: bar = baz()
end type
type(foo), parameter :: buzz = foo()
Ordinarily the intrinsic structure constructor has optional arguments for components with default initializers, and they can be used to initialize a named constant of that type. So by allowing simple functions to be used in default initialization of components, you're kind of forced to allow them for named constants by proxy. That, or carve out a weird exception for types whose default initializers aren't constant expressions.
I think a spec like:
An initialization expression may contain a constant expression, or simple functions with actual arguments that are themselves initialization expressions.
An initialization expression can be used to define the value of a named constant, the initial value of a saved variable or the default value of a component of a derived type.
A named constant whose value is not a constant expression may not be used in a constant expression.
That last statement is the weird bit with strange complications/implications, but it gets around the problems in real(kind=simplefunc()) :: x because the kind parameter still requires a constant expression.
There's probably still some problematic aspects I haven't considered somewhere in there, but it seems like the right direction.
Just wanted to share experiences from the C++ world involving this kind of stuff. https://youtu.be/OcyAmlTZfgg
I just wanted to add a note about implementation considerations for constexpr-like functions in Fortran.
I did a bit of research with some C++ aficionados about the implementation of constexpr. It sounds like all the implementations evaluate the function calls at compile time using an interpreter that can emulate almost all of the C++ language. There are a few remaining restrictions (related to calling non-constexpr functions and variables, allocating memory that is not freed in the same call, throwing exceptions and such).
I believe for Fortran modules, it requires placing an interpretable version of the function in the .mod module file for later use and call by code compiled elsewhere. (I don't think we can call the compiled function, as we probably don't know where it is.)
I believe implementors will say the investment is quite high, and the return on that investment is not well-motivated yet.
I think it can be done in the compiler. The question is more if users want it, as well as what are the guidelines of features that we should not put into Fortran (every new feature has a "cost", etc.).
The 202X standard will include a new attribute for procedures, simple. If a procedure is simple, it depends on and uses only its arguments, and calls only simple procedures. This means that, in theory, if a simple function's arguments are themselves constant expressions, it could be evaluated before program startup. I contend that this would be quite valuable for defining named constants of derived types whose components are private. For example, given a string type defined like
It should be possible to define something like the following