Closed milancurcic closed 3 years ago
Do you think there will be much use for passing arrays to the xmin
or xmax
arguments?
If not, I would propose a fourth implementation option:
The procedures are auto-generated for each rank and type (real, double, ...)
interface clamp
module procedure clamp_rank0
module procudure clamp_rank1
module procedure clamp_rank2
...
end interface
function clamp_rank0(x,xmin,xmax)
! ... use min max or ifthen/else version
end function
function clamp_rank1(x,xmin,xmax) result(res)
real, intent(in) :: x(:)
real, intent(in) :: xmin, xmax
real :: res(size(x))
where (x < xmin)
res = xmin
elsewhere (x > xmax)
res = xmax
elsewhere
res = x
end where
end function
The same logic applies to where/elsewhere
logic applies to higher ranks.
Concerning name I also have a personal preference for clip
. On the other hand consistency with C++, Julia, and Rust does have some weight. I would also vote in favor of stdlib_numeric
vs stdlib_math
.
Is the usefulness of a where clamp to avoid it being elemental, like in https://github.com/fortran-lang/stdlib/pull/310#issuecomment-776978642?
If yes, is it so that clamp could be passed as a procedure pointer? If yes, it makes me think--how do you do it with a generic? I thought that generic can't be used as a procedure pointer because you need to specify the procedure interface in the list of dummy arguments. Or are there other benefits to non-elemental procedures?
Do you think there will be much use for passing arrays to the xmin or xmax arguments?
I don't think much but maybe some. I used it a few times in Python for 2-d arrays to clamp only within a radius and to do a masked clamp. There are alternative ways to do that though.
But this could also be provided through specific functions that define xmin
and xmax
as arrays. So if we chose to do a generic implementation instead of elemental, then allowing xmin
and xmax
to be arrays would only double the number of specific functions.
Is the usefulness of a where clamp to avoid it being elemental, like in #310 (comment)?
No, in this context I was only playing with the idea of a generic but non-elemental interface to forbid usage such as:
res = clamp(0.1,[0.2,0.3,0.4],0.5)
which if the implementation were min(max(x, xmin), xmax)
would return the 3-by-1 vector [0.2,0.3,0.4]
.
I haven't used the clamp
function before to be able to say is it useful or not. Just throwing the concern out for discussion.
I see, that didn't occur to me. I can't think of an use for it. I agree that we should somehow forbid it or at least discourage it. I see a few solutions:
xmin
and xmax
are scalar at run-time and stop if they are notI would prefer to avoid run-time checking. This would make the procedures impure
so I don't see any benefit. Having the routine return without warning also make no sense.
This leaves us with the first and third bullet points, meaning either pure
procedures for different ranks with a generic interface, or the elemental
version (either min/max or the if-then-else versions). Actually it seems like a great experiment to learn about potential performance differences. :+1:
Am I right that according to the specified behavior positive Inf values are mapped to xmax
, while -Inf is mapped to xmin
? NaN values however are preserved?
I would prefer to avoid run-time checking. This would make the procedures impure so I don't see any benefit. Having the routine return without warning also make no sense.
Yes, I would too. We can error stop
from a pure
a procedure (F2018) but it'd still need a check which would hinder performance.
My heart prefers the elemental approach (it's one of my favorite features of Fortran) but my head prefers the generic where clamp approach (compile-time enforcing of argument shape). Let's discuss it more broadly on the upcoming call.
My heart prefers the elemental approach (it's one of my favorite features of Fortran) but my head prefers the generic where clamp approach (compile-time enforcing of argument shape). Let's discuss it more broadly on the upcoming call.
I wonder if this is something the standard should address - being able to explicitly enforce that elemental
affects only certain variables (perhaps through an attribute such as nonelemental
or scalar
).
Hello all, I was just thinking if there would be any need for a procedure that would scale the value of an input parameter into an output value defined by min/max parameters. This could be used, for example, scaling an input scalar (or array) into a scaled value describing some physical quantity, such as temperature between defined limits (say -50 ℃ and 100 ℃).
My current implementation with optional upper and lower bounds and 'hint' to the user if lower>upper (inspired by numpy). Probably error stop
would be better than NaN, but I'm not clear about performance implications.
real(pReal) pure elemental function math_clip(a, left, right)
real(pReal), intent(in) :: a
real(pReal), intent(in), optional :: left, right
math_clip = a
if (present(left)) math_clip = max(left,math_clip)
if (present(right)) math_clip = min(right,math_clip)
if (present(left) .and. present(right)) &
math_clip = merge (IEEE_value(1.0_pReal,IEEE_quiet_NaN),math_clip, left>right
end function math_clip
@art-rasa Yes, I think so, I've done this many times in the past (home cooked) and is very common in machine learning. Can you propose it in a separate issue? Ideally with an API, name suggestion, and example implementation.
After discussing the elemental vs. where-clamp on the monthly call, I'm now more convinced that the elemental approach is just fine. The thing is, a where-clamp is a not so pretty hack to constrain how the user can invoke the function, and we're assuming that no user will want to use it exactly in the way that we would be preventing. So I'd rather have a more liberal and elegant solution, and point this out to the user in the docs.
@Milan Curcic caomaco@gmail.com I concur. It is difficult to foresee use patterns and most people will simply use it with scalar bounds, I imagine. Actually,before you brought it up, I did not even think of using an array for the bounds, though now that I write this, it is simply an extension on min(a,b) and max(a,b) with both arguments an array and that is certainly something I use from time to time!
Op wo 3 mrt. 2021 om 22:29 schreef Milan Curcic notifications@github.com:
After discussion the elemental vs. where-clamp on the monthly call, I'm now more convinced that the elemental approach is just fine. The thing is, a where-clamp is a not so pretty hack to constrain how the user can invoke the function, and we're assuming that some user may not want to use it exactly in the way that we would be preventing. So I'd rather have a more liberal and elegant solution, and point this out to the user in the docs.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/fortran-lang/stdlib/issues/319#issuecomment-790067673, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN6YR344ONAOTQPFQVQXG3TB2S35ANCNFSM4XRNNGQQ .
What if user doesn't know for sure that the 2nd argument (xmin) is <= 3rd argument (xmax) but knows that this is the clip range in which the output should be? for example: x = 9 xmin= 11 xmax = 10 answer expected by user = 10 I don't have specific example of an application where user might need to do it, but it occurred to me as a possibility.
What if user doesn't know for sure that the 2nd argument (xmin) is <= 3rd argument (xmax) but knows that this is the clip range in which he/she might the output to be? for example: x = 9 xmin= 11 xmax = 10 answer expected by user = 10 I don't have specific example of an application where user might need to do it, but it occurred to me as a possibility.
I would say that this is invalid use of the function, the user would simply need to ensure that this does not happen. It's not an uncommon situation that things are forbidden (square root of a negative real number, division by zero, exceeding the range of a certain datatype, ...).
To me, the more interesting question is how to handle this. Numpy does not check, probably for performance reasons. My implementation exits with error stop
because I think that the time I spend on debugging is much more valuable then the time my computer spends on doing calculations.
OK, let's leave it on user to provide smaller number as xmin and larger number as xmax.
I liked the suggestions @MarDiehl has given above:
My current implementation with optional upper and lower bounds and 'hint' to the user if lower>upper
I think this provides user more enhanced feature in one function i.e. a user can give one of the two optional argument as None indicating that he/she doesn't want to use one of the two clips (xmin and xmax). If None is passed in both then it doesn't make any sense (we can raise a warning or return the input as the output)
@MarDiehl can you please explain what this line does?
if (present(left) .and. present(right)) & math_clip = merge (IEEE_value(1.0_pReal,IEEE_quiet_NaN),math_clip, left>right
To me, the more interesting question is how to handle this.
I thought of doing it by making a comparator function (func1) which gives -1, 0 or 1 as the output. If we multiply func1(x, xmin) with func1(x, xmax) and the answer is 1, then we know that the output will NOT be x. Otherwise output will be x. Please also let me know if you are thinking of any other implementation than this. OR we can just set xmin as min(xmin, xmax) and xmax as max(xmin, xmax)
To me, the more interesting question is how to handle this. Numpy does not check, probably for performance reasons. My implementation exits with error stop because I think that the time I spend on debugging is much more valuable then the time my computer spends on doing calculations.
I think it's important to have both--allow it to error stop
in debug mode, and don't check in release mode. Tiny numerical functions like this should be HPC-useful. We can do this with a preprocessor.
Hey!, how can I contribute now to get this issue closed?
Okay, here are my suggestions:
clip
.stdlib_math
or stdlib_numeric
are appropriate, and I have slight preference for stdlib_math
.clip
should work on integers and reals of all kinds (int8
, int16
, int32
, int64
, real32
, real64
, real128
), so there will be 7 specific functions, and one generic name. Specific functions will be generated by the fypp preprocessor. For example, take a look at stdlib_stats.fypp. Similarly, the source file for this module will be stdlib_math.fypp (instead of .f90).I would like to work on two issues at a time: this one and 17 (linspace and logspace). Is it possible to work on both at a time under Fortran's policy for GSoC students?
@Aman-Godara Feel free to submit a patch for this issue.
Intro
clamp(x, xmin, xmax)
is a function that clamps (or, clips) an input scalar or arrayx
so that the result is within given boundsxmin
andxmax
. For example:This function is common in standard libraries. I've used it occasionally in Fortran, and I use it often in Python. It provides an easy way to ensure that a variable with a finite (non-NaN, non-Inf) value stays within bounds.
Name
Python has
numpy.clip
, while C++, Julia, and Rust all haveclamp
. Soclamp
is more common. Bothclamp
andclip
are equally meaningful to me. I have slight preference forclip
because it's shorter and has a "sharper" sound to it.What inspired me to open this proposal is reading that
clamp
was stabilized in Rust 1.50.0.API
Alternative names for the bounds could be
low
andhigh
.Implementation
There are a few options that I know of, each of which produce different assembly instructions, and performance depending on optimization levels, array sizes, and whether the clamping is in a loop or over a whole array in a single call.
A min/max clamp
This is the simplest implementation I could think of:
There are two intrinsic function calls here, but maybe a good compiler will optimize one away. I don't know.
An if/then/else clamp
A branchless clamp
This avoids branching but introduces a floating-point error (~
epsilon(x)
) for some elements, even if x is not clipped.What module would this go to?
I don't think we have an appropriate existing module yet, but
stdlib_numeric
orstdlib_math
sound meaningful to me.