Closed dacarnazzola closed 1 year ago
As for guarding against
u == 0
, why not just do something like this:
For guarding against u==0
, I usually do
call random_number(u)
u = 1 - u
because random_number
will return a value in the range 0<=u <1
As for guarding against
u == 0
, why not just do something like this:u = max(tiny(u), u)
which is much simpler than the doubly-nested
do
loop.
That is indeed much simpler; performance should be tested. It may be significantly faster too!
As far as not using assumed size arrays, that is a matter of preference and if this project does not want to use them, then I would agree impure elemental subroutine
at least meets the need of reducing code duplication.
I have went ahead and implemented u(1) = 1.0 - u(1)
instead of the while loop to keep u > 0.0
. On my machine, this measured ever so slightly faster than u = max(tiny(1.0), u)
for a large array. I suspect for a size of 2 there will be no measurable difference.
The procedure is also now impure elemental subroutine
as requested. I wasn't sure what you guys like to do with dead code, so I just commented out the unused module function
s from nf_random
, but left the actual implementations in the submodule alone.
EDIT: Turns out this approach runs rather slower than the previous method on my machine when compiled with gfortran
. Using ifort
brings it in line with the other methods, but this is a common occurrence I would say. Assumed size arrays seem to generate the fastest code in a lot of cases based on my experience.
Great, thank you! You can remove the old functions from the code since the new subroutine will be used instead and we have git to keep history.
Great @dacarnazzola let's just rename randn_ele
to random_normal
(I don't know why I called it randn
it was a long time ago, maybe after np.randn
), and this is good to go.
Alright, all fair points. I have updated the fix to only provide one value at a time, keeping no internal state. In response to a couple of questions/comments
u(1)
needs to be /= 0.0
because log(0.0)
is invalid, but cos(0.0)
is not.impure elemental
should indicate that this routine is not safe to parallelize.r
when only calculating a single value.If my comment about impure elemental
is incorrect, then I wonder why there was ever a requirement on elemental
to be pure
, which was later lifted to include procedures marked impure
. I will note that in general, I have never seen a procedure that compiled to anything better, or executed faster, after being marked pure
or elemental
, but I would absolutely love to be shown a counter example.
Great @dacarnazzola let's just rename
randn_ele
torandom_normal
(I don't know why I called itrandn
it was a long time ago, maybe afternp.randn
), and this is good to go.
Do you want the interface
renamed to random_normal
, just randn_ele
, or both?
I will note that in general, I have never seen a procedure that compiled to anything better, or executed faster, after being marked pure or elemental, but I would absolutely love to be shown a counter example.
Same here, but that may change as compilers and hardware improve.
Do you want the interface renamed to random_normal, just randn_ele, or both?
Since we now have just one specific procedure, I'd remove the interface altogether.
Do you want the interface renamed to random_normal, just randn_ele, or both?
Since we now have just one specific procedure, I'd remove the interface altogether.
This has been done. I don't have a lot of experience with submodules, but the compiler wasn't liking the main nf_random
module not having an interface
block. Since other modules seem ok without a corresponding submodule, I have removed nf_random_submodule
.
This fix also allows a single
randn_vec
subroutine to be used for any array rank, as well as improves performance ~3-10x. See example program below for that: