Open · FluxusMagna opened this issue 4 years ago
I'm always reluctant to add more nondeterministic constructs. I guess this one could be OK, especially if we can show that the cost of `replicate`ing an array is significant for some applications.
I prefer the name `permute`, because this operation really is a permutation of the values of the `vs` array.
I think how much it affects performance will depend on the hardware. FFTs typically need some form of permutation on each iteration, and since the individual calculations are fairly cheap, writing to memory appears to be a somewhat limiting factor, at least on some hardware.
It's only nondeterministic for duplicate and out-of-bound indices, and `scatter` would also be nondeterministic in the former case (though in a different way), so a program that is deterministic using `scatter` should be deterministic using `permute`, unless the program has duplicate index-value pairs or out-of-bound indices.
For FFTs specifically, this can also be addressed by manual reuse of arrays.
I hadn't considered that; it still seems like a useful function, though.
Unless the compiler somehow eliminates it, manual reuse of arrays should result in one unnecessary copy per loop, since the output array is still written to in the beginning. An improvement over no reuse for sure, but `permute` would still improve this. Personally I also find `iterate` to be 'cleaner' than `loop` in most cases.
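To make that concrete, here is a minimal sketch of the manual-reuse approach (sticking to the i32 sizes and indices used in this thread, with the actual per-element work replaced by a trivial `map`, and with placeholder names):

```futhark
let steps [n] (k: i32) (is: [n]i32) (xs: [n]f32) : [n]f32 =
  -- The copy is the one unnecessary write: scatter needs a unique
  -- destination, even though every element of it gets overwritten.
  let buf = copy xs
  in loop buf for _i < k do
       -- Materialise the new values first, then scatter them back
       -- into the same buffer, which scatter consumes.
       let vs = map (\x -> 2 * x) buf
       in scatter buf is vs
```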
How about creating a safe `permute`?
```futhark
let permute 't [n] (is: [n]i32) (vs: [n]t) : [n]t =
  -- Scatter into uninitialised memory ('scratch' is the hypothetical construct).
  let permutation = scatter (scratch n) is vs
  -- 'full' holds exactly when every position was written, i.e. 'is' is a permutation.
  let full = scatter (replicate n false) is (replicate n true)
             |> reduce_comm (\a b -> a && b) true
  in assert full permutation
```
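For example (a hypothetical use, keeping the i32 sizes and indices from above), reversing an array is just a permutation of its values:

```futhark
-- Reverse by permuting: element i of vs ends up at position n-1-i.
let reverse' [n] (vs: [n]f32) : [n]f32 =
  permute (map (\i -> n - 1 - i) (iota n)) vs
```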
The performance would be worse than `scatter` for most cases, but it would be safer, since it's entirely deterministic, and with `#[unsafe]` the performance would improve.
The `replicate n false` there is expensive enough to offset the advantage of the hypothetical `scratch`. That doesn't mean a safe `permute` is not useful, but there is no need to add `scratch` just to implement it. The `scratch` construct (or something similar) would only make sense for fundamentally unsafe operations, because the cost of safety checking is always going to equal the cost of initialisation.
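To illustrate, the same safe `permute` can be written without `scratch` at all; a sketch (the destination here is just a copy of `vs`, which costs no more to produce than the flag array the check builds anyway):

```futhark
let permute_no_scratch 't [n] (is: [n]i32) (vs: [n]t) : [n]t =
  -- Use a copy of vs as the destination; if is really is a permutation,
  -- every element is overwritten, so the initial contents never matter.
  let permutation = scatter (copy vs) is vs
  -- The flag array costs about as much to construct as the destination did.
  let full = scatter (replicate n false) is (replicate n true)
             |> reduce_comm (\a b -> a && b) true
  in assert full permutation
```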
I realize that it would normally be slower than a normal `scatter`, but wouldn't it skip the `assert`, and thus the safety check, with `#[unsafe]`? That way, without `#[unsafe]`, it would only permit unique indices, which would be useful for validation, and with `#[unsafe]` it would be faster than initializing an array and `scatter`ing to it. The `#[unsafe]` variant would be truly unsafe, and would benefit from a `scratch` construct. Perhaps I am misunderstanding how `assert` works with `#[unsafe]`.
Yeah, I guess with `#[unsafe]` dead code elimination would remove the inefficient `replicate`s. That might not be a bad implementation.
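For what it's worth, a sketch of how that might look at a call site, assuming the attribute propagates into the inlined call as discussed above (`permute` being the definition sketched earlier):

```futhark
let apply [n] (is: [n]i32) (vs: [n]f32) : [n]f32 =
  -- Without the attribute the assert (and its flag array) runs; with it,
  -- the check should be elided and the replicates removed as dead code.
  #[unsafe] permute is vs
```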
Sometimes when using `scatter` the programmer knows that all indices are scattered to, so no values from the initial array remain afterwards. The compiler does not know this, however, and the code will still generate whatever array is specified as the initial value. This is somewhat wasteful, and performance could be improved by scattering to memory with unspecified values, as given by the `scratch` construct present in the IR.

I think for most cases where the programmer knows that each index is included, there is exactly one instance of each index. A function implementing this could then have a type like:
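```futhark
-- A plausible signature for it -- this is the shape 'permute' has
-- elsewhere in this thread: as many indices as values, a result of
-- the same size, and no destination array to supply.
val permute 't [n] (is: [n]i32) (vs: [n]t) : [n]t
```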
The danger with this is of course that if not all indices are written to, whatever was in memory previously will be left, making it nondeterministic.
`scatter` is however already nondeterministic for duplicate indices, so I think this is a relatively small issue.