Open StephenVavasis opened 10 years ago
I think a good fourth option is features like @simd
that allow certain assumptions within a block of code. Future features like this could also automatically insert checks for overlap. Since our arrays carry size information they can be checked for overlap at runtime, allowing a "trust but verify" approach.
Does LLVM provide infrastructure for making optimizations that take into account that memory is not aliased?
Yes, it has various support for alias analysis and the related metadata. @ArchRobison began using it in our codegen. We already have a lot of alias information available; for example two julia objects (except the data part of arrays) never overlap. So the situation is not quite like C, where almost anything can alias anything else.
Interesting. Your idea to make this in a way like the @simd
(or @inbounds
) macro sounds good.
Although its not all that clear how much one would gain by this feature.
Also worth noting that this is not just a matter for the compiler. In the case of matmul(A,B,C)
, an implementation might have some behavior the compiler should obey if the arguments alias each other, but the author of the function still might not intend for it to be called that way. In that case an error check is warranted, so it would make sense to have a @noalias (A,B,C) begin ... end
that both does the check and uses the information in the compiler.
Possibly awful idea... Allow dispatch on the same objects as arguments via matmul(A,B,A)
?
Indeed. If I understand it correctly this is about optimizing "scalar" code that can use fewer instructions if certain registers don't have to be backup. In practice, efficient caching might have a much larger impact on performance. Better not talking about all that temporaries we generate on vectorized expressions...
I am embarrassed that I just realized that my one and only Julia project so far, namely a library of containers based on balanced trees submitted for possible inclusion in DataStructures.jl, has an undocumented no-aliasing assumption in the merge!(A,B), union!(A,B) and setdiff!(A,B) routines (i.e., the routines may fail if called as merge!(A,A), union!(A,A) or setdiff!(A,A)). Obviously, I will document the no-aliasing assumption now that I realized that I had unwittingly made it. But whatever you guys decide about aliasing, I would like there to be a sentence or two about function argument aliasing in the next edition of the manual so that at least readers of the manual can start thinking about it. For example, you could create Jeff's @noaliasing macro as a no-op placeholder for now and mention in the manual that future versions of Julia will both enforce it and take advantage of it. Does setdiff!(A,B) work correctly if A==B and both are IntSets? I should look at the source code to figure this out.
An @noaliasing
macro could be useful for enforcing constraints, though then it must not be a no-op. It must do the checking. Even an explicit check and knowledge about Julia arrays might be enough. E.g., given:
a===b && oops("ouch")
...
we could have a custom LLVM pass synthesize the LLVM noalias metadata on the loads and stores involving a
and b
inside ...
.
However, I wouldn't get wound up over noalias for optimization purposes, because the heavy hitter optimizations such as blocking matrix multiplication require more liberties to proceed:
For those reasons, I'd rather have an unordered-for loop construct, so I can clearly convey to compiler and maintainer that "I really don't care about the iteration order". (Aside: note that nested unordered loops imply stronger constraints than unordered iteration over a multidimensional space.) I'm not sure how to use such with LLVM, but maybe these high level optimizations belong at a higher level anyway.
On the subject of containers, I'd prefer that the container operations handle the aliased case correctly. E.g., union!(A,A)
should be a noop. In analogous cases in C++ codes, often there is an alias check in operator= for reference counted classes, to handle the reflexive case correctly.
I pondered Stefan's idea. Alas it would seem to require many overloads to exclude the troublesome cases. For example, consider a matmul
where we want the fast version for the case where the output matrix does not alias the input matrices. There would have to be four definitions:
This also brings up an issue with designing a @noalias
notation. Typically we want to ensure that outputs are not aliased with each other or with other inputs, but we don't care whether inputs are aliased. The @noslias
notation should somehow express these constraints without over constraining.
Perhaps it could be ok to have just two definitions, one fast with no aliasing and one defensive when there is aliasing between at least two of the variables?
Oh sorry, now I realize that we are talking about overloading. Yeah, I agree. Actually, with overloading, things are kind of backwards compared to how you want it in this case: (A, B, C) is the most general case and would act as a fallback, but what we are interested in here is to have a version of it that applies only when none of the three are the same (or aliased in some other way).
Focusing too much on A === B
is not good, since arrays A
and B
can overlap without them being the same object.
I see now that I missed Jeff's qualification "(except the data part of arrays)". We'd need to provide a function for overlap testing and that a custom LLVM pass to exploit it. I was thinking about the pass some more and it seems that refining the TBAA information, not using noalias
, is the right solution. Currently all the loads/stores of the data part are marked with the same TBAA node. The custom pass could mark the loads/stores with children of that node, between the no-overlap check and anything that might repoint the data part.
Does Julia have a function that determines if two arrays overlap?
An @noaliasing macro could be useful for enforcing constraints, though then it must not be a no-op. It must do the checking.
Why would it have to do checking? I would think it would be useful for things like @dextorious' https://github.com/JuliaLang/julia/issues/21966 where you want to give the compiler the right to optimize given that you know the values will not alias. I see it as a local no aliasing scope to opt in for performance, like @inbounds
.
There's now techniques used in Base to address this, and other issues open for further improvements (e.g. https://github.com/JuliaLang/julia/issues/19658):
help?> Base.unalias
Base.unalias(dest, A)
Return either A or a copy of A in a rough effort to prevent modifications to dest from
affecting the returned object. No guarantees are provided.
Custom arrays that wrap or use fields containing arrays that might alias against other
external objects should provide a Base.dataids implementation.
This function must return an object of exactly the same type as A for performance and type
stability. Mutable custom arrays for which copy(A) is not typeof(A) should provide a
Base.unaliascopy implementation.
See also Base.mightalias.
help?> Base.mightalias
Base.mightalias(A::AbstractArray, B::AbstractArray)
Perform a conservative test to check if arrays A and B might share the same memory.
By default, this simply checks if either of the arrays reference the same memory regions, as
identified by their Base.dataids.
IIUC Julia doesn't have great support for aliasing safety. Nor do we have a consistent norm for how those tools should be used. I'd love to see more improvement here.
IIUC Rust deals with this quite well, bu it would be very difficult for us to adopt their system.
This is a long-term issue that has arisen out of my recent discussion with Jameson Nash on the julia users group: I would like to suggest that the core Julia developers come up with a strategy or roadmap for dealing with aliasing.
In more detail, consider a code for matrix multiplication C=A*B with the calling format matmul(A,B,C) that is implemented in the old-fashioned form of three nested loops. It is well known by now that performance boosts of huge factors-- 10 or more -- are possible by reordering the loops, blocking them, etc. Indeed, this was the whole impetus behind the LAPACK project of the 1990s. Furthermore, many papers in the compiler community explain how a compiler can automatically transform three nested loops into high-performance code.
However, if the compiler does not know that the user will never invoke the routine as matmul(A,B,A), then it cannot carry out most of these transformations because most of them would change the answer in the (unexpected) case that C and A are identical.
Currently, the Julia manual says nothing at all about aliasing among function arguments.
For this and probably other examples, it could be useful for the compiler to know that the arguments to a function will not be aliased. What is Julia's strategy in this regard? I can think of a few possibilities. As I am not a compiler person myself, I don't have a strong preference for which possibility should be followed, but I think there should be a strategy.