Open tmcdonell opened 10 years ago
Going to Generate
definitely loses information wrt optimization, right? If the AST in the future supports "thinning" at some point, why not leave zipWith
in for early optimization passes and lower it after that?
Re: Intersection semantics.... do we have applications that depend on those? APL style toroidal expansion would allow a proper Num
instance for arrays, wouldn't it? What are the advantages of the intersection semantics? Does it help enable more transforms for Trafo?
Also, where exactly is the intersection behavior enforced? Whatever the source semantics of the Accelerate combinators it would be good if there were a point where the any conditional / intersection logic were made explicit in the AST so it could be optimized.
I don't think converting to Generate
limits the kinds of optimisations you can do. If you want to keep it around throughout fusion, then you need a special case just for it, because it is the only producer that has two source arrays. Internally, a ZipWith
skeleton still had to convert a multidimensional index in the destination array into a linear index for each of the source arrays, because they may have been different shapes. So what was implicit in the skeleton of ZipWith
is now lifted out and made explicit as part of the Generate
. If the semantics were that we required the source arrays to be the same shape, then you could do the same linear indexing trick that Map
does (and constant time zip
).
I don't know if we have applications that critically depend on intersection semantics. I just think it matches the Prelude's behaviour of zipWith
on lists. On the other hand, Repa
requires the two sour arrays to be the same extent, else error
.
Trafo explicitly inserts AST terms that compute the intersections.
ghci> A.zipWith (+) xs ys
let a0 = use (Array (Z :. 4 :. 2) [0,1,2,3,4,5,6,7]) in
let a1 = use (Array (Z :. 2 :. 4) [2,3,4,5,6,7,8,9])
in generate (intersect (shape a0) (shape a1)) (\x0 -> (a0!x0) + (a1!x0))
The simplifier does indeed work to remove redundant intersections
ghci> let xs' = A.fill (constant (Z:.4:.4)) (constant 4) :: Acc (Array DIM2 Int)
ghci> let ys' = A.enumFromN (constant (Z:.4:.4)) (constant 0) :: Acc (Array DIM2 Int)
ghci> A.zipWith (+) xs' ys'
generate
(Z :. 4) :. 4
(\x0 -> 4 + (fromIntegral (indexHead (fromIndex (Z :. shapeSize ((Z :. 4) :. 4))
(toIndex (Z :. 4) :. 4 x0)))))
(Although now that I look at this example, this could have been improved somewhat)
Re zipWith
, I think, it is good API design to stick to the semantics used in the Prelude. That's what Haskellers expect.
I personally feel like the intersection semantics of zip in the Prelude is just a source of silent failures / bugs, with the exception of the specific idiom of zip [0..]
. I always waste time replacing zip
with one that expects the same length, which doesn't seem to be worth its own package so gets duplicated in every project.
I do actually make use of zip
's standard semantics, but I think that is secondary. My main point is that semantic changes to an API that a user thinks they understand causes more problems than even a problematic API that meets the users expectations — with a weak reference to the principle of least astonishment.
Accelerate's internal AST has accrued some redundant terms in the AST over its lifetime, which might be good to consolidate.
Replicate
andSlice
are effectively moved to the scalar language viaIndexFull
andIndexSlice
respectively. There may be an argument for leaving these forms in if we can optimise them into amemcpy
, say, but typically we want them to be fused into other operations, hence the scalar varieties.Transform
is the combination ofMap
andBackpermute
. It could probably do with a better name as well.Map
is probably good to keep as it is simpler to implement, and does not require multidimensional indices. Alsounzip*
is expressed in terms ofmap
which makes it relatively easy for a backend to executeunzipN
in constant time (although this is not currently done).ZipWith
could be expressed in terms ofGenerate
, which is what the fusion transform does, but maybe there are advantages to keeping it separate? Because of intersection semantics we can't do the same linear-indexing tricks asMap
, and doesn't allow us to dozipN
in constant time. Having a pathway to support constant timezipN
might be good.