lehins / massiv

Efficient Haskell Arrays featuring Parallel computation
BSD 3-Clause "New" or "Revised" License
384 stars 23 forks source link

iZipWith4 #105

Closed adamConnerSax closed 4 years ago

adamConnerSax commented 4 years ago

I've come across a need for zipWith4. I'm happy to copy/paste iZipWith3 and add a dimension, etc. But it involves hidden modules and then I'm not sure what to do. Any plans to add or quick way I can write it without the hidden modules?

Thanks!

lehins commented 4 years ago

You can define larger zips with existing functions, no need to drop to the dangerous internals, eg:

zipWith4 f a1 a2 a3 a4 = zipWith (\(e1, e2) (e3, e4) ->  f e1 e2 e3 e4) (zip a1 a2) (zip a3 a4)

That said, I'll add a set of function for four arrays in the next release: d47d30c88768d52752937640d827c94ea10d1863

adamConnerSax commented 4 years ago

Thanks! Will the above version of zipWith4 be comparably fast because zipping operates via Delayed arrays?

This is my first time using Massiv and one tricky bit is trying to understand how the different array representations, particularly Delayed vs Manifest, matter when doing various things. I had sort of assumed that I should want everything (the results of various functions) to be polymorphic in representation or, if I had to choose, Delayed. But then the indexing and slicing operations keep requiring things to be Manifest. And I can't tell if that means that I am doing something inefficiently.

I am really enjoying the library! Even the confusing bits are interesting to think about.

lehins commented 4 years ago

Will the above version of zipWith4 be comparably fast because zipping operates via Delayed arrays?

Yes precisely. Delayed arrays allow to fuse computation together and ghc is pretty good at eliminating intermediate tuples.

particularly Delayed vs Manifest

If I had to explain the difference in a succinct way: delayed array is just a function that describes how a manifest array can be created.

Rule of thumb is: if an array is used as input to more than one function, it should not be delayed, because it will cause it to be evaluated more than once.

the indexing and slicing operations keep requiring things to be Manifest

Indexing into delayed array is prevented on purpose, because it doesn't exist in memory yet. There are escape hatched of course like evaluate' and unsafeIndex, but I would only recommend using them when you understand the implications. Slicing on the other hand works fine on D representation, because slicing operation is just index manipulation

I can't tell if that means that I am doing something inefficiently.

If it doesn't compile, it's likely that you are trying to do something inefficiently, but the type system is guarding you from making a mistake :slightly_smiling_face:

I am really enjoying the library!

Really glad to hear that! :smiley:

adamConnerSax commented 4 years ago

Thanks again! If I may, one more question: I frequently have a situation where I want to treat a matrix (so Array r Ix2 e) as a Vector of Vectors, so Array r Ix1 (Array r Ix1 e). Is there an idiomatic way to do this? I can do a combination of makeArray and outerSlice but is there something more direct or preferable?

Here's my current try:

asVectors :: (MA.Load r MA.Ix2 e
              , MA.Construct r MA.Ix1 (MA.Vector (MA.R r) e)
              , MA.OuterSlice r MA.Ix2 e
              )
           => MA.Matrix r e -> MA.Vector r (MA.Vector (MA.R r) e)
asVectors m =
  let MA.Sz2 r _ = MA.size m
  in MA.makeArray MA.Seq (MA.Sz1 r) $ \r -> (m MA.!> r)

-- The issue here is that the vectors might not all be the same size.  So we give a size and raise and exception (??)
-- if we're wrong
asMatrix :: (MA.Manifest r MA.Ix1 (MA.Array r' MA.Ix1 e)
            , MA.OuterSlice r' MA.Ix1 e
            ) => Int -> MA.Vector r (MA.Vector r' e) -> MA.Matrix MA.D e 
asMatrix cols vs = MA.expandWithin MA.Dim1 (MA.Sz1 cols) (\v j -> v MA.!> j) vs

Which is fine, maybe? But I can see here that I am forced to have that Manifest constraint and I wonder if that is a sign that there is a better way?

lehins commented 4 years ago

Is there an idiomatic way to do this?

Not really, array of arrays is the only way to describe ragged arrays for now in massiv. On the other hand, if your inner vectors are all of the same size, it is more efficient to keep them in a 2D matrix. Slicing a matrix to get an array of rows though is the right approach here.

I can do a combination of makeArray and outerSlice but is there something more direct or preferable?

Both function you implemented are absolutely correct and idiomatic. It is a second day in a row that I was asked about asVectors sort of functionality, which means it is time to add it to massiv. With regards to asMatrix there is a better way to do it, but it involves usage of unsafe functions, so currently there is no functionality for it. But again, this was also the functionality that @ocramz was talking to me yesterday about on gitter, so it seems like I need to start thinking on a general implementation for it and adding it to massiv API.

You can always catch me on gitter if you have questions: https://gitter.im/haskell-massiv/Lobby I prefer to keep issue tracker for issues. See my last message that has both of asVectors and asMatrix like functionality in it.

lehins commented 4 years ago

@adamConnerSax Totally forgot, you can implement asMatrix with concatM function

adamConnerSax commented 4 years ago

Got it! Didn't realize that was the better way. Will go there with questions from now on.