haskell / vector

An efficient implementation of Int-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework .
Other
360 stars 141 forks source link

Need a Boxed instance for Unboxed (to be able to unbox user data types with some boxed fields) #503

Open dcoutts opened 2 hours ago

dcoutts commented 2 hours ago

It may sound counter-intuitive, but it would be really useful for unboxing structures that contain some boxed data.

For example, the docs for As give this little example record data type:

data Foo a = Foo Int a
  deriving Show

It defines a relation between this and a tuple (for which there is an existing Unbox instance)

instance VU.IsoUnbox (Foo a) (Int,a) where
  toURepr (Foo i a) = (i,a)
  fromURepr (i,a) = Foo i a

and end up defining an instance

instance VU.Unbox a => VU.Unbox (Foo a)

But this requires the a to be unboxable too. What if I have a record member that really does need to be boxed, such as a reference, another array, a string, etc etc etc.

I should instead be able to define the relation like this

instance VU.IsoUnbox (Foo a) (Int, Boxed a) where
  toURepr (Foo i a) = (i,a)
  fromURepr (i,a) = Foo i a

where Boxed is a provided newtype that is an instance of Unbox already, and thus the tuple (Int, Boxed a) is therefore an instance of Unbox.

Obviously, the representation for a vector of Boxed is just an ordinary boxed vector!

newtype Boxed a = Boxed a

newtype instance VU.MVector s (Boxed a) = MV_Boxed (V.MVector s a)
newtype instance VU.Vector    (Boxed a) = V_Boxed  (V.Vector    a)

instance VGM.MVector VUM.MVector (Boxed a) where
  basicLength (MV_Boxed v) = VGM.basicLength v
  -- ... etc, for all the methods

instance VG.Vector VU.Vector (Boxed a) where
  basicUnsafeFreeze (MV_Boxed v) = V_Boxed <$> VG.basicUnsafeFreeze v
  -- ... etc, for all the methods

instance VU.Unbox (Boxed a)

And that's it. Now we can unbox any record, even if some fields are still boxes.

The above code type checks, it's just missing the impl for all the methods, but that's just doing the obvious boring thing. A PR would be easy.

dcoutts commented 2 hours ago

Related: it'd also be nice to have specific Unbox instances for types like ByteArray and MutableByteArray that are (morally) unlifed but boxed. This could take advantage of recent GHC's ability to have levity-polymorphic arrays (which ideally should be exposed via Primitive.Array).

And further refinements: we could also have BoxedStrict for cases where we want a vector of boxed data, but want the values in the array to be always evaluated to WHNF. Currently it's actually quite hard to reliably have vectors of boxed but WHNF data.

Shimuuar commented 1 hour ago

Yes. I would say that Unbox is misnomer. It about selecting representation of an array by element type. And it need not to be necessarily unboxed.

There's also precedent UnboxViaPrim uses primitive vector as a representation. It could be used as a template. I would gladly accept PR with such addition. Only question is naming. I think it would be nice to be explicit about strictness of underlying array.

I can see two possible names: AsLazyBoxed/AsStrictBoxed using As as reference. Or UnboxVia{Strict,Lazy}Boxed which is based on UnboxViaPrim but it's long and weird. So former it probably better