haskell / primitive

This package provides various primitive memory-related operations.
Other
114 stars 58 forks source link

Add `sizeOfType`, `alignmentOfType` #371

Closed konsumlamm closed 1 year ago

konsumlamm commented 1 year ago

Closes #344.

This adds sizeOfType and alignmentOfType "methods" to the Prim class. They're Int values, since class dictionaries apparently can't contain unboxed values, but I'm doubtful that Int# would bring any advantages anyway. I chose the OfType suffix to indicate that they're about the size/alignment of a type and that they need a type argument (via TypeApplications). I also added doc comments to encourage the use of these new values.

The PR is marked as draft, since I first want to get the new definitions reviewed, before changing uses of sizeOf/sizeOf# and alignment/alignment# throughout the package.

EDIT: As per @lehins suggestion, I added sizeOfType# and alignmentOfType# methods instead and made sizeOfType and alignmentOfType standalone values.

treeowl commented 1 year ago

When we get visible dependent quantification in terms, we'll want a function

sot :: forall a -> Prim a => Int

Let's think carefully about what the method and the function should be called so we don't get stuck.

konsumlamm commented 1 year ago

When we get visible dependent quantification in terms, we'll want a function

sot :: forall a -> Prim a => Int

I'm not sure that would really be a good idea. Even if we get forall a -> in the near future (which is not clear afaik), we'd still want to support older GHC versions that don't have this, so it would take a very long time to get this, and even longer for it to become the default.

konsumlamm commented 1 year ago

Hmm, it seems prior to GHC 8.8, deriving couldn't handle ambiguous types (https://gitlab.haskell.org/ghc/ghc/-/issues/15637)... The only workaround I can think of is writing a CPP macro for that instead.

konsumlamm commented 1 year ago

When we get visible dependent quantification in terms, we'll want a function

sot :: forall a -> Prim a => Int

I'm not sure that would really be a good idea. Even if we get forall a -> in the near future (which is not clear afaik), we'd still want to support older GHC versions that don't have this, so it would take a very long time to get this, and even longer for it to become the default.

Would it even be possible to define such a function in a typeclass or would it have to be standalone (that would already make it inferior imo)?

lehins commented 1 year ago

The most sensible solution IMHO is to add these two to the Prim type class:

  sizeOfType# :: Proxy# a -> Int#
  sizeOfType# _ = sizeOf# (dummy @a)

  alignmentOfType# :: Proxy# a -> Int#
  alignmentOfType# _ = alignment# (dummy @a)

And then they can be used to define standalone functions:

sizeOfType :: forall a. Prim a => Int 
sizeOfType = I# (sizeOfType# (proxy# :: Proxy# a))

alignmentOfType :: forall a. Prim a -> Int
alignmentOfType = I# (alignmentOfType# (proxy# :: Proxy# a))
konsumlamm commented 1 year ago

@lehins while I'm fine with your suggestion, do you think there really are cases where returning an Int# is better? It would however solve the problem of not being able to use deriving instance, which would be nice.

lehins commented 1 year ago

do you think there really are cases where returning an Int# is better

Yes. In fact there are many reasons why this is better:

  1. Easier migration from sizeOf#/alignment#
  2. There are places in primitive where we use unboxed Int# directly with sizeOf# and alignment# (see ByteArray module). If we were to produce a boxed Int instead then in all such places mainatiners and users of primitive would be forced to either manually pattern match to unwrap the boxed Int or continue using the older variants.
  3. There will be helper functions that produce boxed Int that can be used conveniently with TypeApplications, so the inconvenience of manual wrapping with I# goes away for the most common case.

If we were to place sizeOfType :: forall a. Prim a => Int inside the Prim type class:

  1. Deriving Prim would become problematic for older ghc (as you already pointed out)
  2. Creating custom Prim instances would require AllowAmbiguousTypes. That is one more extension than before and it is not exactly one of those universally loved extensions either. So, I advise against forcing it on the user. Note that you don't need that extension in order to use sizeOfType and alignmentOfType, so by keeping them outside of Prim type class we keep usage of that extension locally to Data.Primitive.Types only.
konsumlamm commented 1 year ago

Ok, so it seems using Proxy# a (or proxy a) doesn't work either for GHC <= 8.8 (see https://gitlab.haskell.org/ghc/ghc/-/issues/16293). One alternative is using Proxy instead (and possibly incur a little overhead) or to define our own Proxy# (with a phantom type parameter, however this would ofc be incompatible with the standard Proxy#) (nevermind, this would require UnliftedNewtypes). Another alternative is to keep Prim as it is and only add the standalone "functions" sizeOfType and alignmentOfType.

konsumlamm commented 1 year ago

I decided to go with Proxy now, as that's still better than undefined imo (and no more overhead). What do you think?

konsumlamm commented 1 year ago

I replaced all occurences of sizeOf#/alignment#/sizeOf/alignmentOf by sizeOfType#/alignmentOfType#/sizeOfType/alignmentOfType respectively. This massively reduced the core output (by avoiding a lot of SrcLocs and CallStacks).

chessai commented 1 year ago

I decided to go with Proxy now, as that's still better than undefined imo (and no more overhead). What do you think?

Why not Proxy#?

lehins commented 1 year ago

Why not Proxy#?

@chessai https://github.com/haskell/primitive/pull/371#issuecomment-1453494906

chessai commented 1 year ago

Why not Proxy#?

@chessai https://github.com/haskell/primitive/pull/371#issuecomment-1453494906

Thanks, I missed that.

Some thoughts: With 9.6 released we're getting close to the point that 8.8 is getting old (coming up on four years). Core libraries began to phase out GHC 7.x support around the time GHC 9 dropped. We'd want to update from Proxy to Proxy# in the future, and since we're just adding these now, introducing it as Proxy# seems better for GHCs where isn't buggy. So I'm more in favour of using CPP to use Proxy for base < 4.14 and Proxy# otherwise, with some documentation that this is the case.

lehins commented 1 year ago

I'm more in favour of using CPP to use Proxy for base < 4.14 and Proxy# otherwise

I think that having an API in a core library that differs with ghc version is a horrible idea, because it also forces users of primitive to use CPP as well. Moreover, if support for multiple versions of primitive is needed then CPP is required for both the library version and the ghc version. It's just a mess that is not worth the benefits that Proxy# brings to the table.

This is not an optimization that will result in a noticeable performance improvement. So, let's not bring to much burden on the users.

I suggest we migrate to Proxy# only when we officially drop support for GHC-8.8, which I doubt is going to happen for another few years (we barely dropped 7.10 last year)

In order to make transition slightly smoother, once we finally reach it, we could create standalone functions now with expected type signatures, that will eventually travel into the Prim type class:

sizeOfProxy# :: Prim a => Proxy# a -> Int#
sizeOfProxy# _ = sizeOfType# (Proxy @a)

alignmentOfProxy# :: Prim a => Proxy# a -> Int#
alignmentOfProxy# _ = alignmentOfType# (Proxy @a)
chessai commented 1 year ago

I think that having an API in a core library that differs with ghc version is a horrible idea, because it also forces users of primitive to use CPP as well. Moreover, if support for multiple versions of primitive is needed then CPP is required for both the library version and the ghc version. It's just a mess that is not worth the benefits that Proxy# brings to the table.

Good point, I agree now that we should just go with Proxy over Proxy#.

I suggest we migrate to Proxy# only when we officially drop support for GHC-8.8, which I doubt is going to happen for another few years (we barely dropped 7.10 last year)

I will add that the reason 7.10 was dropped so late was because core libraries were unmaintained for so long, not because it was decided that it was the best time to do so (rather that it was overdue). But migrating to Proxy# once GHC-8.8 support is dropped seems the least burden on users with wider support windows/un-upgraded compilers.

In order to make transition slightly smoother, once we finally reach it, we could create standalone functions now with expected type signatures, that will eventually travel into the Prim type class:

sizeOfProxy# :: Prim a => Proxy# a -> Int#
sizeOfProxy# _ = sizeOfType# (Proxy @a)

alignmentOfProxy# :: Prim a => Proxy# a -> Int#
alignmentOfProxy# _ = alignmentOfType# (Proxy @a)

That sounds like a good idea to me. Names seem fine too.

andrewthad commented 1 year ago

This looks good. My only request is the we change from this:

-- | A dummy value of type @a@.
dummy :: a
dummy = errorWithoutStackTrace "dummy"

to this:

-- | A dummy value of type @a@.
dummy :: a
{-# noinline dummy #-}
dummy = errorWithoutStackTrace "Data.Primitive.Types: implementation mistake"

And then if something ever goes badly wrong, the end user gets a much better clue about what is wrong.

konsumlamm commented 1 year ago

I changed the error message to Data.Primitive.Types: implementation mistake in `Prim` instance, since dummy only gets evaluated if there is a wrong implementation of a Prim instance. The NOINLINE pragma was already there.