byteverse / contiguous

Typeclass for array types
Other
18 stars 8 forks source link

Explicitly mark functions as unsafe and add safe versions #49

Open oberblastmeister opened 2 years ago

oberblastmeister commented 2 years ago

I really like this library, and would like to use it in projects. One thing that is stopping me is that safe and unsafe functions are mixed together. The juxtaposition of high level and low level unsafe functions confuses me. If this library wanted to include a typeclass around primitive, it would make sense. I think that it would be better to add an unsafe prefix or put unsafe functions in a separate module, then add safe versions. What are your thoughts on this?

chessai commented 1 year ago

I'm sympathetic to this, but I'm also not sure about making such a change. This library is very similar in spirit to primitive, in the saying "users are expected to know what they're doing". At the same time, there's more toolbox safe/high-level functions here than in primitive. Another point against this change is that the safety of a function is supposed to be super obvious from its type here: If you have e.g. (making up code) get :: Int -> Array a -> a and tryGet :: Int -> Array a -> Maybe a, the understanding of the former's (get's) unsafety flows smoothly out from understanding what Array is; though this may be classic Haskeller's bias with types vs naming/documentation. I do a lot of unification in my head when reading code. Array programming mental unification is often easy enough that I can understand which variant is being used if I happen to forget based on the name. Haskell already makes array programming really annoying/syntactically noisy, so prefixing "unsafe" to everything I'll be doing often feels bad.

I think at the very minimum this should be thoroughly documented. I'm mildly +1 on making some kind of change to improve the names of things to make their safety clearer, because names are still really helpful.

konsumlamm commented 1 year ago

It should definitely be documented which functions are safe and which aren't. Currently, the only indication of that is that it uses the primitive arrays, but not everyone is familiar with primitive. And even if people are familiar with primitive, they might assume that contiguous provides a safe (as in UB free) interface around it. For example, when I see get :: Int -> Array a -> a (outside of primitive), my first assumption is that it throws an exception if the index is out of bounds, not that it results in UB.

oberblastmeister commented 1 year ago

I don't think it is reasonable at all to assume UB from a type signature. Well-typed programs should not go wrong, and should at least not cause UB. I also assume that get :: Int -> Array a -> a will throw an exception and not result in UB.

I think we can make the api safe without breaking it. If we add bounds checking to all the currently unsafe functions, the api remains the same. Then, we can add unsafe variants, but this also doesn't break the api.

chessai commented 1 year ago

Good point about UB vs Exceptions. Primitive itself doesn't work like this (just UB everywhere) but it definitely violates most Haskeller's understanding. If we want to make primitive operations more approachable (one of the very goals of this library as a unification of array APIs), then we should look at improving on this too.

So, concretely, for every relevant function, having three versions:

along with good documentation for everything so users fully understand the breadth of consequences of their choice

chessai commented 1 year ago

FWIW I've been attracted to exception types such as the two provided by the following library: https://github.com/riz0id/array-exceptions

chessai commented 1 year ago

I'm sometimes against the name X being given to the one which throws the exception and not the "totally" safe version (which i referred to as tryX), such as when i use vector. But I'll need to think about that later.

konsumlamm commented 1 year ago

Some prior art:

chessai commented 1 year ago

Some prior art:

  • vector uses ! for indexing with exception and !? for indexing with Maybe
  • Data.Sequence uses index for indexing with exception and !?/lookup for indexing with Maybe

This deserves mention, thanks. Although I'm not a fan, I don't know of better operator name alternatives.

oberblastmeister commented 1 year ago

@andrewthad any thoughts on this?

andrewthad commented 1 year ago

I much prefer having separate modules to prefixing the function names. I think the easiest backwards compatible way to do this would just be to introduce a safe module and an unsafe module and re-export appropriate functions from each.