Implementation of heterogenous arrays

gadmm commented 11 months ago

I had a quick look at the code (not a full review yet).

I saw that you use Obj.magic to implement a heterogenous array using an OCaml array.

Do we know if this is sound? I am thinking about subtleties with the flat-float-array representation, but there may be other corner cases of this kind.

I was also wondering if there was work of art one could take inspiration from, or depend on, about implementing heterogenous arrays, that would have already dealt with this question.

c-cube commented 11 months ago

On Fri, 27 Oct 2023, Guillaume Munch-Maccagnoni wrote:

I saw that you use Obj.magic to implement a heterogenous array using an OCaml array.

This code is originally from Vesa, I only refactored it. However I do similar things in containers; I think it's safe as long as we only store normal values (e.g. references). The docs should definitely state not to store floats in TLS!!

Do we know if this is sound? I am thinking about subtleties with the flat-float-array representation, but there may be other corner cases of this kind.

I was also wondering if there was work of art one could take inspiration from, or depend on, about implementing heterogenous arrays, that would have already dealt with this question.

We could use a map/hashmap with universal types and so on, it's just more costly.

gadmm commented 11 months ago

This code is originally from Vesa, I only refactored it. However I do similar things in containers; I think it's safe as long as we only store normal values (e.g. references). The docs should definitely state not to store floats in TLS!!

It's not clear that there are any issues with floats, do you have something specific in mind that makes you confident there is indeed an issue?

I am not so familiar with this optimisation, here is what I understand so far. It is not certain to me yet that there is an issue, but it is worth investigating. Regarding the runtime code (e.g. C functions caml_arrayunsafe*), the magic only happens if the tag of the array is special; it is never allocated with a special tag here, so no issues. Remains to look at the codegen parts for native (probably the part that specialises the accessors according to the known array kind). I also wonder if things could go wrong after inlining the functions that call unsafe_get/set.

I could request more details from Coq devs, from whom I have heard horror stories about the combination of unsafe and the flat float array optim.

We could use a map/hashmap with universal types and so on, it's just more costly.

It would be nice to keep using arrays indeed, since performance is an essential point.

gadmm commented 11 months ago

The Coq devs point me to the following two issues if we want to dig further: https://github.com/coq/coq/issues/15070 https://github.com/coq/coq/issues/17871

The first one claims that is it enough to make sure that the array is not allocated with the special tag (done correctly here); but I still have my questions about codegen and inlining

c-cube commented 11 months ago

Fwiw I've done similar things in containers for dynamic arrays. However it's a bit different in that the array is only initialized once we have an element, and because it's homogeneous we then use this first element to know whether to allocate a float array or regular array. Never had segfaults with it.

For TLS I'd personally be fine with a runtime assertion that checks that the data is not a float (it's better to use a float ref or a record anyway! If you want to mutate it later), if it's safer.

gadmm commented 11 months ago

I agree that it is in principle different for heterogenous arrays. But I think the answer is given in the implementation of Domain.DLS:

  let set (idx, _init) x =
    let st = maybe_grow idx in
    (* [Sys.opaque_identity] ensures that flambda does not look at the type of
     * [x], which may be a [float] and conclude that the [st] is a float array.
     * We do not want OCaml's float array optimisation kicking in here. *)
    st.(idx) <- Obj.repr (Sys.opaque_identity x)

The implementation of get and set are roughly equivalent to DLS.get and DLS.set in this aspect. I'll let you double check but I think it is fine as it is. (And perhaps it deserves a comment in the code like the one above.)

c-cube / thread-local-storage

Implementation of heterogenous arrays #1