owlbarn / owl

Owl - OCaml Scientific Computing @ https://ocaml.xyz
MIT License
1.22k stars 124 forks source link

OCaml 5.2 support #657

Closed kit-ty-kate closed 1 month ago

kit-ty-kate commented 9 months ago

OCaml 5.2 added support for float16 bigarrays and owl-base fails with:

#=== ERROR while compiling owl-base.1.1 =======================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-variants.5.2.0+trunk | file:///home/opam/opam-repository
# path                 ~/.opam/5.2/.opam-switch/build/owl-base.1.1
# command              ~/.opam/5.2/bin/dune build -p owl-base -j 1
# exit-code            1
# env-file             ~/.opam/log/owl-base-19-47ea7a.env
# output-file          ~/.opam/log/owl-base-19-47ea7a.out
### output ###
# (cd _build/default && /home/opam/.opam/5.2/bin/ocamlc.opt -w -40 -g -bin-annot -I src/base/.owl_base.objs/byte -I /home/opam/.opam/5.2/lib/ocaml/unix -no-alias-deps -o src/base/.owl_base.objs/byte/owl_utils_ndarray.cmo -c -impl src/base/owl_utils_ndarray.ml)
# File "src/base/misc/owl_utils_ndarray.ml", lines 9-22, characters 56-87:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# 
# File "src/base/misc/owl_utils_ndarray.ml", lines 26-41, characters 56-86:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# (cd _build/default && /home/opam/.opam/5.2/bin/ocamlc.opt -w -40 -g -bin-annot -I src/base/.owl_base.objs/byte -I /home/opam/.opam/5.2/lib/ocaml/unix -intf-suffix .ml -no-alias-deps -o src/base/.owl_base.objs/byte/owl_const.cmo -c -impl src/base/owl_const.ml)
# File "src/base/core/owl_const.ml", lines 48-61, characters 40-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# 
# File "src/base/core/owl_const.ml", lines 64-77, characters 39-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# 
# File "src/base/core/owl_const.ml", lines 80-93, characters 43-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# (cd _build/default && /home/opam/.opam/5.2/bin/ocamlopt.opt -w -40 -g -I src/base/.owl_base.objs/byte -I src/base/.owl_base.objs/native -I /home/opam/.opam/5.2/lib/ocaml/unix -intf-suffix .ml -no-alias-deps -o src/base/.owl_base.objs/native/owl_const.cmx -c -impl src/base/owl_const.ml)
# File "src/base/core/owl_const.ml", lines 48-61, characters 40-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# 
# File "src/base/core/owl_const.ml", lines 64-77, characters 39-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
# 
# File "src/base/core/owl_const.ml", lines 80-93, characters 43-28:
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# Float16
tmcgilchrist commented 6 months ago

In addition to the missing Float16 pattern matches the C code has warnings from the K&R C style function declarations and incompatible pointer types.

https://github.com/tmcgilchrist/owl/tree/ocaml_5_2 has basic fix for pattern matches in ef69ebe19f7e3b5ff1e51453999171e2d4b32ced and silenced K&R C warnings in c3a3582e1c728b225950c253816ec7c8bfee0fcf

Haven't looked into the last warnings about incompatible pointer types.

opam exec -- dune build @all
In file included from src/owl/core/owl_ndarray_contract_stub.c:17:
src/owl/core/owl_ndarray_contract_impl.h:88:9: warning: incompatible pointer types assigning to 'int64_t *' (aka 'long long *') from 'intnat[]' (aka 'long[]') [-Wincompatible-pointer-types]
  cp->n = X->dim;
        ^ ~~~~~~
In file included from src/owl/core/owl_ndarray_contract_stub.c:26:
src/owl/core/owl_ndarray_contract_impl.h:88:9: warning: incompatible pointer types assigning to 'int64_t *' (aka 'long long *') from 'intnat[]' (aka 'long[]') [-Wincompatible-pointer-types]
  cp->n = X->dim;
        ^ ~~~~~~
In file included from src/owl/core/owl_ndarray_contract_stub.c:35:
src/owl/core/owl_ndarray_contract_impl.h:88:9: warning: incompatible pointer types assigning to 'int64_t *' (aka 'long long *') from 'intnat[]' (aka 'long[]') [-Wincompatible-pointer-types]
  cp->n = X->dim;
        ^ ~~~~~~
In file included from src/owl/core/owl_ndarray_contract_stub.c:44:
src/owl/core/owl_ndarray_contract_impl.h:88:9: warning: incompatible pointer types assigning to 'int64_t *' (aka 'long long *') from 'intnat[]' (aka 'long[]') [-Wincompatible-pointer-types]
  cp->n = X->dim;
        ^ ~~~~~~
4 warnings generated.

I expect that proper Float16 support will need more work.

mtk2xfugaku commented 6 months ago

Most scientific computing libraries support higher precision floating point operations (floats and doubles), to support float16 or bfloat16 or more exotic float8 the common BLAS functions has to be implemented with reduced precision. To support float16 now in Owl, the linear algebra operation can be done in higher precision (float32) by converting float16 array buffer to float32 array buffer and then stored back again in float16 array buffer.

Raising precision would cause some overhead but not that much, also accumulation in higher precision (float32) is also more stable. This is what ml_types library does for accumulation.

tmcgilchrist commented 6 months ago

@mtk2xfugaku thank you for the details, that makes sense. I'm unlikely to have time to do that work, my main goal was to get it compiling for Sandmark GC regression tests.

patrick-nicodemus commented 5 months ago

@tmcgilchrist I cloned your branch and ran the tests and everything passes. Based on your experience, what else do you feel needs to be done before this can be merged?

I expect that proper Float16 support will need more work.

In your opinion could this be left to a future project, or is this a necessary prerequisite before users of OCaml 5.2 can reliably use Owl?

jzstark commented 1 month ago

This compilation issue is fixed in the current Owl.