janestreet / ocaml_intrinsics

Provides functions to invoke amd64 instructions (such as clz,popcnt,rdtsc,rdpmc) when available, or compatible software implementation on other targets.
MIT License
30 stars 12 forks source link

Supporting other Tier-1 architectures #9

Open tmcgilchrist opened 8 months ago

tmcgilchrist commented 8 months ago

First some context, I am using this package via tracing in runtime_events_tools to provide Fuchsia Trace Format files. In the context of runtime_events_tools I would like to support all Tier-1 OCaml architectures with such a fundamental tool.

Tracing uses a single function from this library Ocaml_intrinsics.Perfmon.rdtsc () (via Core_unix.time_stamp_counter) to access the processor's time-stamp counter. An equivalent to this instruction is available on most architectures and is exposed with LLVMs built in __builtin_readcyclecounter() (https://godbolt.org/z/oj94PxMo1). Would a PR providing the missing architectures ppc64, RiscV and s390x be accepted?

There are some functions provided in this library that do not have equivalents on all Tier-1 architectures like CRC32 and prefetching. This causes problems when using single functions or small groups of functions from this library on non-x86_64 architectures. Has a general solution for this problem been thought of for this library? Some places return default values and others like prefetching error out on compile.

Finally the test suite is not runnable via dune runtest. Could this be replaced with a more recent version of dune expect_tests?

 $ dune runtest
Error: No rule found for test/inline_tests_runner
-> required by alias test/runtest in test/dune:8
gretay-js commented 8 months ago

First some context, I am using this package via tracing in runtime_events_tools to provide Fuchsia Trace Format files. In the context of runtime_events_tools I would like to support all Tier-1 OCaml architectures with such a fundamental tool.

Tracing uses a single function from this library Ocaml_intrinsics.Perfmon.rdtsc () (via Core_unix.time_stamp_counter) to access the processor's time-stamp counter. An equivalent to this instruction is available on most architectures and is exposed with LLVMs built in __builtin_readcyclecounter() (https://godbolt.org/z/oj94PxMo1). Would a PR providing the missing architectures ppc64, RiscV and s390x be accepted?

sure, the difficulty is that we don't test on these architectures.

There are some functions provided in this library that do not have equivalents on all Tier-1 architectures like CRC32 and prefetching. This causes problems when using single functions or small groups of functions from this library on non-x86_64 architectures. Has a general solution for this problem been thought of for this library? Some places return default values and others like prefetching error out on compile.

Our plan is to provide default operations that work correctly on all targets but may be ineffecient. Towards this end, we have already started splitting up ocaml_intrinsics into two libraries. The first one, ocaml_intrinsics_kernel is intended to build and run correctly on all targets (and can be used in base and js_of_ocaml). The other contains x86 specific instructions like rdtcs and crc32. This change should appear in the next release of janestreet libraries. If we implement something like __builtin_readcyclecounter on all targets, we can move rdtsc over to ocaml_intrinsics_kernel library. Prefetching can be replaced with a NOP on other architectures, instead of error to build. We don't yet have CRC32 implementation, but could probably just implement it in OCaml.

This approach is not ideal, because performance may be silently affected and we can't test on all targets properly. I'm interested in other suggestions.

Finally the test suite is not runnable via dune runtest. Could this be replaced with a more recent version of dune expect_tests?

I didn't expect dune runtest to work, I don't think it works for base either. @d-kalinichenko do we have any plans for supporting it?

In the meantime, I'll remove the dependency on inline_test_runner from the dune file in test directory, so that dune runtest in ocaml_intrinsics quietly does nothing (similarly to base, I think) instead of failing.

 $ dune runtest
Error: No rule found for test/inline_tests_runner
-> required by alias test/runtest in test/dune:8
dkalinichenko-js commented 8 months ago

Re dune runtest: we are working on making our tests runnable outside Jane Street, but it's planned for the v0.18 release.