riscv-non-isa / riscv-c-api-doc

Documentation of the RISC-V C API
https://jira.riscv.org/browse/RVG-4
Creative Commons Attribution 4.0 International

Function multi-version proposal #48

Closed BeMg closed 4 months ago

BeMg commented 1 year ago

During function multi-version dispatch, we need a method to retrieve the RISC-V hardware environment to make sure that all required extensions are available.

The problem is

From the compiler's point of view, it generates an IFUNC resolver when there is more than one implementation with the same symbol name.

Consider the following example:

__attribute__((target("default"))) int foo (int index)
{
  return index;
}

__attribute__((target("arch=rv64gc"))) int foo (int index)
{
  return index;
}

void bar() {
  foo(0);
}

The corresponding generated code will look like the following pseudocode:

bar() {
(foo.ifunc())(0);
}

.set foo.ifunc, foo.resolver

func_ptr foo.resolver() {
  if (__riscv_ifunc_select("m_a_f_d_c"))
    return ptr foo.m_a_f_d_c;
  return ptr foo.default;
}

int foo.default(int index) {
    ...
}

int foo.m_a_f_d_c(int index) {
    ...
}

The compiler-generated resolver queries the hardware for each candidate function in turn. When a candidate's requirements are fulfilled, the resolver returns the corresponding function pointer for further processing.
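To make the resolver flow concrete, here is a minimal sketch in plain C that mimics it with a cached function pointer. It is not what the compiler emits: a real IFUNC is resolved by the dynamic loader, while this version resolves lazily on the first call. The names foo_default, foo_m_a_f_d_c, foo_resolver and stub_ifunc_select are hypothetical stand-ins, and the stub simply claims that every feature is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*foo_fn)(int);

/* Two candidate implementations, mirroring foo.default and foo.m_a_f_d_c. */
static int foo_default(int index) { return index; }
static int foo_m_a_f_d_c(int index) { return index; }

/* HYPOTHETICAL stand-in for __riscv_ifunc_select.
 * It reports every non-NULL feature string as available. */
static bool stub_ifunc_select(const char *feature_str) {
    return feature_str != NULL;
}

/* Resolver: test the specialized candidate first, then fall back to default. */
static foo_fn foo_resolver(void) {
    if (stub_ifunc_select("m_a_f_d_c"))
        return foo_m_a_f_d_c;
    return foo_default;
}

/* Entry point: resolve once, cache the pointer, then call through it. */
int foo(int index) {
    static foo_fn resolved = NULL;
    if (resolved == NULL)
        resolved = foo_resolver();
    assert(resolved != NULL);
    return resolved(index);
}
```

Calling foo(0) goes through the cached pointer, which is the role foo.ifunc plays in the pseudocode above.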

In this proposal, the core of the resolver function is __riscv_ifunc_select, which must retrieve the hardware information to decide whether a specific function version can be executed.

We propose the following declaration for that function:

bool __riscv_ifunc_select(char *FeatureStr);

Where FeatureStr is a string that concatenates all target features belonging to a particular function. Its form can be described by the following BNF.

The function returns true when the hardware fulfills FeatureStr; otherwise it returns false.
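As an illustration of the intended true/false behavior, here is a minimal sketch of such a check. It assumes the feature string is a list of extension names separated by underscores, as in the "m_a_f_d_c" example above; the actual grammar may differ. The names demo_available, demo_has_extension and demo_ifunc_select are hypothetical, and the hard-coded table stands in for a real hardware query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL hardware description: the extensions this demo "machine"
 * supports. A real implementation would query the platform instead. */
static const char *const demo_available[] = {"m", "a", "f", "d", "c", "zba"};

/* Returns true if the name of length len is in the demo table. */
static bool demo_has_extension(const char *name, size_t len) {
    size_t n = sizeof demo_available / sizeof demo_available[0];
    for (size_t i = 0; i < n; i++) {
        if (strlen(demo_available[i]) == len &&
            strncmp(demo_available[i], name, len) == 0)
            return true;
    }
    return false;
}

/* Returns true only if every '_'-separated extension in feature_str is
 * available. ASSUMPTION: underscore-separated names; an empty string
 * carries no requirements and therefore succeeds. */
bool demo_ifunc_select(const char *feature_str) {
    assert(feature_str != NULL);
    const char *p = feature_str;
    while (*p != '\0') {
        size_t len = strcspn(p, "_");   /* length of the next token */
        if (len > 0 && !demo_has_extension(p, len))
            return false;
        p += len;
        if (*p == '_')
            p++;                        /* skip the separator */
    }
    return true;
}
```

With this table, "m_a_f_d_c" and "zba" are accepted, while "m_zbb" is rejected because zbb is not listed.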


2023/09/04 Update: The following section takes the Linux platform as an example for the __riscv_ifunc_select implementation.

There are two ways to retrieve hardware information.

Another problem is where to place the function definition.

compiler-rt/libgcc is a good place to implement these functions, as in the implementations for other targets (x86/aarch64).

[1] https://docs.kernel.org/riscv/hwprobe.html


2024/07/11 Update

  1. Keep the syntax part; the runtime function moves to another RFC.
  2. Remove arch=<full-arch-string> from syntax

sorear commented 1 year ago

This repository is for specifications of features that are portable between multiple RISC-V toolchains. As such it is inappropriate to specify any behavior exclusively in terms of Linux-only interfaces like hwprobe and cpuinfo.

Providing a string-to-bool or string-to-int (for things like Zicbo* cache block size) lookup interface as a portable frontend to Linux's syscalls, the HWCAP-inspired interfaces on the BSDs, and whatever NT ends up with is a good idea, although it's useful for more than just ifunc; @jrtc27 suggested __riscv_get_extension for essentially this interface.

sorear commented 9 months ago

Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

BeMg commented 8 months ago

Hi @sorear, thanks for the comment.

> Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

For the ATTR-STRING of target_version and target_clones, I tend to reuse the format of the target attribute to avoid confusion. Removing mtune and mcpu reduces the complexity of usage, and the result can be treated as a subset of the target attribute format. Using a format like [[gnu::target_clones("+zbb","default")]], however, makes target_clones/target_version inconsistent with the target attribute.

From the compiler's perspective, mtune and mcpu information is highly relevant to the compilation result. If we someday need to add mtune and mcpu to the ATTR-STRING of target_version/target_clones, doing so will be easier and will not break existing code.

@kito-cheng any other comments on this topic?

> How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

Currently, the selection order depends on the IFUNC resolver's implementation. We plan to add another option inside ATTR-STRING that represents the user's manual priority weight, like target_clones("default", "arch=rv64gc;prior=5", "arch=rv64g;prior=7").
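To illustrate how a resolver could use such a weight, here is a minimal sketch. It assumes that a larger prior value means a more preferred version and that ties keep the earlier entry; neither rule is fixed by this proposal. The struct candidate type, select_by_priority and the impl_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*impl_fn)(int);

/* One function version as a resolver might see it. */
struct candidate {
    impl_fn fn;     /* implementation to call */
    int priority;   /* HYPOTHETICAL user weight, as in "prior=5" */
    bool usable;    /* whether the hardware satisfies its features */
};

/* Sample versions; the return values only identify which one ran. */
int impl_default(int x) { (void)x; return 0; }
int impl_rv64gc(int x)  { (void)x; return 1; }
int impl_rv64g(int x)   { (void)x; return 2; }

/* Picks the usable candidate with the largest weight, or NULL if none is
 * usable. ASSUMPTION: larger weight wins; on a tie the earlier entry wins. */
impl_fn select_by_priority(const struct candidate *c, size_t n) {
    assert(c != NULL || n == 0);
    const struct candidate *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!c[i].usable)
            continue;
        if (best == NULL || c[i].priority > best->priority)
            best = &c[i];
    }
    return best != NULL ? best->fn : NULL;
}
```

For the attribute string above, a table with weights 0 (default), 5 (rv64gc) and 7 (rv64g) selects the rv64g version when all three are usable, and falls back to rv64gc when rv64g is not.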

BeMg commented 8 months ago

I have created a pull request https://github.com/llvm/llvm-project/pull/85786 to LLVM to implement target_clones in the current proposal.

I have also created a draft pull request https://github.com/llvm/llvm-project/pull/85790 to implement __riscv_ifunc_select, which allows target_clones to run in a QEMU environment.

For example:

clang -march=rv64g -rtlib=compiler-rt targetclones.c
qemu-riscv64 -cpu rv64,zbb=true,zba=true -B 0x100000 -L /path/to/sysroot a.out

BeMg commented 4 months ago

Rebase to origin/main

BeMg commented 4 months ago

Update: Remove the arch=<full-arch-string> from syntax

preames commented 2 months ago

For anyone following along, there is a follow-on proposal to add syntax to assist with selection between multiple candidates; see https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85