dotnet / corert

This repo contains CoreRT, an experimental .NET Core runtime optimized for AOT (ahead of time compilation) scenarios, with the accompanying compiler toolchain.
http://dot.net
MIT License

Enable System.Runtime.Intrinsics intrinsics #6173

Open MichalStrehovsky opened 6 years ago

MichalStrehovsky commented 6 years ago

Since RyuJIT already supports this, I think we just need these things:

MichalStrehovsky commented 6 years ago

@Alan-FGR this is up-for-grabs :)

jkotas commented 6 years ago

Pass flags to RyuJIT to enable generation of SIMD code

We need a command line switch that describes the minimum hardware you expect to be running on. Ideally, this would be combined with a runtime check during startup that verifies the minimum requirements; it will save us from debugging mysterious crashes.
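A minimal sketch of what such a startup check could look like, written in C++ for illustration. It assumes GCC or Clang on x86, where __builtin_cpu_init and __builtin_cpu_supports are compiler builtins; on other targets the sketch simply reports no features. CpuFeatures, first_missing_feature, and verify_baseline_or_exit are hypothetical names, and the required baseline is filled in by hand here rather than derived from a real compiler switch.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical feature set, used both for "what the binary was compiled for"
// and for "what the current machine offers".
struct CpuFeatures {
    bool sse2 = false;
    bool sse42 = false;
    bool avx2 = false;
};

// Queries the CPU once. Only GCC/Clang on x86 are handled in this sketch.
inline CpuFeatures detect_cpu_features() {
    CpuFeatures f;
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse2 = __builtin_cpu_supports("sse2") != 0;
    f.sse42 = __builtin_cpu_supports("sse4.2") != 0;
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}

// Returns the name of the first required feature that is missing,
// or nullptr when the machine satisfies the baseline.
constexpr const char* first_missing_feature(const CpuFeatures& required,
                                            const CpuFeatures& actual) {
    if (required.sse2 && !actual.sse2) return "sse2";
    if (required.sse42 && !actual.sse42) return "sse4.2";
    if (required.avx2 && !actual.avx2) return "avx2";
    return nullptr;
}

// Meant to run once, before any code compiled against the baseline executes.
inline void verify_baseline_or_exit(const CpuFeatures& required) {
    const char* missing = first_missing_feature(required, detect_cpu_features());
    if (missing != nullptr) {
        std::fprintf(stderr, "This binary requires CPU feature '%s'.\n", missing);
        std::exit(EXIT_FAILURE);
    }
}
```

Calling verify_baseline_or_exit as the first statement of the entry point would turn an illegal-instruction crash deep inside SIMD code into a readable error at launch.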

MichalStrehovsky commented 6 years ago

Cc @tannergooding who might be able to give us more pointers.

tannergooding commented 6 years ago

We need a command line switch that describes the minimum hardware you expect to be running on. Ideally, this would be combined with a runtime check during startup that verifies the minimum requirements; it will save us from debugging mysterious crashes.

I agree that this is a good baseline. However, it may also be interesting to have the CoreRT startup code perform and cache the CPUID checks as a one-time cost. This allows AOT code to support higher-level hardware than whatever the baseline is decided to be. The C runtime library (both glibc and msvcrt) does this for many of the math functions, for example.
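To illustrate the caching idea, here is a rough C++ sketch that performs the CPU check once and then dispatches through a stored function pointer. It assumes GCC or Clang on x86 for the AVX2 path and falls back to scalar code everywhere else. All names (sum_scalar, sum_avx2, select_sum_impl, sum_u32) are hypothetical, and the check runs lazily on first use rather than in real startup code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86 1
#include <immintrin.h>
#else
#define SKETCH_X86 0
#endif

// Baseline version: unsigned addition wraps modulo 2^32 by definition.
inline std::uint32_t sum_scalar(const std::uint32_t* data, std::size_t n) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

#if SKETCH_X86
// Only this function is compiled with AVX2 enabled.
__attribute__((target("avx2")))
inline std::uint32_t sum_avx2(const std::uint32_t* data, std::size_t n) {
    __m256i acc = _mm256_setzero_si256();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        const __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(data + i));
        acc = _mm256_add_epi32(acc, v);  // eight wrapping 32-bit adds
    }
    alignas(32) std::uint32_t lanes[8];
    _mm256_store_si256(reinterpret_cast<__m256i*>(lanes), acc);
    std::uint32_t total = 0;
    for (std::uint32_t lane : lanes) total += lane;
    for (; i < n; ++i) total += data[i];  // leftover elements
    return total;
}
#endif

using SumFn = std::uint32_t (*)(const std::uint32_t*, std::size_t);

// Performs the CPU feature check and picks an implementation.
inline SumFn select_sum_impl() {
#if SKETCH_X86
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return &sum_avx2;
#endif
    return &sum_scalar;
}

inline std::uint32_t sum_u32(const std::uint32_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    // Function-local static: the selection runs once and the result is cached.
    static const SumFn impl = select_sum_impl();
    return impl(data, n);
}
```

Callers only see sum_u32; machines above the baseline get the AVX2 path, and the cost of the check is paid a single time.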

The other tracked work items look correct as well.

It may be interesting to document that Vector64<T>, Vector128<T>, and Vector256<T> correspond to the __m64, __m128, and __m256 primitive types defined by most ABIs as part of the process.
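As a small illustration of that correspondence, the following C++ fragment checks the sizes of the ABI vector types at compile time. It is a sketch that assumes GCC or Clang targeting x86-64, where immintrin.h declares all three types; the constant names are made up for this example.

```cpp
#include <cstddef>
#include <immintrin.h>  // declares __m64, __m128 and __m256 on x86 targets

// The number in each managed type name is the width in bits of the
// corresponding ABI vector type.
static_assert(sizeof(__m64) * 8 == 64, "__m64 is 64 bits wide, like Vector64<T>");
static_assert(sizeof(__m128) * 8 == 128, "__m128 is 128 bits wide, like Vector128<T>");
static_assert(sizeof(__m256) * 8 == 256, "__m256 is 256 bits wide, like Vector256<T>");

// Number of 32-bit float lanes that fit in each type.
constexpr std::size_t kFloatLanes128 = sizeof(__m128) / sizeof(float);
constexpr std::size_t kFloatLanes256 = sizeof(__m256) / sizeof(float);
```

With these definitions, kFloatLanes128 is 4 and kFloatLanes256 is 8.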

4creators commented 6 years ago

This allows AOT code to support higher-level hardware than whatever the baseline is decided to be. The C runtime library (both glibc and msvcrt) does this for many of the math functions, for example.

In GCC this feature is named Function Multiversioning (FMV) and has been supported, in evolving form, since GCC 4.8 (C++ only). Essentially, it compiles one version of the function for each architecture indicated in the attribute, and at runtime the C runtime fixes up the RVAs based on an architecture test.

The very same mechanism was already proposed earlier, during the discussion of HW intrinsics for R2R assemblies. IMHO it would be one of the most important features to implement in .NET Core to fully exploit SIMD potential. It could support both System.Numerics.Vector<T> and HW intrinsics.

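    #include <cassert>

    // Function multiversioning requires a default version as the fallback.
    __attribute__ ((target ("default")))
    int foo() {
        // default version of foo
        return 0;
    }

    __attribute__ ((target ("sse4.2")))
    int foo() {
        // foo version for SSE4.2
        return 1;
    }

    __attribute__ ((target ("arch=atom")))
    int foo() {
        // foo version for the Intel Atom processor
        return 2;
    }

    int main() {
        // The indirect and the direct call both dispatch to the same version.
        int (*p)() = &foo;
        assert((*p)() == foo());
        return 0;
    }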

Looks pretty similar to Sse.IsSupported