Open jasonthorsness opened 7 months ago
Is this expected at this time?
Yes, we don't support SIMD yet.
Well, I tried the same code in Blazor AOT, which supposedly supports SIMD, and it's not any faster, so they must not support these operations. On my system this takes 120 ms natively compiled, 2400 ms in Blazor, and 1600 ms in NativeAOT-LLVM.
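using System.Numerics;
using System.Runtime.InteropServices;

public class Class1
{
    [UnmanagedCallersOnly(EntryPoint = "Alloc")]
    public static unsafe byte* Alloc(int length)
    {
        // Align to the vector width so LoadAligned/StoreAligned are valid.
        return (byte*)NativeMemory.AlignedAlloc((nuint)length, (nuint)Vector<byte>.Count);
    }

    [UnmanagedCallersOnly(EntryPoint = "Answer")]
    public static unsafe int Answer(byte* f, int l)
    {
        // Assumes l is a multiple of Vector<byte>.Count; otherwise ptr never equals f + l.
        for (int i = 0; i < 10000; ++i)
        {
            for (byte* ptr = f; ptr != f + l; ptr += Vector<byte>.Count)
            {
                (~Vector.LoadAligned(ptr)).StoreAligned(ptr);
            }
        }

        // Vector width in bytes, plus 100 if hardware accelerated or 1000 if not.
        return Vector<byte>.Count + (Vector.IsHardwareAccelerated ? 100 : 1000);
    }
}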
Blazor used:
<RunAOTCompilation>true</RunAOTCompilation>
<WasmEnableSIMD>true</WasmEnableSIMD>
Would it be straightforward to link in a C or C++ file with SSE2 intrinsics and have Emscripten translate it? Any examples? (Sorry, this doesn't seem appropriate for an issue; I'm not sure where else to discuss or ask questions.)
With NativeAOT-LLVM, you would first need to compile the native code into a native library. For a single .c file, it can be as simple as:
; See https://emscripten.org/docs/porting/simd.html#compiling-simd-code-targeting-x86-sse-instruction-sets for SSE compatibility flags.
emcc -msimd128 -c lib.c -O2 -o lib.o
<NativeLibrary Include="lib.o" /> ; Statically linked code, use direct PInvoke to invoke it.
You do need to use a matching version of Emscripten, however.
https://learn.microsoft.com/en-us/aspnet/core/blazor/webassembly-native-dependencies?view=aspnetcore-8.0 is the documentation for how to do the same using the upstream toolchain - it supports compiling source files directly.
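For the SSE2 part of the question, here is a minimal sketch of what such a file might look like. It is not taken from this thread: the function name invert_sse2 is hypothetical, and it assumes the x86 compatibility flags described on the Emscripten page linked above (adding -msse2 next to -msimd128), which should let Emscripten translate the intrinsics to WASM SIMD. When SSE2 is not enabled, the sketch falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical export: inverts every byte of ptr[0..length). */
void invert_sse2(uint8_t* ptr, int length) {
    size_t n = length > 0 ? (size_t)length : 0;
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i ones = _mm_set1_epi8(-1); /* all bits set */
    /* Process whole 16-byte vectors with unaligned loads and stores. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i*)(ptr + i));
        _mm_storeu_si128((__m128i*)(ptr + i), _mm_xor_si128(v, ones));
    }
#endif
    /* Scalar loop: handles the tail, or everything when SSE2 is unavailable. */
    for (; i < n; ++i) {
        ptr[i] = (uint8_t)~ptr[i];
    }
}

/* Tiny self-check: 19 bytes covers one full vector plus a 3-byte tail. */
int invert_sse2_selfcheck(void) {
    uint8_t buf[19];
    for (int i = 0; i < 19; ++i) buf[i] = (uint8_t)i;
    invert_sse2(buf, 19);
    for (int i = 0; i < 19; ++i) assert(buf[i] == (uint8_t)(255 - i));
    return 1;
}
```

The same file also builds with a regular x86-64 compiler, which makes it easy to check the results natively before linking the object file into the WASM build.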
Just wanted to note that this works great. The same test as above, using the WASM SIMD functions directly, takes only ~330 ms, which seems expected. The natively compiled version is still about twice as fast (likely because it gets to use 256-bit vectors on my machine instead of 128-bit), and the WASM SIMD version is roughly 4 times faster than the Vector version.
In case anyone sees this I just put this in my project file:
<ItemGroup>
  <DirectPInvoke Include="lib" />
  <NativeLibrary Include="lib.o" />
</ItemGroup>
<Target Name="CompileNativeLibrary" BeforeTargets="BeforeBuild">
  <Exec Command="emcc -msimd128 -c lib.c -O2 -o lib.o" />
</Target>
Then in the code:
[LibraryImport("lib")]
internal static unsafe partial void bar(byte* ptr, int n);
And for this test lib.c file is just this:
#include <stddef.h>
#include <stdint.h>
#include <wasm_simd128.h>

// Inverts every bit in the whole 16-byte vectors of ptr[0..length).
// Trailing bytes beyond a multiple of 16 are left untouched.
void bar(uint8_t* ptr, int length) {
    v128_t* simd_ptr = (v128_t*)ptr;
    size_t num_vectors = length / sizeof(v128_t);
    v128_t ones = wasm_i32x4_splat(~0);  // all bits set
    for (size_t i = 0; i < num_vectors; ++i) {
        v128_t current_vector = wasm_v128_load(simd_ptr + i);
        v128_t inverted_vector = wasm_v128_xor(current_vector, ones);
        wasm_v128_store(simd_ptr + i, inverted_vector);
    }
}
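One caveat with the lib.c above: because num_vectors is length / sizeof(v128_t), any trailing bytes past a multiple of 16 are not inverted. That is fine for this test, but if the buffer length is arbitrary, a scalar fix-up along these lines could be called after bar. This is only a sketch; the names invert_tail and VECTOR_BYTES are hypothetical and not part of the project above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VECTOR_BYTES = 16 }; /* sizeof(v128_t) in the lib.c above */

/* Hypothetical helper: inverts only the trailing bytes that a loop over
   whole 16-byte vectors does not reach. */
void invert_tail(uint8_t* ptr, int length) {
    if (length <= 0) return;
    size_t n = (size_t)length;
    size_t done = n - (n % VECTOR_BYTES); /* bytes covered by whole vectors */
    for (size_t i = done; i < n; ++i) {
        ptr[i] = (uint8_t)~ptr[i];
    }
}

/* Self-check: simulate the vector loop on the first 16 of 20 bytes,
   then let invert_tail finish the remaining 4. */
int invert_tail_selfcheck(void) {
    uint8_t buf[20];
    for (int i = 0; i < 20; ++i) buf[i] = (uint8_t)i;
    for (int i = 0; i < 16; ++i) buf[i] = (uint8_t)~buf[i]; /* stands in for bar */
    invert_tail(buf, 20);
    for (int i = 0; i < 20; ++i) assert(buf[i] == (uint8_t)(255 - i));
    return 1;
}
```

When the length is already a multiple of 16, invert_tail does nothing, so it is safe to call unconditionally.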
I understand that browsers mostly support WebAssembly SIMD and so does Emscripten
I am seeing Vector.IsHardwareAccelerated return false from my app compiled with
Is this expected at this time? Thanks!