Open MaxPayne86 opened 1 month ago
Interesting, thanks for sharing... I guess there's a couple of layers of things going on here.
My first question is whether Eigen and XSIMD work on your target platform in the first place. It's totally possible that there are some incompatibilities going on at their level(s). However, if you're able to use Eigen or XSIMD on your platform outside of RTNeural, then obviously there are some things we'll need to change in RTNeural.
For the CMake configuration, we should probably make a stronger effort to check what the "correct" alignment is for the target platform, rather than setting it to 8 by default... is this something that you might know how to do? We might also want to work out a way for the user to "manually" override the default alignment without needing to edit RTNeural's CMake config.
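One possible starting point for the alignment question, as a rough sketch only: ask the compiler for the strictest alignment it guarantees for ordinary objects, and round that down to one of the values discussed in this thread. The helper names `platform_max_alignment` and `suggested_default_alignment` are hypothetical and not part of RTNeural, and `alignof(std::max_align_t)` says nothing about what a SIMD backend would prefer, so treat the result as a lower bound to investigate rather than an answer.

```cpp
#include <cstddef>

// Hypothetical helper (not part of RTNeural): the strictest alignment the
// toolchain guarantees for ordinary scalar types and default allocations.
constexpr std::size_t platform_max_alignment()
{
    return alignof(std::max_align_t);
}

// Hypothetical helper: map a platform alignment onto one of the values
// discussed in this thread (32, 16, 8). It never returns less than 8.
constexpr int suggested_default_alignment(std::size_t platform_alignment)
{
    return platform_alignment >= 32 ? 32
         : platform_alignment >= 16 ? 16
                                    : 8;
}
```

Printing `suggested_default_alignment(platform_max_alignment())` from a small program built with the target toolchain would show what that toolchain reports; a manual override would still be needed for cases where the SIMD backend wants something larger.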
The provided crash log is also a bit curious to me... specifically looking at the line T = Eigen::Map<Eigen::Matrix<float, 12, 1, 0, 12, 1>, 16, Eigen::Stride<0, 0>>. The template type definition for Eigen::Map is Eigen::Map<MatrixType, Alignment, Stride>, implying that Eigen still thinks that the requested alignment is 16 bytes. Would it be possible to add a check (e.g. static_assert (RTNEURAL_DEFAULT_ALIGNMENT == 8)) somewhere in your code, just to double-check that?
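A minimal sketch of that check, written so it compiles on its own. In a real build RTNEURAL_DEFAULT_ALIGNMENT would come from the CMake compile definitions; the fallback `#define` below is only an assumption to keep the snippet standalone, and `configured_alignment` is a hypothetical helper name.

```cpp
// Stand-in for the definition normally passed in by RTNeural's CMake config.
// This fallback is an assumption so the sketch builds without that config.
#ifndef RTNEURAL_DEFAULT_ALIGNMENT
#define RTNEURAL_DEFAULT_ALIGNMENT 8
#endif

// Fails the build if the macro is anything other than 8.
static_assert(RTNEURAL_DEFAULT_ALIGNMENT == 8,
              "RTNEURAL_DEFAULT_ALIGNMENT is not 8");

// Hypothetical helper so the value can also be printed or logged at run time.
constexpr int configured_alignment()
{
    return RTNEURAL_DEFAULT_ALIGNMENT;
}
```

When placed in user code, the fallback block should be dropped, so that a missing definition becomes a build error instead of silently defaulting to 8.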
if Eigen and XSIMD work on your target platform in the first place?
Interesting point, are they packaged in Debian? If so, I can install the package and then run some built-in tests.
is this something that you might know how to do?
I would put in cmake/SIMDExtensions.cmake
if(NOT RTNEURAL_USE_AVX)
    if(CMAKE_SYSTEM_PROCESSOR MATCHES "armv7")
        target_compile_definitions(RTNeural PUBLIC RTNEURAL_DEFAULT_ALIGNMENT=8)
    else()
        target_compile_definitions(RTNeural PUBLIC RTNEURAL_DEFAULT_ALIGNMENT=16)
    endif()
else()
implying that Eigen still thinks that the requested alignment is 16 bytes
Yes, just checked RTNeural code now, RTNeural/common.h
#if RTNEURAL_DEFAULT_ALIGNMENT == 32
constexpr auto RTNeuralEigenAlignment = Eigen::Aligned32;
#else
constexpr auto RTNeuralEigenAlignment = Eigen::Aligned16;
#endif
it seems to me we need to also expand this to allow for Eigen::Aligned8
UPDATE: so I've moved forward by adding in RTNeural/common.h
#if RTNEURAL_DEFAULT_ALIGNMENT == 32
constexpr auto RTNeuralEigenAlignment = Eigen::Aligned32;
#elif RTNEURAL_DEFAULT_ALIGNMENT == 16
constexpr auto RTNeuralEigenAlignment = Eigen::Aligned16;
#elif RTNEURAL_DEFAULT_ALIGNMENT == 8
constexpr auto RTNeuralEigenAlignment = Eigen::Aligned8;
#else
#error "Unsupported alignment"
#endif
but during compilation I still see warnings such as
RTNeural/modules/Eigen/Eigen/src/Core/arch/NEON/Complex.h:281:37: warning: requested alignment 16 is larger than 8 [-Wattributes]
by opening the above file, I see
template<> EIGEN_STRONG_INLINE std::complex<float> pfirst<Packet1cf>(const Packet1cf& a)
{
  EIGEN_ALIGN16 std::complex<float> x;
  vst1_f32(reinterpret_cast<float*>(&x), a.v);
  return x;
}
template<> EIGEN_STRONG_INLINE std::complex<float> pfirst<Packet2cf>(const Packet2cf& a)
{
  EIGEN_ALIGN16 std::complex<float> x[2];
  vst1q_f32(reinterpret_cast<float*>(x), a.v);
  return x[0];
}
I was also able to spot other warnings in RTNeural/modules/Eigen/Eigen/src/Core/arch/Default/GenericPacketMathFunctions.h; I still need to look into those more deeply.
Interesting point, are they packaged in Debian? If so, I can install the package and then run some built-in tests.
I was more thinking you could try to compile a program using either Eigen or XSIMD, but without using RTNeural. I'm not sure if the libraries are packaged in any way, but the source code for both libraries is available on GitHub/GitLab, and in both cases I believe the source code includes some example programs that you could try compiling and running.
The proposed changes to SIMDExtensions.cmake and common.h look correct to me! If you'd like to make a pull request with those changes, that would be great!
The remaining warnings coming from the Eigen headers are likely there because Eigen wants some data types to be guaranteed to be aligned to 16 bytes, although I'm not 100% sure what their reasons are for wanting that. For the most part RTNeural doesn't really interact with the type information that is then passed to Eigen. For example, an RTNeural::Dense<float> will likely result in the creation of an Eigen::Matrix<float>. So you probably don't actually need std::complex<float> and may want to modify Eigen to reflect that, which would likely silence those warnings.
In all, I believe the remaining warnings that you're seeing are happening because of the relationship between your compiler/toolchain and Eigen, and aren't directly related to RTNeural.
Update on this issue
The proposed changes to SIMDExtensions.cmake and common.h look correct to me! If you'd like to make a pull request with those changes, that would be great!
Completed as per PR merge
So you probably don't actually need std::complex
I confirm that. Tested with the proposed PR in place, I am now able to execute the templated implementation without crashes:
./rtneural_layer_bench lstm 10 1 12
Benchmarking lstm layer, with input size 1 and output size 12, with signal length 10 seconds
Processed 10 seconds of signal in 7.71392 seconds
1.29636x real-time
Testing templated implementation...
Processed 10 seconds of signal in 6.50054 seconds
1.53833x real-time
Templated layer is 1.18666x faster!
However, as you can see, RTNEURAL_DEFAULT_ALIGNMENT=8 seems less performant than RTNEURAL_DEFAULT_ALIGNMENT=16 on a 32-bit processor using Eigen.
XSIMD wip...
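For reading the benchmark output above, the reported figures appear to be simple ratios: signal length over processing time for the "x real-time" lines, and dynamic time over templated time for the "x faster" line. A small sketch of that arithmetic follows; the function names are hypothetical and this is not the benchmark's actual code.

```cpp
#include <cassert>

// Seconds of signal processed per second of wall-clock time.
inline double real_time_factor(double signal_seconds, double processing_seconds)
{
    assert(processing_seconds > 0.0);
    return signal_seconds / processing_seconds;
}

// How many times faster the candidate run is than the baseline run.
inline double speedup(double baseline_seconds, double candidate_seconds)
{
    assert(candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}
```

With the numbers from the log, `real_time_factor(10.0, 7.71392)` is roughly 1.296, `real_time_factor(10.0, 6.50054)` is roughly 1.538, and `speedup(7.71392, 6.50054)` is roughly 1.187, which is consistent with the printed values.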
Awesome! It does make sense that using 8-byte alignment will be slower than 16-byte alignment if Eigen is trying to use certain SIMD intrinsics, since they'll probably end up needing to do a lot more "unaligned load" operations. Obviously that will depend on the platform architecture, and the specifics of what Eigen is trying to do under the hood.
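To illustrate the point about unaligned loads, here is a small sketch, independent of Eigen, showing that an address can satisfy 8-byte alignment without satisfying 16-byte alignment. The names `is_aligned` and `FloatBlock` are hypothetical; whether a given SIMD backend then falls back to unaligned loads depends on the library and platform.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address of p is a multiple of `bytes`.
inline bool is_aligned(const void* p, std::size_t bytes)
{
    return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}

// Storage for eight floats, with 16-byte alignment requested.
struct alignas(16) FloatBlock
{
    float data[8];
};
```

For a `FloatBlock b` whose 16-byte request is honoured, `b.data` is 16-byte aligned, while `b.data + 2` (8 bytes further on) is 8-byte aligned but not 16-byte aligned. A 4-float load starting there could not use an instruction that assumes 16-byte alignment.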
Hi, using RTNeural 4a540403e115bae18d29142a5f54e7c3598b6e51
For the given env, cmake/SIMDExtensions.cmake sets RTNEURAL_DEFAULT_ALIGNMENT=16. During the build, I see a lot of warnings like
warning: requested alignment 16 is larger than 8
this is related to Eigen backend compilation.
So far so good, I'm able to execute both dynamic and template
Next, I temporarily edited cmake/SIMDExtensions.cmake to set RTNEURAL_DEFAULT_ALIGNMENT=8, then recompiled. The build ends without errors, then when executing
I would conclude that RTNEURAL_DEFAULT_ALIGNMENT=8 is not supported by Eigen backend. However, it should be the correct alignment for a 32-bit processor.
NOTES: I'm not able to report on XSIMD in this same env yet, since I'm experiencing several compilation errors (WIP).
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0