db-tu-dresden / TSL

Template SIMD Library (+Generator)
GNU General Public License v3.0
9 stars 8 forks source link

[NEON] Unknown type name `neon` in tslCPUrt.hpp #96

Open lawben opened 4 months ago

lawben commented 4 months ago

When using tsl::runtime::cpu::max_width_extension_t on AArch64, I get the following error message:

runtime/cpu/include/tslCPUrt.hpp:15:29: error: unknown type name 'neon'
   15 |         using extension_t = neon;
      |                             ^

runtime/cpu/include/tslCPUrt.hpp:22:23: error: use of undeclared identifier 'scalar'
   22 |             (Par==1), scalar, typename details::simd_ext_helper_t<sizeof(T)*8*Par>::extension_t
      |                       ^

These types are not included, so they cannot be known.

Side note: same holds for VectorProcessingStyle and TSLArithmetic further down in the code. I'm not using them, but my IDE shows me that they are undefined. Probably just a few includes missing.

JPietrzykTUD commented 4 months ago

Can you provide me the output of the following bash command: LANG=en;lscpu | grep 'Flags:' | sed -E 's/Flags:\s*//g' | sed -E 's/\s/:/g'

As far as I can tell, the neon extension is inside the tarball.

alexKrauseTUD commented 4 months ago

Could this be a naming issue with the analogy to asimd? The neon extension comes with a synonym for asimd, depending on the platform.

lawben commented 4 months ago

I'm running this on Mac, so I don't have lscpu...

But I think this is beside the point. It looks like a regular C++ problem. The symbol neon is simply not defined here. The neon struct is declared in include/generated/extensions/simd/arm/neon.hpp but that file is not included, so the compiler does not know it when including the header. I think this just requires a few #includes at the top of the file.

JPietrzykTUD commented 4 months ago

This is strange. From https://github.com/db-tu-dresden/TSL/releases/tag/v0.1.9-rc5, within the NEON sub-directory, the tsl.hpp includes:

#include "include/tslintrin.hpp"
#include "supplementary/runtime/cpu/include/tslCPUrt.hpp"

include/tslintr.hpp includes:

#include "generated/tsl_generated.hpp"

and generated/tsl_generated.hpp includes:

#include "extensions/simd/arm/neon.hpp"
alexKrauseTUD commented 4 months ago

Hi Lawrence,

thanks for testing our library! I threw together a quick toy example using the latest tarball:

curl -L "https://github.com/db-tu-dresden/TSL/releases/latest/download/tsl.tar.gz" -o tsl.tar.gz

I moved the contents of generate_tsl_neon-asimd to /usr/include/tsl to avoid unnecessary editing of any include paths.

Here is my example main:

#include <tsl/tsl.hpp>
#include <iostream>

int main() {
        using ps = tsl::simd< uint32_t, tsl::neon >;
        tsl::executor<tsl::runtime::cpu> exec;

        typename ps::base_type* result_ptr = exec.allocate<typename ps::base_type>(1024, 64);

        std::cout << "Neon? " << tsl::type_name<ps>() << std::endl;
        exec.deallocate(result_ptr);
        std::cout << tsl::simd<uint32_t, typename tsl::runtime::cpu::max_width_extension_t>::vector_element_count() << std::endl;

        return 0;
}

I built it with

g++ -o tsltest main.cpp -flax-vector-conversions

and got the output:

$ ./tsltest
Neon? tsl::simd<unsigned int, tsl::neon, 128ul>
4

I tested this on an ODROID N2+:

$ lscpu
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
[...]
Vendor ID:              ARM
  Model name:           Cortex-A53
    Model:              4
[...]
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32
[...]

When I try to build with clang, I only see the errors you reported in #95.

Could you please provide an MWE of what you tried/how you setup the TSL on your machine? I am currently unable to reproduce the issue.

lawben commented 4 months ago

Well, this was a stupid issue 😅 The main issues was that I included files in the wrong order...

#include "tslCPUrt.hpp"
#include "tslintrin.hpp"

You might want to guard against this or tell users to include tsl.hpp directly instead. Your example currently says to include tslintrin.hpp.

I didn't use tsl.hpp directly. In my workflow, I download the tar.gz, extract it, and then use add_subdirectory(...) with the correct path. Then I use target_link_libraries(my_target PRIVATE tsl), which should do the rest. But the target does not include the "root" directly that contains tsl.hpp, so I need to manually add it on my end in cmake. Ideally, the tsl target created by you would contain the path to the directory containing tsl.hpp, so that cmake can do all its magic when "linking" against tsl.

alexKrauseTUD commented 4 months ago

Glad its not a generator issue :) We will address the documentation oversight asap.

lawben commented 4 months ago

I'd really suggest to also try to fix this programmatically, as, e.g., clang format reorders the includes. This puts quite a bit of burden on the user, as they need to know about the order and then ensure that it does not get reordered.

alexKrauseTUD commented 4 months ago

Given that only tsl/tsl.hpp is required or, if you clone the repository or include it via cmake, you only need to use tslintrin.hpp ... I think this might be not necessary. However we will definitly consider your suggestion. Highly appreciated!

JPietrzykTUD commented 4 months ago

@lawben has a point, though. We really should change the docu to be more specific and unify the includes. It should be irrelevant how you use the TSL (install vs. cmake) - the necessary headers to include should be the same.

Regarding clang-format, I am completely unaware about the extent of include-reordering. Can this be an issue for the library internals as well?

lawben commented 4 months ago

At least in my setup, clang format sorts includes alphabetically. In that case, tslC comes before tsli. This does not affect your library internals. But if you external interface relies on implcit ordering, that's probably not ideal from a user perspective.