statgen / savvy

Interface to various variant calling formats.
Mozilla Public License 2.0
26 stars 5 forks source link

Build failure on arm64: Templates not found because char is unsigned #20

Closed pettyalex closed 2 years ago

pettyalex commented 2 years ago

Savvy can't build or run on arm64 right now, because char is unsigned on arm64 and this code assumes that char will be signed. The C/C++ specification says nothing about the signed-ness of char by default, so if char is signed varies by compiler / architecture / implementation.

Edit about my specific environment: I'm on an Apple Silicon M1 Mac, building this both on mac OS and also Ubuntu on arm64 inside of Docker. Our intent is to evaluate the Graviton AWS instances, which would need tooling to work and perform well on arm64.

Some quick ways to work around this would be to either specify that these types are signed chars, or to update the template so that it works on unsigned types. I haven't investigated deeper and barely know C++, so I'd appreciate your help with a solution.

Specifically, I see that typed_value.hpp we have some char vectors https://github.com/statgen/savvy/blob/master/include/savvy/typed_value.hpp#L1712

    std::vector<char> off_data_;
    std::vector<char> val_data_;

and then expect to be able to operate on them with this template https://github.com/statgen/savvy/blob/master/include/savvy/typed_value.hpp#L125:

    template<typename T>
    static typename std::enable_if<std::is_signed<T>::value, std::uint8_t>::type
    type_code(const T& val);

Stack trace from build failure on arm64:

include/savvy/typed_value.hpp: In instantiation of ‘void savvy::typed_value::thin_types_fn::operator()(T*, T*, savvy::typed_value*) [with T = char]’:
include/savvy/typed_value.hpp:712:11:   required from ‘bool savvy::typed_value::apply_dense(Fn, Args ...) [with Fn = savvy::typed_value::thin_types_fn; Args = {savvy::typed_value*}]’
include/savvy/typed_value.hpp:1412:42:   required from here
include/savvy/typed_value.hpp:1331:44: error: no matching function for call to ‘savvy::typed_value::type_code(char&)’
 1331 |           new_val_type = std::max(type_code(max_val), type_code(min_val));
      |                                   ~~~~~~~~~^~~~~~~~~
In file included from include/savvy/site_info.hpp:12,
                 from include/savvy/region.hpp:10,
                 from include/savvy/s1r.hpp:11,
                 from include/savvy/sav_reader.hpp:12,
                 from include/sav/sort.hpp:10,
                 from src/sav/main.cpp:14:
include/savvy/typed_value.hpp:122:32: note: candidate: ‘template<class T> static uint8_t savvy::typed_value::type_code()’
  122 |     inline static std::uint8_t type_code();
      |                                ^~~~~~~~~
include/savvy/typed_value.hpp:122:32: note:   template argument deduction/substitution failed:
In file included from include/savvy/site_info.hpp:12,
                 from include/savvy/region.hpp:10,
                 from include/savvy/s1r.hpp:11,
                 from include/savvy/sav_reader.hpp:12,
                 from include/sav/sort.hpp:10,
                 from src/sav/main.cpp:14:
include/savvy/typed_value.hpp:1331:44: note:   candidate expects 0 arguments, 1 provided
 1331 |           new_val_type = std::max(type_code(max_val), type_code(min_val));
      |                                   ~~~~~~~~~^~~~~~~~~
include/savvy/typed_value.hpp:1864:73: note: candidate: ‘template<class T> static typename std::enable_if<std::is_signed<_Tp>::value, unsigned char>::type savvy::typed_value::type_code(const T&)’
 1864 |   typename std::enable_if<std::is_signed<T>::value, std::uint8_t>::type typed_value::type_code(const T& val)
      |                                                                         ^~~~~~~~~~~
include/savvy/typed_value.hpp:1864:73: note:   template argument deduction/substitution failed:
include/savvy/typed_value.hpp: In substitution of ‘template<class T> static typename std::enable_if<std::is_signed<_Tp>::value, unsigned char>::type savvy::typed_value::type_code(const T&) [with T = char]’:
include/savvy/typed_value.hpp:1331:44:   required from ‘void savvy::typed_value::thin_types_fn::operator()(T*, T*, savvy::typed_value*) [with T = char]’
include/savvy/typed_value.hpp:712:11:   required from ‘bool savvy::typed_value::apply_dense(Fn, Args ...) [with Fn = savvy::typed_value::thin_types_fn; Args = {savvy::typed_value*}]’
include/savvy/typed_value.hpp:1412:42:   required from here
include/savvy/typed_value.hpp:1864:73: error: no type named ‘type’ in ‘struct std::enable_if<false, unsigned char>’
pettyalex commented 2 years ago

Here's one potential fix, marking those as signed chars. I haven't checked if this is the only place in the codebase that assumes char is signed.

https://github.com/statgen/savvy/pull/21

I actually haven't gotten savvy to build locally, cget is not working for me:

[100%] Building xz ...
make[3]: *** No targets specified and no makefile found.  Stop.
make[2]: *** [CMakeFiles/xz] Error 2
make[1]: *** [CMakeFiles/xz.dir/all] Error 2
make: *** [all] Error 2
pettyalex commented 2 years ago

Hmm. By ignoring XZ, my build is working again.

It seem slike char would be have to be marked signed char in a lot of places. What if instead that template always matched, instead of only if the type was signed?

jonathonl commented 2 years ago

I haven't fully read through your post or PR, but you may be able to force your chars to be signed depending on the compiler. Try configuring with -DCMAKE_CXX_FLAGS="-fsigned-char" (see https://github.com/statgen/savvy/issues/13). I'll try to read through this issue more thoroughly tomorrow.

pettyalex commented 2 years ago

Thanks.

I haven't looked through the whole flow, but if char needs to be signed, then maybe it's best to mark it as such, and if not, there are performance implications where platforms with char unsigned have done so for performance reasons.

jonathonl commented 2 years ago

The template arguments of off_data_ and val_data_ are irrelevant when it comes to the type_code function and should be left as char. Also, removing std::enable_if is not a usable solution. These were added intentionally to prevent users (tool developers) of the Savvy library from writing unsigned integers to BCF files (which BCF does not support).

A better solution would be std::enable_if<std::is_signed<T>::value || std::is_same<T, char>::value, std::uint8_t>::type. I vaguely remember there being an issue with this approach the last time I looked into this, but maybe I just never got around to implementing it. I will try to implement again, but it may take a few days for me to find time.

Does adding the '-fsigned-char' compile option not fix the build errors on your Apple system? This would be a quick fix that involves no code changes, and I don't expect there to be any performance issues.

pettyalex commented 2 years ago

Interesting.

If the format does not allow writing unsigned integers, then writing unsigned chars would be a problem as well, right? So potentially allowing unsigned chars to be written would cause the same problems that writing unsigned ints would?

Does this flow intentionally allow chars in general, or should int8_t be used instead for consumers of this lib who intend to write 8-bit types?

jonathonl commented 2 years ago

No, the char type in the BCF spec is for string data while int8 is for 8-bit integer data. The example below shows how this works with the Savvy library.

std::string str_data = "string_data"; // std::string == std::basic_string<char>
std::vector<std::int8_t> int_data = {2, 32};
savvy::variant record;
record.set_info("ANN", str_data); // BCF-encoded with char type
record.set_info("AC", int_data); // BCF-encoded with int8 type
jonathonl commented 2 years ago

This should be fixed with https://github.com/statgen/savvy/commit/3842f47ec4ea1328c1963512bfe2e32412bfcfc1

pettyalex commented 2 years ago

Yes, that fixed it!