Fix PrimType mapping in C backend

michael-kenzel commented 2 years ago

The C backend currently performs an unconditional mapping of the fixed-width PrimTypes to fundamental types in C. However, the exact sizes of fundamental types in C are implementation-defined and generally depend on the ABI of the target platform. For example, the C backend in its current form will always map u64 to long. This would result in potentially incorrect codegen on Windows since long is a 32-Bit type in the Windows ABI. While long is a 64-Bit type under some circumstances in CUDA, it isn't a 64-Bit type under all circumstances (potentially depends on the host compiler). CUDA generally uses long long as its 64-Bit integer type. In addition to the potential problems noted before, the use of long here results in code that uses 64-Bit integer atomics failing to compile since CUDA only has overloads for unsigned long long for 64-Bit atomics, none for unsigned long.

After some discussion on Discord, it was decided that the best way to fix this would be to emit a preamble of typedefs to perform the mapping to the appropriate fundamental types depending on the target C dialect. For plain standard C, we emit typedefs that map to the fixed-width integer types. We also only map f32 and f64 if __STDC_IEC_559__ is defined since there are otherwise no guarantees that float and double are the appropriate IEEE 754 floating point types. For OpenCL and CUDA, we perform the mapping according to the guaranteed sizes as noted in the respective specification/documentation.

This pull request delivers a proposed implementation of this fix.

michael-kenzel commented 2 years ago

As an additional note: Since this fix required emitting longer, unformatted strings, a raw .write() function was added to the Stream in order to avoid having to go through .fmt(). In the process of doing so, I also updated the existing code to use .write() instead of .fmt() in places where it was outputting strings without any formatting. I also changed some functions that were using std::string to simply return the contents of string literals to use std::string_view instead. Some of these changes were rolled back in the last commit after feedback on Discord.

Hugobros3 commented 2 years ago

Regarding fmt, we could perhaps have a fast-path with a template specialization selected when there aren't any additional arguements. I'm not a C++ template guru but that sounds doable

michael-kenzel commented 2 years ago

Regarding fmt, we could perhaps have a fast-path with a template specialization selected when there aren't any additional arguements. I'm not a C++ template guru but that sounds doable

That would certainly be doable, pretty simple to do actually. However, at least to me, .fmt implies that the first string parameter is a template string for formatting, which implies stuff like you need to escape { and }. If we just added a specialization for .fmt to do a raw print in case no other arguments are provided, then that would mean the meaning of the first parameter changes depending on whether other arguments are present or not which, imo, is not ideal. So I think this is better served by having a separate method rather than having two versions of the same method with different semantics…

richardmembarth commented 2 years ago

Why do you not use our own types when emitting the code, that is, i32 and u32 vs. s32 and u32?

richardmembarth commented 2 years ago

__STDC_IEC_559__ is not defined by the compiler, shouldn't we remove it? https://stackoverflow.com/questions/31181897/status-of-stdc-iec-559-with-modern-c-compilers

richardmembarth commented 2 years ago

f16 is not defined for __CUDACC_VER_MAJOR__ <= 8.

richardmembarth commented 2 years ago

I also argue to remove write function. This duplicates logic we already have and is only used in some places. It is not clear to the user why / when to use write over fmt.

michael-kenzel commented 2 years ago

I have revisited this and prepared another version of this PR based on the latest state of the codebase that should hopefully address all the previously raised issues.

AnyDSL / thorin

Fix PrimType mapping in C backend #120