Port Hexagon-specific compiler-rt routines to Zig

alexrp commented 2 weeks ago

To complete our Hexagon support, we will need to port the Hexagon-specific compiler-rt routines to naked functions in Zig: https://github.com/llvm/llvm-project/tree/6c25604df2f669a0403a17dbdbe5c081db1e80a1/compiler-rt/lib/builtins/hexagon

For some of these, as a stopgap, we may be able to get away with using the generic routines we already have and just exporting them with the Hexagon-specific names. That may even be preferable unless we have strong evidence that the hand-written routines are significantly better.

Here is the actual usage in LLVM:

As far as I can see, ~all but the memcpy helper~ they all use the regular C calling convention.

androm3da commented 1 week ago

> For some of these, as a stopgap

Does Zig have an equivalent to -fno-builtin that might be another stopgap worth considering?

Is it useful to link against the clangrt library from the C/C++ toolchain or do most/all architectures for Zig have these built by/for zig itself?

alexrp commented 1 week ago

> Does Zig have an equivalent to -fno-builtin that might be another stopgap worth considering?

Not at the moment. But we could just make our LLVM backend set that flag for Hexagon specifically. I'm unsure if that's sufficient to make LLVM stop emitting these libcalls though.

> Is it useful to link against the clangrt library from the C/C++ toolchain or do most/all architectures for Zig have these built by/for zig itself?

As a rule, the Zig toolchain has to be completely self-contained except in cases where that's outright impossible (think *-windows-msvc). This is a requirement for our ability to cross-compile to most targets out of the box, and is one of the reasons why we maintain our own compiler-rt implementation: https://github.com/ziglang/zig/tree/master/lib/compiler_rt

androm3da commented 1 week ago

> As a rule, the Zig toolchain has to be completely self-contained except in cases where that's outright impossible (think *-windows-msvc). This is a requirement for our ability to cross-compile to most targets out of the box, and is one of the reasons why we maintain our own compiler-rt implementation: https://github.com/ziglang/zig/tree/master/lib/compiler_rt

Yeah, that makes sense. We contributed something similar for Rust not long ago, though we kinda cheated there and just used a thin wrapper around the assembly. If we need to create Zig implementations of these builtins with inline asm, that might take a bit more doing.

alexrp commented 1 week ago

It wouldn't actually be terribly hard since Zig does support naked functions. So you basically just preprocess the assembly files and then paste the resulting assembly into an asm volatile expression in a naked function with the right name, and then @export() it. It's just a bunch of boring grunt work, basically.
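The steps described above (preprocessed assembly pasted into an asm volatile expression inside a naked function, then exported under the right name) could be automated roughly as follows. This is only a sketch: the helper name `zig_naked_wrapper`, the sample symbol, and the sample assembly are made up for illustration, and the emitted Zig spelling (`callconv(.naked)`, the `@export` argument form) is an assumption that differs between Zig versions.

```python
def zig_naked_wrapper(symbol: str, asm_text: str) -> str:
    """Render preprocessed assembly as Zig source: a naked function plus an @export.

    The Zig syntax emitted here is an assumption, not taken from the Zig tree.
    """
    # Zig multiline string literals start every line with two backslashes,
    # so the assembly text needs no quote or brace escaping.
    body = "\n".join(
        "        \\\\" + line for line in asm_text.strip().splitlines()
    )
    return (
        f"fn {symbol}() callconv(.naked) void {{\n"
        "    asm volatile (\n"
        f"{body}\n"
        "    );\n"
        "}\n"
        "\n"
        "comptime {\n"
        f'    @export(&{symbol}, .{{ .name = "{symbol}" }});\n'
        "}\n"
    )


# Hypothetical symbol and assembly, used only to show the shape of the output.
zig_source = zig_naked_wrapper(
    "__hexagon_example",
    "{ r0 = add(r0, r1)\n  jumpr r31 }",
)
print(zig_source)
```

The generated text would still need to be checked by hand against the Zig version in use before it goes anywhere near lib/compiler_rt.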

androm3da commented 1 week ago

Yeah - I think my initial contribution for Rust looked more like that version, actually. For a couple of these algorithms, you lose a little bit of maintainability by taking the preprocessor output. But it's doable.

I don't yet have permission to contribute to Zig, but I'll make the request and see how it goes.

androm3da commented 1 week ago

Here's the script I used to automate the "boring grunt work" for Rust. During review, this approach was rejected for Rust's compiler-builtins. But if it suits Zig, maybe this is a good starting point.

#!/usr/bin/env python3

import re
import sys
from glob import glob
from pprint import pprint

file_text = '''
#![cfg(not(feature = "no-asm"))]
#![allow(unused_imports)]
#![allow(named_asm_labels)]

use core::intrinsics;

intrinsics! {
'''
#DEFS_PAT = re.compile(r'^\s*#define\s+(?P<val>\S+)\s+(?P<repl>\S+)')
DEFS_PAT = re.compile(r'#define\s*(?P<val>\S+)\s*(?P<repl>\S+)')
CPP_INCL_PAT = re.compile(r'\s*#\s*\d+\s*\".*\"')

def get_defs(contents):
    for def_ in DEFS_PAT.finditer(contents):
        gr = def_.groups()
        if '(' in gr[0]:
            continue
        yield gr[0], gr[1]

from subprocess import Popen, PIPE
import shlex

def xform_to_inline_asm(func_name, text, defs = None):
    if defs:
        print('defs', func_name)
        pprint(defs)
        for def_val, repl in defs.items():
            # Both boundaries must be raw strings: a plain '\b' is a backspace character.
            pat = re.compile(r'\b' + re.escape(def_val) + r'\b')
            text = pat.sub(repl, text)

    text = text.replace('{', '{{').replace('}', '}}')
    text = text.replace('"', r'\"')
    text = '\n'.join(f'    "{line}\\n",' for line in text.split('\n'))

    extra = r'''
".Lmemcpy_call:\n",
"jump memcpy@PLT\n",''' if func_name and 'likely_aligned_min32bytes_mult8bytes' in func_name else ''

    func_name = func_name if func_name else '__tbd'
    return f'''#[naked]
    pub unsafe extern "C" fn {func_name}() {{
        core::arch::asm!(
        {text}{extra}
        options(noreturn)
        );
    }}'''

PUB_LABEL_PAT = re.compile(r'\s*([^\.][A-Za-z0-9\._]+)\s*:\s*')
#PUB_LABEL_PAT = re.compile(r'^\s*(\S+)\s*:\s*')

def get_asm(dirname):
    for filename in glob(dirname + '*.S'):
        func = re.compile(r'^FUNCTION_BEGIN\s*(?P<func_name>\S+)$(?P<func_body>.*?)^FUNCTION_END', re.MULTILINE | re.DOTALL)
#       text = open(filename, 'rt').read()
        p = Popen(shlex.split(f'cpp {filename}'), stdout=PIPE)
        text = p.communicate()[0]
        text = text.decode('utf-8')
        text = '\n'.join(l for l in text.splitlines() if not CPP_INCL_PAT.search(l))
#       defs = dict(get_defs(text))
        defs = None

        matches = func.findall(text)
        if len(matches) > 1:
            print('too many: guessing',filename)
#           yield False, xform_to_inline_asm(None, text.strip(), defs)
            continue
        else:
            print('matches:', len(matches), filename)
        m = func.search(text)
        if not m:
            print('guessing', filename)
            l = PUB_LABEL_PAT.search(text)
            label = '__tbd'
            if l:
                label = l.groups()[0]
                print('\tfound', label)
            yield False, xform_to_inline_asm(label, text.strip(), defs)
            continue
#           raise Exception('oh no!')
        gr = m.groupdict()

        func_name = gr['func_name']
        func_text = gr['func_body'].strip()

        yield True, xform_to_inline_asm(func_name, func_text, defs)

if __name__ == '__main__':
#   print('new text')
#   print(new_text)
#   p = Popen(shlex.split(f'rustfmt {filename}'), stdout=PIPE)
    funcs = list(get_asm(sys.argv[1]))
    goodfuncs = [f for good, f in funcs if good]
    with open('src/hexagon.rs', 'wt') as f:
        f.write(file_text)
        for func in goodfuncs:
            f.write(func)
            f.write('\n\n')
        f.write('}\n')
    badfuncs = [f for good, f in funcs if not good]
    with open('src/hexagon_.rs', 'wt') as f:
        f.write(file_text)
        for func in badfuncs:
            f.write(func)
            f.write('\n\n')
        f.write('}\n')
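For anyone adapting the script, its three text-processing steps (dropping cpp linemarkers, extracting the body between FUNCTION_BEGIN and FUNCTION_END, and substituting #define-style names) can be tried in isolation. The following is a minimal sketch, not the script itself: the helper names and the sample input are invented for illustration. One detail worth noting is that `'\b'` without the `r` prefix is a backspace character in Python, so both word boundaries are written as raw strings here.

```python
import re

# Matches cpp linemarkers such as:  # 1 "example.S"
LINEMARKER = re.compile(r'^\s*#\s*\d+\s*".*"')

# Captures the function name and the body between FUNCTION_BEGIN / FUNCTION_END.
FUNC = re.compile(
    r'^FUNCTION_BEGIN\s+(?P<name>\S+)$(?P<body>.*?)^FUNCTION_END',
    re.MULTILINE | re.DOTALL,
)


def strip_linemarkers(text):
    """Drop the '# <line> "<file>"' lines that cpp adds to its output."""
    return "\n".join(l for l in text.splitlines() if not LINEMARKER.match(l))


def substitute_defs(text, defs):
    """Replace whole-word occurrences of each name in defs."""
    for name, repl in defs.items():
        pattern = re.compile(r'\b' + re.escape(name) + r'\b')
        # A function replacement keeps backslashes in repl literal.
        text = pattern.sub(lambda _m, r=repl: r, text)
    return text


def extract_functions(text):
    """Return (name, body) pairs for every FUNCTION_BEGIN/FUNCTION_END block."""
    return [(m['name'], m['body'].strip()) for m in FUNC.finditer(text)]


# Made-up input for illustration; not copied from compiler-rt.
SAMPLE = '''# 1 "example.S"
FUNCTION_BEGIN __hexagon_example
  { RESULT = add(RESULT, r1)
    jumpr r31 }
FUNCTION_END __hexagon_example
'''

cleaned = strip_linemarkers(SAMPLE)
name, body = extract_functions(cleaned)[0]
body = substitute_defs(body, {"RESULT": "r0"})
print(name)
print(body)
```

Using `finditer` rather than a single `search` also handles files that contain more than one function, which the script above currently skips.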

ziglang/zig #21579