Open jkenda opened 7 months ago
You're right; it's important to optimize assembly code to avoid unnecessary instructions, especially in performance-critical areas like system call wrappers. Here's a more streamlined version of the GAS assembly code for a syscall that takes six arguments and adheres to your feedback:

    .section .text
    .global syscall_example
    .type syscall_example, @function
    syscall_example:
        # RCX and R11 are overwritten by the syscall instruction itself.
        # Both are caller-saved under the System V ABI, so these pushes
        # matter only if the caller expects them to survive the call.
        push %rcx
        push %r11
        # Prepare arguments
        # Arguments 1-3 (RDI, RSI, RDX) are already correctly placed
        mov %rcx, %r10      # Move Arg4 from RCX to R10
        # Arguments 5 and 6 (R8, R9) are already correctly placed
        # Load syscall number into RAX (hypothetical syscall number 400)
        mov $400, %rax
        # Make the syscall
        syscall
        # Restore clobbered registers
        pop %r11
        pop %rcx
        # Return to caller, result from syscall will be in RAX
        ret

- Reduced register saving: Only `RCX` and `R11` are pushed to the stack before the syscall. These are the registers that the `syscall` instruction itself overwrites (other than the argument and return value registers). Under the System V AMD64 calling convention both are caller-saved, so a wrapper that is only called from ABI-conforming code could drop the pushes as well. All other registers are left alone, assuming the kernel preserves them as the ABI specifies.
- Eliminated redundant moves: The `mov` instructions that simply moved register values into themselves are gone. Only necessary moves remain, such as moving `RCX` to `R10` to place the fourth argument where the kernel expects it.
- Direct use of argument registers: `RDI`, `RSI`, `RDX`, `R8`, and `R9` already contain the correct arguments when the function is called, so no operations are performed on them.
This version of the syscall wrapper keeps overhead low by preserving only the state that the `syscall` instruction can clobber, which matters in low-level code where performance and predictability are critical.
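The same register placement can be sketched from C with inline assembly, which is how a libc-free runtime might issue the call. This is a minimal sketch, not the issue author's code: `raw_syscall6` and `map_one_page` are hypothetical names, the example uses `mmap` only because it takes six arguments, and the libc fallback exists only so the sketch builds on targets other than x86-64 Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>     /* PROT_* and MAP_* constants */
#include <sys/syscall.h>  /* SYS_mmap, SYS_munmap */
#include <unistd.h>       /* syscall(), used only in the fallback branch */

/* Hypothetical helper: issue a six-argument syscall. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    register long r10 __asm__("r10") = a4;  /* arg 4 goes in R10, not RCX */
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory"); /* clobbered by SYSCALL */
    return ret;               /* raw syscalls return -errno on failure */
#else
    /* Assumption: elsewhere, defer to libc (returns -1 and sets errno). */
    return syscall(nr, a1, a2, a3, a4, a5, a6);
#endif
}

/* Usage sketch: map one anonymous page, touch it, unmap it. */
static int map_one_page(void)
{
    long p = raw_syscall6(SYS_mmap, 0, 4096, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p >= -4095 && p < 0)
        return -1;            /* error range of a raw syscall result */
    volatile char *page = (volatile char *)p;
    page[0] = 42;
    assert(page[0] == 42);
    return (int)raw_syscall6(SYS_munmap, p, 4096, 0, 0, 0, 0);
}
```

Listing `rcx` and `r11` as clobbers lets the compiler decide whether anything needs saving, which plays the role of the explicit `push`/`pop` pair in the hand-written version.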
For syscall wrappers in both AArch64 and RV64 architectures, the essential idea is to save only the registers that are either used for passing syscall arguments (which might not need saving unless altered within the wrapper itself) or clobbered by the `svc` (for ARM) or `ecall` (for RISC-V) instructions.
AArch64:

- `x0`-`x18` are caller-saved general-purpose registers (`x0`-`x7` carry function arguments) and can be clobbered by a call. `x19`-`x28` are callee-saved, meaning any function (including a syscall wrapper) that uses them must save and restore them if they are modified.
- `x30` (link register) and `sp` (stack pointer) are crucial for function return and stack integrity, respectively.
- For Linux syscalls, up to six arguments are passed in `x0`-`x5`, and the syscall number is placed in `x8`. In practice, many syscalls use fewer argument registers. It's common to preserve only `x19`-`x29`, `x30`, and stack pointer adjustments, depending on what your function modifies.
RV64:

- `a0`-`a7` are the argument registers and are naturally overwritten. For Linux syscalls, arguments go in `a0`-`a5` and the syscall number in `a7`. `ra` (return address) and `s0`-`s11` (saved registers) need to be preserved if used.
- `ecall` performs the environment call. The result comes back in `a0` (and in `a1` for some interfaces); treat the other `a` registers as potentially modified, depending on the syscall.

In your wrapper:

- AArch64: there is no need to save `x19` to `x28` unless your syscall wrapper specifically needs to use them for some computation or handling before or after the syscall.
- RV64: saving `ra` and any used `s` registers (like `s0`) is necessary only if they are used within your wrapper. If your wrapper directly passes arguments that come from the caller and does not use any `s` registers or modify the return address, you can skip saving these.

Here's a more optimized version of the syscall wrappers, where unnecessary saves are avoided, based on typical usage:
AArch64 Optimized:

    .global syscall_example
    .type syscall_example, %function
    syscall_example:
        // Syscall number in x8
        mov x8, #400        // Hypothetical syscall number
        // Perform syscall
        svc #0
        ret                 // Return to caller, result will be in x0

RV64 Optimized:

    .global syscall_example
    .type syscall_example, @function
    syscall_example:
        # Syscall number in a7
        li a7, 400          # Hypothetical syscall number
        # Perform syscall
        ecall
        ret                 # Return to caller, result will be in a0

In these examples, no registers are explicitly saved/restored because the syscall itself will overwrite the argument registers, and if the wrapper doesn't use any additional registers, there's no need to save them. This makes the code simpler and more efficient but assumes that the caller handles register preservation if necessary.
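For comparison, the same two wrappers can be written as C inline assembly, with the clobbers declared to the compiler instead of saved by hand. This is a sketch under assumptions: `raw_syscall0` is a hypothetical helper, it covers only zero-argument syscalls, and the final branch falls back to libc so the code still builds on other targets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* getpid(), syscall() */

/* Hypothetical helper: zero-argument syscall, one branch per architecture. */
static long raw_syscall0(long nr)
{
#if defined(__aarch64__) && defined(__linux__)
    register long x8 __asm__("x8") = nr;  /* syscall number in x8 */
    register long x0 __asm__("x0");       /* result comes back in x0 */
    __asm__ volatile ("svc #0" : "=r"(x0) : "r"(x8) : "memory", "cc");
    return x0;
#elif defined(__riscv) && (__riscv_xlen == 64) && defined(__linux__)
    register long a7 __asm__("a7") = nr;  /* syscall number in a7 */
    register long a0 __asm__("a0");       /* result comes back in a0 */
    __asm__ volatile ("ecall" : "=r"(a0) : "r"(a7) : "memory");
    return a0;
#else
    /* Assumption: on any other target, defer to libc. */
    return syscall(nr);
#endif
}

/* Usage sketch: the raw call should agree with libc's getpid(). */
static int pid_matches(void)
{
    long pid = raw_syscall0(SYS_getpid);
    assert(pid > 0);
    return pid == (long)getpid();
}
```

Taking the number from `SYS_getpid` rather than hardcoding it avoids relying on per-architecture syscall numbering.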
Below is a table summarizing target triples for various architectures and operating systems, using typical configurations for the architectures and toolchains discussed. The vendor field varies by toolchain: `unknown`, `pc`, and `none` all occur in practice.

Architecture | Vendor | Operating System | Environment | Common Triple |
---|---|---|---|---|
x86_64 | pc | Windows | MSVC | x86_64-pc-windows-msvc |
x86_64 | w64 | Windows | MinGW | x86_64-w64-mingw32 |
x86_64 | pc | Cygwin | GNU | x86_64-pc-cygwin |
x86_64 | unknown | Linux | GNU | x86_64-unknown-linux-gnu |
x86_64 | unknown | Linux | musl | x86_64-unknown-linux-musl |
AArch64 | none | Linux | GNU | aarch64-none-linux-gnu |
AArch64 | apple | macOS | - | aarch64-apple-darwin |
ARM | none | Linux | GNU EABI | arm-none-linux-gnueabi |
ARM | none | Linux | GNU EABI HF | arm-none-linux-gnueabihf |
RISC-V 64 | unknown | Linux | GNU | riscv64-unknown-linux-gnu |
RISC-V 64 | unknown | Bare-metal | - | riscv64-unknown-elf |

- Architecture: `x86_64` for 64-bit x86 processors or `aarch64` for 64-bit ARM CPUs.
- Vendor: often `unknown`, `pc`, or `none` for generic or open systems, but can specify `apple` for Apple-specific toolchains or `w64` for 64-bit Windows with MinGW.
- Operating system: `windows`, `linux`, or `darwin` (macOS).
- Environment: `gnu` for the GNU ABI (glibc) or `musl` for the musl libc. For Windows, `msvc` denotes Microsoft Visual C++, while MinGW toolchains use `mingw32`.

These target triples tell compilers and toolchains what code to generate for a given target environment. They are crucial in cross-compilation scenarios, where the development and target systems differ.
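Inside a source file, the architecture and operating system parts of the triple show up as compiler-predefined macros, which is what a per-target syscall file would branch on. The sketch below is illustrative only: `target_arch`, `target_os`, and `format_target` are hypothetical names, and the string it builds is a simplified arch-os pair, not a full triple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Architecture, from macros predefined by GCC, Clang, and MSVC. */
static const char *target_arch(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "aarch64";
#elif defined(__arm__)
    return "arm";
#elif defined(__riscv) && (__riscv_xlen == 64)
    return "riscv64";
#else
    return "unknown";
#endif
}

/* Operating system, same idea. */
static const char *target_os(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__linux__)
    return "linux";
#elif defined(__APPLE__)
    return "darwin";
#else
    return "none";
#endif
}

/* Writes "<arch>-<os>" into out; returns the length snprintf reports. */
static int format_target(char *out, size_t n)
{
    assert(out != NULL && n > 0);
    return snprintf(out, n, "%s-%s", target_arch(), target_os());
}
```

To see the full triple a toolchain was built for, `gcc -dumpmachine` prints it directly.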
The backend can easily be platform-independent (at least on UNIX) if a wrapper function is created for each system call and the set of wrappers is swapped depending on the platform. This immediately rules out Windows as a target platform, but perhaps syscalls can somehow be emulated on it, or a higher-level concept can be devised down the line.
Make a file with system call functions for every target so you don't have to link with libc (`-nostdlib`).

https://filippo.io/linux-syscall-table/
https://opensource.apple.com/source/xnu/xnu-1504.3.12/bsd/kern/syscalls.master
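One possible shape for such a per-target file: a single `sys_write` interface whose syscall number is chosen at compile time. This is a sketch under assumptions, not the layout used in the linked repository. `sys_write`, `NR_WRITE`, `HAVE_RAW_WRITE`, and `write_roundtrip` are hypothetical names, only x86-64 Linux and x86-64 macOS get a raw path, and the libc fallback and the pipe-based check exist only so the example is testable (a `-nostdlib` build would keep just the inline-assembly branch).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <unistd.h>   /* pipe(), read(), close(), write() fallback */

#if defined(__x86_64__) && defined(__linux__)
#define NR_WRITE 1L           /* write on Linux x86-64 */
#define HAVE_RAW_WRITE 1
#elif defined(__x86_64__) && defined(__APPLE__)
#define NR_WRITE 0x2000004L   /* macOS: BSD syscall class 0x2000000 + 4 */
#define HAVE_RAW_WRITE 1
#endif

/* One interface for the backend; the platform decides the body. */
static long sys_write(int fd, const void *buf, unsigned long len)
{
#ifdef HAVE_RAW_WRITE
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(NR_WRITE), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;   /* error conventions differ per OS; not handled here */
#else
    /* Assumption: other targets defer to libc in this sketch. */
    return (long)write(fd, buf, len);
#endif
}

/* Usage sketch: write two bytes into a pipe and read them back. */
static int write_roundtrip(void)
{
    int fds[2];
    char back[2] = {0};
    if (pipe(fds) != 0)
        return -1;
    assert(sys_write(fds[1], "ok", 2) == 2);
    assert(read(fds[0], back, 2) == 2);
    close(fds[0]);
    close(fds[1]);
    return memcmp(back, "ok", 2);
}
```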
See https://github.com/jkenda/qbe-example