nim-lang / RFCs

A repository for your Nim proposals.

Support alignment at the type level #545

Open khaledh opened 11 months ago

khaledh commented 11 months ago

Abstract

Add support for the align pragma at the type level.

Motivation

Currently the {.align.} pragma applies only to variables and object fields. Sometimes it is useful to request that all instances of a particular type be aligned. It is also impossible to align a ref type directly; the only workaround is to define an aligned field within the type itself.

Description

C provides the _Alignas specifier (alignas since C23)[0], and C++ supports the alignas specifier directly on class types[1]. Type-level alignment makes it easy to ensure that instances of a type are always properly aligned, without having to align them at each instantiation. It would also make it possible to define aligned ref types (currently not possible).

[0] https://en.cppreference.com/w/c/language/_Alignas
[1] https://en.cppreference.com/w/cpp/language/alignas

Code Examples

This is the behaviour today:

type
  MyArr = ref array[4, int]
  MyObj = ref object
  MyArrObj = ref object
    arr {.align(256).}: array[4, int]

var arr {.align(256).}: MyArr
var obj {.align(256).}: MyObj
var arrObj: MyArrObj

arr = new MyArr         # not aligned (though the pointer variable itself is aligned)
obj = new MyObj         # not aligned (though the pointer variable itself is aligned)
arrObj = new MyArrObj   # arrObj.arr is aligned
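The behaviour described above can be checked at runtime by testing the addresses involved. The following is a minimal sketch, not part of the RFC: the names Payload, AlignedBox, and isAligned are hypothetical, and whether the heap-allocated field is actually aligned may depend on the compiler version and the memory management mode in use, so no output is claimed here.

```nim
# Hypothetical names, for illustration only.
type
  Payload = array[4, int]
  AlignedBox = ref object
    data {.align(256).}: Payload   # field-level alignment (the current workaround)

proc isAligned(p: pointer, alignment: uint): bool =
  ## True when the address `p` is a multiple of `alignment`
  ## (`alignment` is assumed to be a power of two).
  (cast[uint](p) and (alignment - 1)) == 0

var local {.align(256).}: Payload  # variable-level alignment
var box = new AlignedBox           # heap allocation with an aligned field

echo "local aligned:    ", isAligned(addr local, 256)
echo "box.data aligned: ", isAligned(addr box.data, 256)
```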

Trying to align a type is an error:

type
  MyArr {.align(256).} = array[4, int]  # invalid pragma: align(256)
  MyArrRef = ref MyArr
  MyObj {.align(256).} = object          # invalid pragma: align(256)
  MyObjRef = ref MyObj

The proposal is to make the above example valid, so that instances of an aligned type are always aligned. The following examples should work:

type
  MyObj {.align(256).} = object

var obj = MyObj()       # obj should be aligned

type
  MyObj {.align(256).} = object
  MyObjRef = ref MyObj

var obj: MyObjRef
new(obj)                # the object `MyObj` pointed to by `obj` should be aligned

Backwards Compatibility

This should be backwards compatible.

Araq commented 11 months ago

Does this RFC also mean to affect new(x) and MyObjectConstr(fields...)? It's not clear.

khaledh commented 11 months ago

> Does this RFC also mean to affect new(x) and MyObjectConstr(fields...)? It's not clear.

Yes. I added more examples in the code examples section.

raw-bin commented 11 months ago

Apologies, but this wasn’t clear to me: would this enable aligned heap allocations? For example, if I want to declare an array that starts on the heap at an address with a desired alignment, would that be possible? I don’t think it currently is, and that limits certain use cases like page allocators for programming CPU MMU page tables. Thanks.

mratsim commented 11 months ago

> Apologies, but this wasn’t clear to me: would this enable aligned heap allocations? For example, if I want to declare an array that starts on the heap at an address with a desired alignment, would that be possible? I don’t think it currently is, and that limits certain use cases like page allocators for programming CPU MMU page tables. Thanks.

You can already do that today; this RFC is extra sugar to avoid having to declare aligned fields.

type MyArr = object
  raw {.align: 64.}: array[256, byte]

will be 64-byte aligned.

This RFC would allow

type MyArr {.align: 64.} = array[256, byte]

as an alternative to enforce alignment.
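Until something like that exists, the aligned-field approach can be made reusable with a generic wrapper. This is only a sketch under assumptions: Aligned64 is a hypothetical name, not an existing Nim or library type, and the check simply inspects the address of one global variable.

```nim
# Hypothetical generic wrapper: the aligned field raises the
# alignment of the enclosing object.
type
  Aligned64[T] = object
    value {.align(64).}: T

var buf: Aligned64[array[256, byte]]
buf.value[0] = 1

# Remainder of the wrapper's address modulo the requested alignment.
echo cast[uint](addr buf) mod 64
```

The cost compared with the proposed type-level pragma is the extra `.value` access at every use site.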

Regarding your heap alignment, write your own heap-aligned allocator: https://github.com/mratsim/constantine/blob/777cf55/constantine/platforms/allocs.nim#L101-L134

when defined(windows):
  proc aligned_alloc_windows(size, alignment: int): pointer {.tags:[HeapAlloc],importc:"_aligned_malloc", header:"<malloc.h>".}
    # Beware of the arg order!
  proc aligned_alloc(alignment, size: int): pointer {.inline.} =
    aligned_alloc_windows(size, alignment)
  proc aligned_free(p: pointer){.tags:[HeapAlloc],importc:"_aligned_free", header:"<malloc.h>".}
elif defined(osx):
  proc posix_memalign(mem: var pointer, alignment, size: int){.tags:[HeapAlloc],importc, header:"<stdlib.h>".}
  proc aligned_alloc(alignment, size: int): pointer {.inline.} =
    posix_memalign(result, alignment, size)
  proc aligned_free(p: pointer) {.tags:[HeapAlloc], importc: "free", header: "<stdlib.h>".}
else:
  proc aligned_alloc(alignment, size: int): pointer {.tags:[HeapAlloc],importc, header:"<stdlib.h>".}
  proc aligned_free(p: pointer) {.tags:[HeapAlloc], importc: "free", header: "<stdlib.h>".}

proc isPowerOfTwo(n: int): bool {.inline.} =
  (n and (n - 1)) == 0 and (n != 0)

func roundNextMultipleOf(x: int, n: static int): int {.inline.} =
  ## Round the input to the next multiple of "n"
  when n.isPowerOfTwo():
    # n is a power of 2. (If compiler cannot prove that x>0 it does not make the optim)
    result = (x + n - 1) and not(n - 1)
  else:
    result = x.ceilDiv_vartime(n) * n

proc allocHeapAligned*(T: typedesc, alignment: static Natural): ptr T {.inline.} =
  # aligned_alloc requires allocating in multiple of the alignment.
  let # Cannot be static with bitfields. Workaround https://github.com/nim-lang/Nim/issues/19040
    size = sizeof(T)
    requiredMem = size.roundNextMultipleOf(alignment)

  cast[ptr T](aligned_alloc(alignment, requiredMem))

proc allocHeapArrayAligned*(T: typedesc, len: int, alignment: static Natural): ptr UncheckedArray[T] {.inline.} =
  # aligned_alloc requires allocating in multiple of the alignment.
  let
    size = sizeof(T) * len
    requiredMem = size.roundNextMultipleOf(alignment)

  cast[ptr UncheckedArray[T]](aligned_alloc(alignment, requiredMem))

proc freeHeapAligned*(p: pointer) {.inline.} =
  aligned_free(p)

In general, though, pages are already 4 kB aligned, so this isn't needed if you call mmap or VirtualAlloc directly, no?

And if not, you can follow what I do here to request 16kB alignment: https://github.com/mratsim/weave/blob/b6255af/weave/memory/memory_pools.nim#L232-L236 (poisonMemRegion and guardedMemRegion are hooks into AddressSanitizer)

raw-bin commented 11 months ago

Thanks for the comprehensive answer, appreciated!

My use case is a bare-metal run-time environment where I use picolibc's *alloc() implementations to access a linear physical link-time demarcated heap area.

For programming the MMU's page tables, I use malloc'd page-sized chunks. What I wanted to do instead was use Nim's create, but I would get misaligned page-sized blocks and couldn't figure out how to force the alignment.

I'll tinker with your suggestions above.

khaledh commented 11 months ago

My use case is similar, allocating page tables for x86_64. For early boot, I have a simple static array that I use as a bump-pointer based heap. I implement malloc, realloc, and free based on this heap, and compile with -d:useMalloc. Then I use the {.align(...).} pragma and let Nim do the hard work, which basically requests enough raw memory from my malloc and does the alignment itself (see memalloc.nim).
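For illustration, the general idea of requesting extra raw memory and aligning within it can be sketched as follows. This is not the code from memalloc.nim nor the code referred to in this comment; alignUp and allocAlignedRaw are hypothetical names, and the sketch assumes the alignment is a power of two.

```nim
proc alignUp(address, alignment: uint): uint =
  ## Round `address` up to the next multiple of `alignment`
  ## (`alignment` is assumed to be a power of two).
  (address + alignment - 1) and not (alignment - 1)

proc allocAlignedRaw(size, alignment: Natural): tuple[raw, aligned: pointer] =
  ## Over-allocate by `alignment - 1` bytes, then round the start address up.
  ## `raw` is the pointer returned by the allocator and is the one to free.
  let raw = alloc(size + alignment - 1)
  let aligned = cast[pointer](alignUp(cast[uint](raw), uint(alignment)))
  (raw: raw, aligned: aligned)

# Usage: a page-sized, page-aligned block carved out of a larger allocation.
let page = allocAlignedRaw(4096, 4096)
doAssert cast[uint](page.aligned) mod 4096 == 0
dealloc(page.raw)
```

A real allocator also has to recover the raw pointer from the aligned one when freeing, typically by storing an offset just before the aligned block; the sketch sidesteps this by returning both pointers.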

Here's my code if you're interested:

raw-bin commented 11 months ago

Super useful! Thank you!