nim-lang / RFCs

add pragma for data segment variable allocations #453

Closed: ct-clmsn closed this issue 2 years ago

ct-clmsn commented 2 years ago

Abstract

Add a pragma for variables that provides data segment allocation with C-style static keyword semantics.

Motivation

A variable in C and C++ prefixed with the 'static' keyword is added into the data segment of the program (Rust provides similar functionality). In high performance computing, numerical analysis, scientific computing, and embedded-device applications this feature can provide additional performance benefits; when a program is loaded, memory is allocated for the variable. The memory allocation is resident for the duration of the program. The static pragma would only be applicable to variables defined outside of the scope of a type definition. Ordinal types that are not enums, complex numbers, real number types, char, pointers/addresses, and arrays would be permitted to use this pragma (and potentially other types that the Nim code generator/compiler does not currently allocate on the heap or dynamically at program runtime). String literals may be outside the scope of this feature request, depending on how they are implemented: if string literals are implemented as fixed-size arrays of bytes, this feature request would be applicable; if they are allocated on the heap, it is not.

The author would prefer the pragma {.allocStatic.}, but {.allocDataSegment.} may be more appropriate, as it would not lead users to conflate the pragma with the Nim keyword static.

Description

The author cannot find a mechanism that currently provides this functionality in Nim. The pragma would offer the following syntax:

var x : T {.allocStatic.} where T is one of the types enumerated in the 'Motivation' section.

Examples

var x : T ... {.allocStatic.}

var y : array[10, T] {.allocStatic.}

Before

Not applicable

After

Please see the Examples section above.

Backward incompatibility

This request does not impact type definitions and only impacts variables with types that are currently allocated on the stack. The author does not currently believe there would be backward compatibility issues. There may be a conflict with how reference counting currently works, and potentially with garbage collection (GC), but since this request is for variables allocated on the stack, that should mitigate or marginalize any impact on the GC and reference counting implementations. The intent of this request is to provide basic functionality.

mratsim commented 2 years ago

This is somewhat of a duplicate of addressable consts, progmem and rom: https://github.com/nim-lang/RFCs/issues/257#issuecomment-805816584, except that your use case is already possible today since your variables are non-consts.

If you want more specificity, you can use the codegenDecl pragma today: https://nim-lang.org/docs/manual.html#implementation-specific-pragmas-codegendecl-pragma

For instance

var a {.codegenDecl: "static $# $#".}: int

will generate static int a
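As a rough sketch of how this looks in a complete program, assuming the C or C++ backend (codegenDecl is backend-specific), the snippet below applies the same pattern to an int and to a fixed-size array. The names counter and sendBuf are hypothetical and not from the thread; the exact C type name and symbol name that Nim emits may differ from the simplified "static int a" shown above.

```nim
# Sketch only: assumes compilation with the C backend (nim c).
# In the codegenDecl pattern, the first $# is replaced by the C type
# and the second $# by the variable name.
var counter {.codegenDecl: "static $# $#".}: int

# Same pattern applied to a fixed-size byte buffer (hypothetical name).
var sendBuf {.codegenDecl: "static $# $#".}: array[1024, uint8]

counter = 41
inc counter
sendBuf[0] = 0xFF'u8

echo counter      # 42
echo sendBuf.len  # 1024
```

Note that in C, a variable declared at file scope already has static storage duration; adding the static keyword there changes its linkage (the symbol becomes private to the translation unit) rather than where it is stored.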

> A variable in C and C++ prefixed with the 'static' keyword is added into the data segment of the program (Rust provides similar functionality). In high performance computing, numerical analysis, scientific computing, and embedded-device applications this feature can provide additional performance benefits; when a program is loaded, memory is allocated for the variable. The memory allocation is resident for the duration of the program. The static pragma would only be applicable to variables defined outside of the scope of a type definition.

I'm confused about what you want here:

> String literals may be outside the scope of this feature request depending on how string literals are implemented - if string literals are implemented as fixed size arrays of bytes this feature request would be applicable, if string literals are allocated on the heap then this feature request is not applicable.

String literals are fixed-size arrays of chars.

Additionally, while not specifically mentioned in your request, if you want local variables to be stored not on the stack but in a specific location of the program memory and initialized at program startup, you can use the {.global.} pragma: https://nim-lang.org/docs/manual.html#pragmas-global-pragma
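A minimal sketch of the {.global.} pragma on a proc-local variable follows. The proc name nextId and the variable name counter are made up for illustration.

```nim
proc nextId(): int =
  # `counter` is stored in a global location instead of on nextId's
  # stack frame, so its value persists across calls.
  var counter {.global.} = 0
  inc counter
  result = counter

echo nextId()  # 1
echo nextId()  # 2
echo nextId()  # 3
```

This is roughly the Nim counterpart of a function-local static variable in C: the name is only visible inside the proc, but the storage lives for the whole program.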

ct-clmsn commented 2 years ago

@mratsim - My understanding of the static keyword is consistent with the definition you provided; the phrasing is from prior experience working with different C compiler implementations and their associated optimizers.

Occasionally, the static keyword is handled differently by a vendor or implementer; sometimes compilers optimize for a smaller executable on disk, which means the static memory is left to be allocated during program initialization, a sort of 'late binding' situation. Users sometimes have to wrestle with compiler flags to avoid this storage/disk optimization.

As to the second comment, predefining static memory for typed communication buffers (uint8, int*, double, etc.) can yield performance benefits in distributed communication settings. As you highlighted, the memory is available when the program on disk is loaded into memory (as the compiler created a physical allocation in the program for the buffer).
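For illustration, a preallocated communication buffer of this kind might look like the sketch below. It assumes the C backend, where a module-level var is emitted as a C global and therefore exists for the whole run of the program without any heap allocation. BufLen, recvBuf and fill are hypothetical names, not from the thread.

```nim
const BufLen = 4096

# Module-level fixed-size buffer: no heap allocation, lifetime is the
# whole program. (Hypothetical receive buffer for float64 payloads.)
var recvBuf: array[BufLen, float64]

proc fill(buf: var openArray[float64], value: float64) =
  ## Overwrites every element of `buf` in place with `value`.
  for i in 0 ..< buf.len:
    buf[i] = value

fill(recvBuf, 1.5)
echo recvBuf[BufLen - 1]  # 1.5
echo sizeof(recvBuf)      # 32768 (4096 elements * 8 bytes)
```

Whether the buffer ends up occupying space in the executable on disk or is only reserved and zeroed at load time is decided by the C compiler and linker, which is the behaviour discussed above.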

Thank you for the response, commentary, and for helping steer the clarification of this request!

The pragma you provided is what I was hoping to achieve:

var a {.codegenDecl: "static $# $#".}: int

As a new user, I did not realize the $# syntax in the codegenDecl pragma was possible. Your clarification of how the language handles variables annotated with {.global.} was also helpful. Thanks! Consider this request closed!