chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Feature request: Support for generic-function "static" local variables #12281

Open BryantLam opened 5 years ago

BryantLam commented 5 years ago

Feature request for C++-equivalent static variables (that live beyond the lifetime of one function call) in all functions, but specifically for generic functions. Today, a non-generic function can imitate a static variable by making a module-level private global variable and using that, but there's no such workaround for a generic function.

I want the equivalent of this behavior for this feature:

#include <iostream>

template <typename T>
void do_it(T x) {
  static long long xx = 0;
  xx += x;
  std::cout << "do_it(" << x << ") = " << xx << '\n';
}

int main() {
  short a = 5;
  int b = 2;

  do_it(a);
  do_it(a);
  do_it(b);
  do_it(b);
  do_it(a);
  do_it(b);
}

# Terminal output
do_it(5) = 5
do_it(5) = 10
do_it(2) = 2
do_it(2) = 4
do_it(5) = 15
do_it(2) = 6

bradcray commented 5 years ago

This is something I've also wanted for some time, but put off wrestling with after the largely negative reaction it caused the last time it came up:

https://sourceforge.net/p/chapel/mailman/message/35219406/ https://www.facebook.com/ChapelLanguage/posts/1761242420787583

Beyond the "leads to poor software engineering practice" concerns mentioned on these threads, there's also arguably a question about which locale such variables should live on... Always on locale 0? On the locale on which the function was first called? Or is there a copy of the variable per function instantiation x locale?

(Note that I'm still game for such a feature, just pointing out that it didn't have many supporters the last time around).

BryantLam commented 5 years ago

Motivation

We're not debating the merits of "poor software engineering practice" here, though in general I agree that static/global variables are bad programming practice when alternative methods are more maintainable. That said, there are coding paradigms that cannot be achieved effectively without generically instantiated static variables:

template <typename T>
void do_it() {
  static const T t = T();
}

In this case, t is a const, but it's created on a per-type basis. You could do this construction as a global variable, but that requires the library developer to anticipate all the types that this generic function could be called with, and then somehow have that generic function use the specific instance of the global variable for that type. As a consequence, if constructing T were expensive in compute or memory, it would be wasteful to do so for types that are never used. (In this case, the function would actually be a C++ template that users instantiate themselves, so the template is instantiated on a per-type-used basis instead of living in some shared library.)

Anyway! Maybe there's some design space exploration into how to instantiate a type-specific global variable that depends on a type provided to a generic function. Static variables are at least a known quantity.
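To make the per-type, pay-only-if-used behavior concrete, here is a minimal C++ sketch. The names Table, Other, construction_log, get_constant, and example are hypothetical and exist only for illustration. Each type records when its constructor runs, so the log shows which per-type constants were actually built.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of which types had their per-type constant constructed.
inline std::vector<std::string>& construction_log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-ins for types that are expensive to construct.
struct Table {
  Table() { construction_log().push_back("Table"); }
};
struct Other {
  Other() { construction_log().push_back("Other"); }
};

template <typename T>
const T& get_constant() {
  static const T t = T();  // built once per T, on the first call for that T
  return t;
}

void example() {
  const Table& a = get_constant<Table>();
  const Table& b = get_constant<Table>();
  assert(&a == &b);                        // same object on every call
  assert(construction_log().size() == 1);  // Table built once; Other never built
}
```

Because get_constant<Other> is never instantiated, no Other object is ever constructed, which is the saving described above.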

Locality

What do module global variables do today? It would be easy enough to just copy that behavior for consistency. I'm all about consistency.

bradcray commented 5 years ago

I buy that argument (not that I was among those needing convincing...).

W.r.t. the locality question, module-scope variables today are allocated on locale 0, and my intuition matches yours that we'd probably want to do the same thing here for consistency and simplicity (but was keeping quiet, not wanting to lead you in any specific direction...).

IIRC, static local variable initializers in C++ are never invoked if the containing function is never called. That is, in:

proc foo() {
  static var x = initx();
}

proc bar() {
  static var y = inity();
}

foo();

The function inity() would never be called since bar() was not. So I think the implementation here will need to be a bit more subtle than simply "move static variables to the module scope on a per-concrete function basis" (assuming we want to do the same thing, anyway).
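For reference, the C++ behavior being described can be checked with a small sketch. The counters init_calls_x and init_calls_y and the function example are hypothetical additions used only to observe when each initializer runs.

```cpp
#include <cassert>

// Hypothetical counters that record how often each initializer runs.
int init_calls_x = 0;
int init_calls_y = 0;

int initx() { ++init_calls_x; return 1; }
int inity() { ++init_calls_y; return 2; }

int foo() {
  static int x = initx();  // runs the first time control reaches this line
  return x;
}

int bar() {
  static int y = inity();  // never runs unless bar() is called
  return y;
}

void example() {
  foo();
  foo();
  assert(init_calls_x == 1);  // initialized once, not once per call
  assert(init_calls_y == 0);  // bar() was not called, so inity() never ran
}
```

A plain module-scope variable would instead be initialized when the module is, which is why the translation needs something like a guarded first-use initialization.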

Now for the hard choice: What to call such variables? static clearly has precedent, but I've had end-users accuse that term of being a compiler-/implementation-focused term rather than intuitive to laypeople. That's what caused me to kick off the thread and Facebook post above, not anticipating that it would blow up in my face. Collecting some of the results there (and filtering ones that I think are obvious non-starters):

retain[ed] var x = 1;
maintain var x = 1;
preserve var x = 1;
persist[ent] var x = 1;
save var x = 1;
once var x = 1;
remain var x = 1;
global var x = 1;
conserve var x = 1;
sustain var x = 1;

Another proposal I unearthed suggested we do this by permitting local modules:

proc foo(x: int) {
  module Bar {
    var x = 1;
  }
}

bpr commented 5 years ago

The reason that the existing Chapel workaround mentioned at the top (module-level private global variable) fails to satisfy the requirement is that even with a generic do_it function, there is only one module level variable. If modules could be parameterized, we wouldn't have that issue.

module MyStatic(type T) {
    var xx : T;

    proc do_it(x: T) {
        xx += x;
        writef("do_it(%i) = %i\n", x, xx);
    }
}

The D language has a feature like this, where you can create a kind of templated namespace, and of course, parameterized modules are an important feature in the ML family of languages.
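As a rough C++ analogue of the parameterized-module idea (a sketch, not a proposal for Chapel syntax), a class template with static members behaves like a namespace parameterized by a type. The struct name MyStatic mirrors the module above; example is a hypothetical helper.

```cpp
#include <cassert>

// Each instantiation MyStatic<T> has its own copy of xx, much like a
// module instantiated per type would.
template <typename T>
struct MyStatic {
  static T xx;
  static T do_it(T x) {
    xx += x;
    return xx;
  }
};

// Out-of-class definition of the per-instantiation static member.
template <typename T>
T MyStatic<T>::xx = T();

void example() {
  assert(MyStatic<short>::do_it(5) == 5);
  assert(MyStatic<short>::do_it(5) == 10);
  assert(MyStatic<int>::do_it(2) == 2);   // int has a separate counter
  assert(MyStatic<short>::do_it(5) == 15);
}
```

The cost of this approach is that callers must name the type explicitly (MyStatic<short>::do_it) rather than relying on argument deduction.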

mppf commented 5 years ago

Another possible direction would be to allow a generic variable that is instantiated per-type.

But, that'd probably be more confusing than having a module which could accept a type argument (and instantiate variables and functions inside with it).

Of the ideas so far, I find parameterized modules the most appealing.
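For comparison, C++14 variable templates are one existing form of the "generic variable instantiated per-type" idea. The sketch below assumes C++14 or later; running_total and example are hypothetical names, and do_it mirrors the function from the original post.

```cpp
#include <cassert>

// One variable per type T, created only for the types actually used.
template <typename T>
long long running_total = 0;

template <typename T>
long long do_it(T x) {
  running_total<T> += x;
  return running_total<T>;
}

void example() {
  short a = 5;
  int b = 2;
  assert(do_it(a) == 5);
  assert(do_it(a) == 10);
  assert(do_it(b) == 2);
  assert(do_it(b) == 4);
  assert(do_it(a) == 15);
  assert(do_it(b) == 6);
}
```

This reproduces the sequence of totals from the original post, but the variable is visible to any code in the enclosing scope rather than only to do_it.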

bradcray commented 2 years ago

FWIW, I found myself wanting this (static local variables) yet again while working on Advent of Code 2021.

Thinking about the implementation a bit more:

So I think the implementation here will need to be a bit more subtle than simply "move static variables to the module scope on a per-concrete function basis" (assuming we want to do the same thing, anyway).

I found myself wondering "Couldn't we implement this simply by introducing a static local variable into the generated procedures themselves?" (i.e., with the C back-end, implement it using C's support for static local variables). Given that we stamp out instantiations of generic functions, this would also provide the "static per function" capability that Bryant wanted in the OP. However, it would also mean that each locale would have its own copy of the static which... may be more confusing than beneficial. An expensive "fix" would be to have every execution of a procedure with a local static start with a compiler-introduced on Locales[0], and to define that such procedures always run on locale 0 (perhaps giving a performance warning if the user calls them on other locales when compiling with a --performance-hints flag?).

damianmoz commented 1 year ago

I found myself trying to figure out the optimal way to define an array of all the primes up to 1024 for a tiny task to do with FFTs. It only needs procedure scope. I would likely be running this routine on multiple locales, so I guess one needs a copy of that data on each of those locales. The overhead of grabbing them from the primary locale every time seems too high.

I did notice in this case that the original routine was generic, i.e. it was designed to handle 32-bit integers and 64-bit integers. I do not need 64-bit integers but I am sure that others might so I wrote the original C++ code to handle both. I can probably live without these tables having to live in generic routines.

Similar examples would be say polynomial coefficients within a function, e.g. the elementary mathematical functions. You need those available on every locale for the same reason. They are highly unlikely to be in generic routines.

damianmoz commented 1 year ago

I read a bit deeper and some of those other requests are talking about vars. Way more adventurous than what I am talking about.

All I was talking about is an array of params. That is, where everything is known (or computable) at compile time. What C calls static const.

param x[] = { 1, 2, 3, 4, 5 };

although something like

param x[1..6] = { 1:real(32), 2:real(32), 3:real(32), 4:real(32), 5:real(32), 6:real(32) };

would have to be supported, as would multi-dimensional arrays (and if possible records), again comprised of only things known (or computable from routines which are themselves param) at compile time.

It would be useful.
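As a point of comparison, this is roughly what the request looks like in C++17, using the primes-up-to-1024 table mentioned earlier. It is only a sketch: kLimit, is_prime, count_primes, make_primes, nth_prime, and example are hypothetical names, and trial division is used purely for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr int kLimit = 1024;

// Compile-time primality test by trial division.
constexpr bool is_prime(int n) {
  if (n < 2) return false;
  for (int d = 2; d * d <= n; ++d)
    if (n % d == 0) return false;
  return true;
}

// Number of primes <= kLimit, used to size the table.
constexpr std::size_t count_primes() {
  std::size_t c = 0;
  for (int n = 2; n <= kLimit; ++n)
    if (is_prime(n)) ++c;
  return c;
}

constexpr std::array<int, count_primes()> make_primes() {
  std::array<int, count_primes()> p{};
  std::size_t i = 0;
  for (int n = 2; n <= kLimit; ++n)
    if (is_prime(n)) p[i++] = n;
  return p;
}

static_assert(count_primes() == 172, "there are 172 primes up to 1024");

// Procedure-scope, read-only table computed by the compiler, so no
// runtime initialization or cross-process fetch is needed.
inline int nth_prime(std::size_t i) {
  static constexpr auto primes = make_primes();
  return primes[i];
}

void example() {
  assert(nth_prime(0) == 2);
  assert(nth_prime(count_primes() - 1) == 1021);  // largest prime <= 1024
}
```

Since the table is a compile-time constant baked into the program image, each process gets its own copy for free, which matches the "copy on every locale" requirement described above.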

bradcray commented 1 year ago

Hi @damianmoz : We would ultimately like to support param ranges, domains, and arrays, if not records, but are nowhere near that today due to the amount of effort required to teach the compiler how to compute on these types.

Though much more verbose, I believe that the only way you could get this param-ness today would be to support a routine that took a param and returned a param, along the lines of:

proc x(param i: int) param : int {
  select (i) {
    when 1 do return 1;
    when 2 do return 2;
    when 3 do return 3;
    when 4 do return 4;
    when 5 do return 5;
    when 6 do return 6;
    otherwise do compilerError("index out of bounds: x("+i:string+")");
  }
}

param p = x(4);
writeln(p);
writeln(isParam(p));

In any case, since you've clarified that you're looking for something different here, let's take this discussion to a new issue or Discourse thread to avoid bogging down this one.

damianmoz commented 1 year ago

OK. Will do. Thanks for the advice.