When should array arguments w/ unspecified domains get a chpl_opaque_array equivalent?

Summary of Problem

With #12105, array arguments in exported functions with unspecified domain will always get converted into taking chpl_external_arrays. However, in a generated library where only chpl_opaque_arrays get returned, is this still reasonable?

It's possible to get chpl_external_arrays from sources other than calling exported functions (there are runtime functions that can be called to make chpl_external_arrays in client code), so it's perfectly possible for a library that never returns a chpl_external_array to still want it as an argument in these cases. However, if only chpl_opaque_arrays are returned, the user might have intended to take a chpl_opaque_array instead.

It seems like potentially a warning should be generated when compiling code like this, at least. Since the user may have actually intended the behavior, the warning should be able to be turned off at compilation time.

We could also consider forcing the function argument to match the other function's return type when only a single array type is returned using chpl_opaque_array, but that will be more work (and potentially the wrong decision in some cases, such as if there are other functions that took in chpl_opaque_array and this was intended to be the chpl_external_array one).

Steps to Reproduce

Source Code:

use BlockDist;
const D = {1..5} dmapped Block({1..5});

export proc makeBlockArray(): [D] int {
  var x: [D] int;
  return x;
}

// Will generate a different signature than the return type for the previous
// function
export proc printBlock(x: [] int) {
  var output = "";
  var first = true;
  for idx in x.dom {
    if (first) {
      first = false;
    } else {
      output += ", ";
    }
    output += idx: string + "=";
    output += x[idx]: string;
  }
  writeln(output);
}

with client code:

#include <inttypes.h>

#include "lib/foo.h"

// Test of calling an exported function that takes an array
int main(int argc, char* argv[]) {
  // Initialize the Chapel runtime and standard modules
  chpl_library_init(argc, argv);
  chpl__init_foo(0, 0);

  chpl_opaque_array arr = makeBlockArray();
  printBlock(&arr); // Expected to fail today due to expecting chpl_external_array

  chpl_library_finalize();
  return 0;
}

Compile command: chpl --library foo.chpl and the big command for compiling C code with a Chapel library file.

Execution command: ./callFoo

Associated Future Test(s): Kinda test/interop/C/exportArray/arrayOpaquePointer_noArgDom.chpl and test/interop/C/exportArray/callArrayOpaquePointer_noArgDom.test.c #12105 in conjunction

Configuration Information

Output of chpl --version: 1.19.0 (pre-release)

I think we should do the following:

make chpl_opaque_array stored within chpl_external_array (or vice versa) (so that which of the fields are active changes at runtime; might have pointer to data or might have a pointer to instance / pid / etc)
adjust the compiler to instantiate extern proc functions with generic array arguments with:
- defaultrectangular (to cover NumPy cases)
- any array type that is returned by the program in an export proc (that eventually returns chpl_opaque_array) and then to generate a table to call the right one based on the runtime type of the instance stored in chpl_opaque_array.

As I've confessed to Lydia on chat, I don't really understand the current role of chpl_external_array for the Python programmer. With that caveat in mind, here are my thoughts with a fair amount of background detail and assumptions:

Starting with the obvious: For an exported Chapel function, depending on the language that the library is being exported for, some arrays can be represented "natively" while others can't. For example, a 2D local array in Chapel could be represented as a NumPy array if we were creating a Python library or a Fortran array if we were creating a Fortran library (in both cases, potentially with an "if the indices were appropriate" caveat), and vice-versa. But (IMO) the 2D array couldn't/shouldn't be represented as a C array if we were creating a C library (though a 1D local array could be). Other cases, like a Chapel Block-distributed array pretty clearly don't have a direct analog in these languages, particularly when they're running on just a single node, so need to remain opaque. [Tagging @daviditen here due to the mention of, and relationship to, Fortran array interop.]

I think the most natural approach to handling cases in which a Chapel array can be represented natively in an external language (or vice-versa) is to have the Chapel compiler generate the glue code necessary to convert between the NumPy array / Fortran array / C array and the corresponding DefaultRectangularArr in Chapel. In general, I think of this as converting an exported Chapel routine like this one:

export proc foo(X: [1..3] real): [1..3] real {
  return computationWith(X);
}

into two C routines in the generated code: One intended for use by the external language that takes care of converting between the external type(s) and the Chapel type(s):

array_type_in_external_language foo(array_type_in_external_language X) {
  _array_DefaultRectangularArr_... X_chpl = externalArrayToChapelArray(X);
  _array_DefaultRectangularArr_... ret_chpl;
  foo_chpl(X_chpl, &ret_chpl);
  array_type_in_external_language ret = ChapelArrayToExternalArray(ret_chpl);
  return ret;
}

and a second that implements the normal Chapel routine which is called by the first routine above. It would also be called by any calls to foo() from within Chapel itself:

void foo_chpl(_array_DefaultRectangularArr... X_chpl, _array_DefaultRectangularArr* ret_chpl) {
  computationWith(X_chpl);
}

For an array type that needs to be opaque, we'd generate similar routines, but ones in which the conversion calls would essentially do whatever we need to do to pass the Chapel array back and forth to the external language (e.g., unwrap / rewrap the _array wrapper?).

But of course, everything I've written so far assumes that we know the type of the array being passed to an exported Chapel function, when this issue is really focused on the case when the domain isn't specified or known (making it hard for the compiler to tell which case the argument being passed in falls into).

It seems to me that we don't want (or need) to be particularly flexible w.r.t. what can be passed into an exported routine because we'll need to clone all code that it calls for all cases that might be called without really knowing. For example, if in my example above, I'd written X: [] real to mean that Fortran could pass in a 1D array or a 2D, 3D, 4D, 5D, 6D, 7D array or a Chapel block-distributed array or ... then we'd need to clone computationWith() and everything that it calls for all of these cases even if only one of them was ever used. I view this as being similar to our lack of support for generic arguments in exported routines (i.e., you can't write export proc foo(x) { ... } because we can't / don't want to anticipate all possible types of x that might be passed in).

For this reason, I've been thinking that we should consider adding support for syntax like X: [,] real, X: [,,] real as a means of saying "2D array", "3D array". I've been imagining that for exported procedures, such cases would suggest passing the natively supported array types (NumPy arrays or Fortran arrays, say). A minor downside to this approach is that X: [] real starts to suggest "1D arrays only" where in Chapel it typically means "any dimensional array." But that seems acceptable in this exported context.

If we went with the above, it suggests that in order to accept a non-native array type (a block-distributed array, say), the user has to be more explicit about the domain (or domain type) expected from the argument. In Lydia's example above it means that I'd have to write export proc printBlock(x: [D] int) { if I wanted printBlock() to take in a Block-distributed array. While this is somewhat of a hassle, it seems reasonable to me to avoid creating too many instantiations of all downstream code and consistent with the typical requirement that exported routines not be generic.

We could go one step further and say that we don't support cases like X: [,] real either and that the user would have to describe the indices expected from the array (e.g., X:[1..n, 1..n] real but this seems a little unfortunate given that the array size isn't part of its static type in Chapel and so doesn't strictly need to be known. It also seems unfortunately limiting since it seems like a common case to export a routine that takes a 1D array of unknown size.

One other thing we could consider would be to permit where clauses to be applied to exported procedures as a way of telling the compiler what to expect. So, for example, perhaps rather than saying that printBlock() accepts x: [D] int, maybe I could say it accepts x: [?D] int where D is a 2D Block-distributed domain (or x: [] int where x.domain is...?). I'm not crazy about this because it feels like a different use of where clauses (they imply things more proactively to formal types rather than being used as a filter), and it also feels more complex to implement for that reason. But if requiring opaque arrays' domains to always be specified ended up seeming too onerous, it could give us some more flexibility while still being reasonably consistent in the language's definition I think?

So I think the summary of my thoughts here is:

err towards requiring domains to be more specified for export routines than less-specified
when underspecified, assume that an array argument is one that can be represented natively
introduce syntax that's required in the exported case to distinguish 1D from 2D from 3D...

I think the main downsides of this approach are:

formal array types have a somewhat different, and more limited, interpretation when called from external languages than when called from within Chapel (but I think this is acceptable given that interop is sortof a constrained case anyway)
users have to be more explicit when passing opaque array types in and out of Chapel

If we went with the above, it suggests that in order to accept a non-native array type (a block-distributed array, say), the user has to be more explicit about the domain (or domain type) expected from the argument.

We could go one step further and say that we don't support cases like X: [,] real either and that the user would have to describe the indices expected from the array (e.g., X:[1..n, 1..n] real but this seems a little unfortunate given that the array size isn't part of its static type in Chapel and so doesn't strictly need to be known.

Here's where the issue lies for me. I think that the runtime relationship between arrays and domains can be really awkward to deal with if it must be explicitly indicated. In particular if you have export proc printBlock(x: [D] int), then now aren't we artificially making the requirement that D be a global variable? Isn't that limiting what reasonable export procs we can create? In Chapel code, it could be an argument, but we don't have support for passing domains around to exported procs right now anyway? While the array-domain relationship can be really cool and useful, programmers coming from other languages tend to find it surprising, and so I'd really prefer it if we have a solution that doesn't require them to use (or even understand) it.

Sure, we could argue that the export proc should specify the compile-time type of the domain, but this is awkward, because it's almost never done in actual Chapel code (because as long as runtime types exist, we want the runtime information to "flow through" as well). That means in particular that 1) it's likely to be full of bugs in the implementation and 2) it's unlikely users will already know/remember exactly what the code they have to write. A further issue is that in practice, our Chapel library code tends to be a little bit loose with adding or moving type/param fields in library types. What is the type constructor for a Block domain? Are we sure that it will remain working exactly the way we've written it? (And IIRC our initializers story doesn't allow for creating a compatability type constructor overload).

Besides all of that, it concerns me when a programmer needs to create O(n*n) or more code. In this case, we might imagine a library author that intends to support "all the distributions" in their Chapel code for (distributed) matrix multiplication. Now, if we use the "types must be explicit", then for every function in their library they have to make a different version for each distribution, don't they? And, in some use cases, the author of one module might not know that a particular distribution needs to be used. Consider for example somebody combining an FFT module with a user-defined distribution module. The Chapel compiler can know that in the library combining these modules which distributions are "used" but it's unreasonable to expect that the author of the FFT module would know that. These problems seem pretty severe to me.

It is for these reasons, in addition to general "convenience", that I lean towards creating a mechanism where the compiler does in fact treat a function like export proc printArray(x: [] int) as generic. I think that this mechanism can be practical.

When I was recently discussing use cases for chpl_opaque_array with @lydia-duncan, she pointed out that the expected use case (today, at least) is that the Chapel library has exactly 1 array type and several functions that operate on that type. Let's presume that the library produces a Chapel array and returns it as a chpl_opaque_array in order for the other language (Python/C) to ever get to that type. This led me to think, OK, if there is 1 array type, the compiler could know what that array type is, because it's returned by an export proc. Then it could assume that the generic array arguments were that type.

Of course that stops working as soon as the library wants to work with 2 array types. So, could the idea be extended to handle 2 array types? Yes, if the library has different exported functions returning each of the 2 array types, then the compiler can instantiate the generic arguments with each of these.

I'd personally additionally assume that the compiler need only choose 1 array argument type per function, to limit the exponential growth in instantiations for functions with multiple generic array arguments.

So, that means in the printBlockArray example, we'd have, say:

createBlockArray
createCyclicArray
printArray than has a generic argument and the compiler instantiates for Block and Cyclic

Since the library already mentions Block and Cyclic arrays, it seems reasonable to instantiate printArray with Block and Cyclic. Additionally it seems reasonable that printArray not be instantiated with (say) Dimensional, because there's no mention of that in the library (where would it come from?). If we can create a chpl_external_array from a NumPy array then it'd additionally be reasonable to instantiate with DefaultRectangular to handle those cases.

Another approach to controlling these instantiations would be to expect some (possibly never executed but perhaps exported) Chapel code to call the generic function with whatever instantiations are desired to be in the library. I would also prefer this over the strategy of requiring the domain / domain type arguments in exported procedures. This variant has the advantage that it can apply to all sorts of generic Chapel code - not just array arguments. It still solves the problem for my FFT module + user-defined distribution example - but the author of the library would need to create some code specific to their use case to exercise the routines / create the instantiations.

Anyway these proposals for instantiations seem to me to be noticeably easier to use (than export proc f(arr: [SomeDomain] real)) while still being practical to implement.

Besides all of that, I view any instantiation strategy here as a stopgap until we have an interpreter and then can actually add instantiations to a running program. Such a capability will make it much easier to integrate with dynamic things like Python in a seamless way. I view adding instantiations to the running program as the principled solution in this area.

I'm a little concerned about this discussion tackling what feels like a specialized usage of generic functions, though I think that's intrinsic to the problem. I'd like us to approach this while thinking about what the eventual decision would imply about exporting regular generic functions in the future, so we don't make something that is inconsistent in that regard and add a lot of additional work down the road.

I feel differently about domain arguments than Michael in this case. To me, supporting this via a domain argument that is used for the array is the most natural solution (but it would be best solved imo by pushing on exporting types, and not having support for exported types or global variables has kinda forced us into this situation). I disagree with his point that we should avoid relying on domain arguments because it is a stumbling block for new users - to me, new users will have to learn about domains fairly quickly if they're going to make full use of our arrays and Chapel itself, and pushing it under the rug just hides one of the powerful tools that distinguishes us from other languages.

I don't think users will be as concerned about arrays being limited, since all arguments are limited today due to not supporting generic arguments. If we added generic argument support and weren't intending to support arrays like this then they would be more concerned, but that seems unlikely.

A further issue is that in practice, our Chapel library code tends to be a little bit loose with adding or moving type/param fields in library types. What is the type constructor for a Block domain? Are we sure that it will remain working exactly the way we've written it? (And IIRC our initializers story doesn't allow for creating a compatability type constructor overload).

I don't understand this argument, can you go into it more? Is it because type and param arguments are folded into the type during library compilation, so the user is going to have to explicitly choose whether to use _domain_DefaultRectangularDom_rank1_* or _domain_DefaultRectangular_rank2_* as the type? That probably does throw my leaning on domains out of whack . . .

Side note: Brad's array examples are all using 1 as the start index, but our external arrays don't support that today, they support 0's based indexing. I think this exemplifies a potential stumbling block for forcing the user to explicitly write the domain, especially if the user is familiar with Chapel

I feel differently about domain arguments than Michael in this case. To me, supporting this via a domain argument that is used for the array is the most natural solution.... I disagree with his point that we should avoid relying on domain arguments because it is a stumbling block for new users...

Here I think we'll need to agree to disagree. At the very least hopefully you'll agree with me that to write a function processing an array - with no array resizing anywhere - a new Chapel programmer can be totally unaware of the array domain relationship. (We may disagree about whether or not that's enough Chapel knowledge to get something real done).

A further issue is that in practice, our Chapel library code tends to be a little bit loose with adding or moving type/param fields in library types.

I don't understand this argument, can you go into it more?

Let's ignore the C/Python side of things for a minute to talk about this.

Say we have this program, which uses the domain is a global variable strategy:

use BlockDist;
const D = {1..5} dmapped Block({1..5});
var A:[D] int;
printBlock(A);

export proc printBlock(A: [D] int) {
  writeln(A[1]);
}

OK. We know that can work, but has some drawbacks. Let's try converting it to use a only domain type is known strategy. Pretending I'm a new user, I start with this:

use BlockDist;
const D = {1..5} dmapped Block({1..5});
var A:[D] int;
printBlock(A);

export proc printBlock(A: [Block] int) {
  writeln(A.type:string);
  writeln(A[1]);
}

$ chpl ll.chpl 
ll.chpl:6: In function 'printBlock':
ll.chpl:6: internal error: the type of the actual argument 'Block' is generic [resolution/callInfo.cpp:124]

Ugh. Ok, so maybe the BlockDist documentation describes invoking the type constructor and has an example? Nope, https://chapel-lang.org/docs/modules/dists/BlockDist.html does not do that.

OK, so maybe I look at the source code to come up with this:

use BlockDist;
const D = {1..5} dmapped Block({1..5});
var A:[D] int;
printBlock(A);

proc printBlock(A: [BlockDom(sparseLayoutType=unmanaged DefaultDist, rank=1, idxType=int, stridable=false)] int) {
  writeln(A.type:string);
  writeln(A[1]);
}

How is the user supposed to know they need to use BlockDom? How do they now what the required arguments are? How do they know what DefaultDist is? And, how do we know that in the future we won't add another argument like sparseLayoutType? Do we have to never change type constructors for any distributions? Never add param/type fields? Aren't all these details supposed to be implementation details rather than part of the API we provide to users?

So, this type strategy seems totally unworkable to me for software engineering reasons. And, it doesn't work now, for compiler/module implementation reasons:

ll.chpl:6: In function 'printBlock':
ll.chpl:6: error: unresolved call 'chpl__ensureDomainExpr(type BlockDom(1,int(64),false,unmanaged DefaultDist))'
ll.chpl:6: note:   instantiated from printBlock

What if we try to pass the domain as a separate argument? I still don't like that (see the first part of this comment) but let's just sketch it out for the sake of discussion.

use BlockDist;
const D = {1..5} dmapped Block({1..5});
var A:[D] int;
printBlock(A.domain, A);

proc printBlock(D: _domain(unmanaged BlockDom(sparseLayoutType=unmanaged
        DefaultDist, rank=1, idxType=int, stridable=false)), A: [D] int) {
  writeln(A.type:string);
  writeln(A[1]);
}

This program doesn't compile either. But even if we did get it (or some variant on it) to compile, it has all of the software engineering problems of the type-only solution that I covered above.

Ah, I see. You're right, that is really unfortunate. The only way I can see getting around it today still uses a global variable (I verified it compiles and runs):

use BlockDist;
const D = {1..5} dmapped Block({1..5});
var A:[D] int;
printBlock(D, A);

// Note: obvs can't be exported because no domain args yet
proc printBlock(argD: D.type, A: [argD] int) {
  writeln(A.type:string);
  writeln(A[1]);
}

It feels like maybe that's a separate question, though. Our type declarations for domains have always felt scary and unclear to me, and typically the way I've gotten around that is by just not declaring the type. We rely so much on generics and type inference that it hasn't been a problem, but it is here when we have to work with concrete types. If we're going to work incrementally towards solving this problem (either today or in the future), this seems like something that would be useful on the way. If either of you feel that's also valuable, I'll open an issue on that (with the expectation that we probably won't prioritize it right away)

Either way, the solutions proposed are going to be a lot of steps, especially with no one asking for the feature at the moment (besides me for completeness, but I don't count :p). Brad mentioned offline that he was leaning towards back-burnering it and I'm tempted agree, but will probably do something during the next sprint cycle to hide the related issue in #12135 that Ben A discovered

Brad mentioned offline that he was leaning towards back-burnering it

Just to put some more meat on this: The goal of this overall effort is to give Python programmers access to distributed arrays, but currently we don't have a way to do that because we don't have a way of firing up a multi-locale instance of a Chapel library. To that end, I was advocating that since there hasn't been obvious agreement on this issue, we table it until we get over that hurdle. And maybe the related hurdle of having the Python program running on a login node and the Chapel instances running on compute nodes. Having some way, even an awkward one, of passing truly distributed block arrays to / from Python seems strictly more valuable than determining how strict or relaxed the export declarations need to be to do so.

Right, that's absolutely fine with me. I understood this issue as a discussion of "what should we do (eventually)". I'm glad we had the discussion on this element though - I think it's important to have a general direction (or several possible directions) for the Python interop improvements, even as we build towards specific features and put off other ones.

chapel-lang / chapel