stan-dev / stanc3

The Stan transpiler (from Stan to C++ and beyond).
BSD 3-Clause "New" or "Revised" License

Creating empty arrays with {} #1286

Closed spinkney closed 1 year ago

spinkney commented 1 year ago

Is it intended that {} can't be parsed as an empty array?

I was trying to pass an empty array into a recursive function on the first iteration and found that I needed to declare it as an array[0] variable instead.
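A minimal sketch of that situation, for readers following along. The function count_down and the variable names are hypothetical illustrations, not taken from the original model. It assumes a stanc3 version that supports the array[] int syntax, and it keeps a forward declaration in case the compiler in use requires one for recursion.

```stan
functions {
  // Hypothetical helper: forward declaration so the recursive call resolves
  // on compilers that require it.
  array[] int count_down(int n, array[] int acc);

  // Appends n, n - 1, ..., 1 to the accumulator acc.
  array[] int count_down(int n, array[] int acc) {
    if (n == 0) {
      return acc;
    }
    return count_down(n - 1, append_array(acc, {n}));
  }
}
transformed data {
  // A size-zero declaration stands in for the {} literal, which does not parse.
  array[0] int empty_ints;

  // The first call starts from the empty accumulator.
  array[3] int result = count_down(3, empty_ints);
}
```

Writing count_down(3, {}) in place of the last call is what the parser rejects.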

WardBrian commented 1 year ago

It’s definitely intentional in the sense that it’s what our parser encodes: https://github.com/stan-dev/stanc3/blob/c8bae8bd11eb8368cc26cff0383897408df154ba/src/frontend/parser.mly#L579

Now I’m not sure whether or not it’s necessary. My first guess is that you might get ambiguities between an empty array expression and an empty block, but I can try it out today and see.

nhuurre commented 1 year ago

I think the main problem is type inference. It would be weirdly limiting if the type of {} was always, e.g., array[] int, but guessing the type from context is not implemented and is sometimes ambiguous anyway.

```stan
functions {
  real f(array[,] real x) {
    return 1.0;
  }
  real f(array[] vector x) {
    return 0.0;
  }
  real g() {
    return f({}); // which overload is it?
  }
}
```

WardBrian commented 1 year ago

I think you're right. We would actually disallow it in the typechecker even if it parsed, see https://github.com/stan-dev/stanc3/blob/c8bae8bd11eb8368cc26cff0383897408df154ba/src/frontend/Typechecker.ml#L317-L321

C++ allows empty initializer lists but only if they're unambiguous, e.g. https://godbolt.org/z/MdEKhsdMh

spinkney commented 1 year ago

Would it be possible to allow unambiguous null arrays then? Like int {}, real {}, etc.? Or maybe it's better to have {} int?

WardBrian commented 1 year ago

That is still ambiguous in the number of dimensions. How often is this something somebody would want to have?

spinkney commented 1 year ago

Wouldn't 2d be {,} int? Or am I missing something?

spinkney commented 1 year ago

> That is still ambiguous in the number of dimensions. How often is this something somebody would want to have?

I just think it seems like you should be able to construct an empty array like you can a non-empty one. At the minimum, pointing out in the docs that empty arrays cannot be initialized like this seems necessary.

WardBrian commented 1 year ago

Huh, that would be a new kind of expression altogether. I don't think that makes a ton of sense, since {1,2} is a 1-d thing, but { , } would be a 2-d thing?

You could write { {} }, but that's really an array[1, 0]. I think writing down a true array[0, 0] is functionally impossible without a specific syntax just for that.

Initializing an empty array is not necessary, since there is only one empty array of a given type, e.g. all array[0] ints are identical, and they're initialized by the compiler to the only value they can have.

spinkney commented 1 year ago

Good example, and I wouldn't want a ton of work put into this because it isn't something people would do that often. There's also the easy workaround of creating the empty array beforehand using an array[0] declaration.

nhuurre commented 1 year ago

> I think writing down a true array[0, 0] is functionally impossible without a specific syntax just for that.

You can write rep_array({1.0}, 0) or whatever.
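As a hedged sketch of how this plays out, the block below builds typed empty arrays with rep_array and passes them to the two overloads of f from the earlier example in this thread. The variable names are hypothetical, and it assumes a Stan version with user-defined function overloading.

```stan
functions {
  // Same two overloads as in the earlier example in this thread.
  real f(array[,] real x) {
    return 1.0;
  }
  real f(array[] vector x) {
    return 0.0;
  }
}
transformed data {
  // rep_array(x, m, n) builds an m-by-n array filled with x, so zero sizes
  // give an empty array whose element type is still known to be real.
  array[0, 0] real no_reals_2d = rep_array(0.0, 0, 0);

  // Zero copies of a (zero-length) vector: an empty array of vectors.
  array[0] vector[0] no_vectors = rep_array(rep_vector(0.0, 0), 0);

  // Each argument has a definite type, so overload resolution is unambiguous.
  real a = f(no_reals_2d);           // array[,] real overload
  real b = f(no_vectors);            // array[] vector overload
  real c = f(rep_array(0.0, 0, 0));  // also works inline as an expression
}
```

The point of the sketch is that the element type and number of dimensions come from the arguments to rep_array, which is exactly the information a bare {} cannot carry.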

bob-carpenter commented 1 year ago

> At the minimum, pointing out in the docs that empty arrays cannot be initialized like this seems necessary.

Agreed. Where do you think that should go in the reference manual and/or user's guide? Would you mind opening a docs issue for this?

> it seems like you should be able to construct an empty array like you can a non-empty one

It's easy to think that until you consider what the type should be. While it's tempting to assign {} to array[] int type, it could really be an empty array of anything.

The only solution now with an expression is what @nhuurre suggests: just use the array constructor rep_array(). Or we could have typed empty-array constructors of some kind, such as empty_int_array(), empty_real_array(), and so on. Super clunky, but easy to type. You can also define specific empty things in transformed data:

```stan
transformed data {
  array[0] int empty_int_array;
}
```

spinkney commented 1 year ago

> Agreed. Where do you think that should go in the reference manual and/or user's guide? Would you mind opening a docs issue for this?

Opened a docs issue at https://github.com/stan-dev/docs/issues/621. I believe the relevant section is in the reference manual linked at the aforementioned docs issue.

WardBrian commented 1 year ago

I'm going to close this in favor of the docs issue you created @spinkney