Open chalcolith opened 1 year ago
I would like to see the compile-time expressions from Luke's work (or something like them) make it into Pony. Would this syntax, which builds on it, conflict with the work that was in Ponyta? If we decided to bring over something like what was in Ponyta, would we be breaking the change in this RFC?
Luke's work uses syntax like `#(1+2)` to denote compile-time expressions. I thought that `#([1;2])` seemed noisy, which is why I suggest `#[1;2]`. Bringing in Luke's work would not break anything in this PR.
I would only be in favor of this if it is a stepping stone to compile time expressions and would want the RFC written as such. Given that, I would probably want to see this written as a compile time expression RFC with it stated what from it can be added independently of one another and that we are ok with "groups" going in at the same time without all features.
I'm not really concerned with noisy. This seems like a compile-time expression, so having an idea of how we would do compile-time expressions and making this fit is a very good idea. If we add `#()` later, it feels like we wouldn't want an entirely separate mechanism for static arrays, which I think fall under "compile-time expression".
Do you see this feature as a subset of compile time expressions or independent?
It's a very minor subset of compile-time expressions. Luke's work involves implementing an interpreter in the compiler that can evaluate expressions (including constructing objects and calling their methods). This PR only involves storing tables a la string literals. Further compile-time expression work can use this PR's work without change.
I think it is important for this RFC to explain why this isn't "just an optimization" like String literals.
If it isn't "just an optimization", what happens if someone tries to declare an `iso` or `ref` etc. static array? There seem to be a lot of edge cases with this approach vs "just an optimization".
I'd like to see the reasoning for this approach over an optimization (which is what I consider "string literals" as you mention them to be).
If this is a subset of compile time expressions and not an optimization, then I think that what compile time expressions would look like needs to be part of this PR. Without doing the work in the RFC, I don't think it is safe to say that this will have no impact on an as yet unspecified compile time expression support.
The "Alternatives" section addresses the "just optimization" option. If I'm implementing an algorithm that requires static data for performance reasons, I want to explicitly know that it's going to be static.
The design section explicitly says that compile-time expressions are denoted by `#` in Luke's work. What sort of extra discussion would you like to see?
If someone tries to declare a `ref` or `iso` static array, the compiler will tell them they can't do that. If we make this "just an optimization", their code will succeed and then their algorithm will suffer performance issues.
The RFC is lacking in details such as when this feature will succeed, when it won't etc. Those need to be detailed in the RFC.
Also the trade-offs between optimization vs adding new syntax need to be covered. There appears to be a real "simplicity" vs "I know it was done" tradeoff here. I'm not sure I agree with your "just an optimization will succeed". I mean, yes, it will, but it's a `ref` array, so that feels like a teaching moment in documentation, not a requirement for new syntax.
Having to do:
let y: Array[U8] val = recover val [1;2] end
seems a lot more straightforward.
It's a little different than a String in that `Array` defaults to `ref` rather than `val`, but otherwise this seems nice and straightforward. And all existing code where this optimization could be done would be done.
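For reference, the `recover val` form compiles in Pony today. The following is a minimal sketch, assuming a standalone `Main` actor; the explicit `as U8:` element type is an addition for illustration so that the literal's type is unambiguous. This form by itself makes no promise about where the array data is stored, which is the point under discussion.

```pony
actor Main
  new create(env: Env) =>
    // Build the array as a mutable literal inside the recover block,
    // then lift it to an immutable, sendable `val`.
    let y: Array[U8] val = recover val [as U8: 1; 2] end

    // A `val` array can be read freely but not modified.
    for v in y.values() do
      env.out.print(v.string())
    end
```

Whether the compiler could place such an array in static data automatically is exactly the "just an optimization" alternative being debated here.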
At the moment, I'm against new syntax for this. I find the "they need to get an error" if the optimization won't be applied argument to be not particularly good in terms of tradeoffs.
> The "Alternatives" section addresses the "just optimization" option. If I'm implementing an algorithm that requires static data for performance reasons, I want to explicitly know that it's going to be static.
I had to reread that several times to see how that is the case. It sounds like it is talking about optimizing a new type. I think both alternatives need to be expanded on.
I don't think editorializing like "magical" should be in the alternatives. I think laying out pros and cons of alternatives should be done.
I'd like the RFC to address why "as an optimization" is fine for String literals but not Array literals.
I'm not sure I understand what "when this feature will succeed and when it won't" means. The RFC says "The value of a static array literal can only be `Array[T] val`, where `T` is a floating-point or integral numeric type."
To make sure I am reading this correctly: this would only be for arrays of numbers, and I couldn't use this for arrays of String literals or other types in the future?
Is having an annotation that will cause a compiler error if an optimization isn't applied an alternative?
> I'm not sure I understand what "when this feature will succeed and when it won't" means. The RFC says "The value of a static array literal can only be `Array[T] val`, where `T` is a floating-point or integral numeric type."
The RFC does not detail what happens for other cases. We are expecting compiler errors? That say what? This can only be for `val`, so how does this work with something that "becomes a val"? What's the analysis that will need to be done? "Only a val" feels very loosey-goosey to me in terms of working through the implications of this.
What specifically would the check be that the compiler does? It will need to have knowledge of the standard library. What's the check we would do? Array[Number]?
If you tried to do `let foo: Array[U32] ref = #[1;2]`, the compiler would say "right side must be a subtype of left side; Array[U32 val] val is not a subtype of Array[U32 val] ref^".
The compiler would check that `T` is one of { `F32`, `F64`, `ISize`, `ILong`, `I8`, `I16`, `I32`, `I64`, `I128`, `USize`, `ULong`, `U8`, `U16`, `U32`, `U64`, `U128` }.
All of this should be in the RFC.
> If you tried to do `let foo: Array[U32] ref = #[1;2]`, the compiler would say "right side must be a subtype of left side; Array[U32 val] val is not a subtype of Array[U32 val] ref^".
So `#[]` means "create a val array" and apply a specific optimization to it?
Well, not really, because it is more constrained than a `val` array. Why is the specific type for the array constrained?
If this was done as an optimization, could we not have more types that could play? You could, for example, create Maps that could be optimized as such without needing special syntax, and other user types could potentially play as well. If something like the Complex number idea that Ryan was thinking about was added, it could play in the optimization game as well.
The question might be "should all the possible options be done"?
This feels a lot more like something for an annotation to me. Where you can annotate the expectation of an optimization (or lack of).
`#[]` means point a val array to a static section of data. Allowing arbitrary object types would mean interpreting them at compile-time and then serializing them into the binary.
In case it helps to have another way of saying it, I think I understand the direction Sean's describing: recovered val arrays would be optimized whenever possible to store them in the data segment. This would at first be only when the array elements are primitive numeric vals, but could later be expanded to elements that are string constants, non-numeric primitives, instances of classes with only embed fields, and so on. For predictable performance a new annotation on an array type would require this optimization, causing a compile error if it could not be done. Is that roughly what you're thinking, Sean?
We discussed this during the sync call https://sync-recordings.ponylang.io/r/2023_04_25.m4a.
In general, Joe is in favor. I am still against.
I raised my concerns about things I would want to see addressed to make sure we get this right for the future, related to interning in general (not just arrays of numbers) and the possible addition of compile-time expressions.
The first 30 minutes of the linked sync call is about this RFC.
@jemc anything you want to add?
@mfelsche noted to me after the sync call that he has some thoughts on adding SIMD support to Pony that he sees as related to this RFC and would like to discuss at an upcoming sync in relation to this RFC.
Implementing static data can be tricky... To better understand the proposal:
Can we have a generic static array definition like
class Foo[A: UnsignedInteger[A] val = USize]
  let a: Array[A] = #[1; 2; 3]
Why only arrays? I think there could be a relation to introducing `let` fields in `primitive` too.
> Can we have a generic static array definition like `class Foo[A: UnsignedInteger[A] val = USize] let a: Array[A] = #[1; 2; 3]`
Without compile-time expression support, `UnsignedInteger` would allow for classes, which requires compile-time expressions. So, given the scope of this RFC, not like that. But if it was a generic that was constrained specifically to a single primitive number, then I believe under the ideas in this RFC, yes.
> Why only arrays? I think there could be a relation to introducing `let` fields in `primitive` too.
primitives don't have fields, did you mean, primitives that are let fields in classes/actors?
No. Because `primitive` creates a unique instance, one can consider a field value to be a constant value because it'll exist once. If `let` fields were added to `primitive`, this would be static data and we could have static `USize` or `F64` constants too. Or we could have static arrays if the compiler optimized `let` in primitives as static data. I don't know if I can explain correctly what I mean...
But I'd rather have the Ponyta solution instead of this one, which appears more as a (temporary) hack than a global solution. The advantage is that the syntax is compatible with Ponyta.
@kulibali will be updating this based on feedback and will let us know when he is ready for more review.