Closed vasslitvinov closed 1 year ago
I don't disagree with the idea here, and don't think we should get rid of range
as a concrete type. However, I also don't think domain
has as obvious a concrete definition as range
does. So if we were to unify them, I'd probably just make domain
an error rather than a concrete type, indicating that it wasn't sufficiently specified (where domain(?)
could be used to get the current behavior). This would probably also involve renaming the _domain
record to domain
which I think would be a good thing.
[edit: I may have been a little off-base here, see next comment]
In today's range review (take 2):
domain
is not a special-case here w.r.t. other under-specified generic types (e.g., for a generic record R
that didn't have default values, R
means R(?)
); so range and domain are consistent with generic records; they're merely different than one another in this respect.domain
and range
symmetric in this respect.There are several directions we can pursue to reconcile the genericity of range
and domain
. Here are the top contenders:
Range's stridability is generic by default, domain's stridability is not, domain's rank is generic by default.
range
and domain
denote fully-generic typesrange(unit)
and domain(1)
are concrete types, meaning today's range
and domain(1)
Same as Proposal 1, except domain's stridability is generic by default.
range
and domain
denote fully-generic typesrange(unit)
and domain(1, unit)
are concrete types, meaning today's range
and domain(1)
The opposite of Proposals 1 and 2. The benefit is in making genericity explicit, see also #19120 and #19121.
range
and domain
are concrete types, meaning today's range
and domain(1)
range(?)
and domain(?)
unit
is a stand-in for an enum constant that corresponds to today's un-stridable ranges/domains, see #17131Please vote below and/or comment! I plan to prototype the most-voted or most-argued-for option to understand the impact.
Vote here for the above Proposal 1.
Vote here for the above Proposal 2.
Vote here for the above Proposal 3.
I would like to see genericity be explicit to avoid user confusion. This is related to making it explicit in functions https://github.com/chapel-lang/chapel/issues/19121 and in records/classes https://github.com/chapel-lang/chapel/issues/19120.
@vasslitvinov : How do associative domains like domain(string)
play with each of these proposals?
I voted for proposal 1 because even though it relies on an asymmetry between what is generic between the types, I believe it will require the smallest number of changes to user code, and also would consider it annoying to have to type unit
in order to write a multidimensional domain declaration (as in proposal 2), particularly after years of simply writing domain(2)
, say (though nowadays, I hardly ever provide types for dense rectangular domains).
Proposal 3 is unappealing to me because I don't think domain
should default to "1D domain" because it seems very arbitrary given all the things a domain could be (associative, sparse, multidimensional).
To Michelle's comment about making genericity explicit, I don't see proposal 3 as getting us there. To me, what would be required to achieve that outcome would be to change the status quo on this issue: https://github.com/chapel-lang/chapel/issues/18214 which essentially asked "If a generic type doesn't have defaults for all of its fields, should users have to write R(?)
or range(?)
(or variations that only leave certain fields as generic) rather than interpreting R
or range
as meaning "generic since not all of the generic fields are bound). The last time I brought it up, the discussion favored the status quo (but apparently not so overwhelmingly that I closed the issue).
All three proposals play nicely with associative and sparse domains -- because the following aspects remain unaffected:
The "fully-generic" domain type can be instantiated with an associative or sparse type. The proposals do not change this, they affect (or not) only how the "fully-generic" type is written.
Concrete associative and sparse domain types continue to be written the same way they are today, like domain(string)
and sparse subdomain(dom)
.
and also would consider it annoying to have to type unit in order to write a multidimensional domain declaration (as in proposal 2), particularly after years of simply writing domain(2), say (though nowadays, I hardly ever provide types for dense rectangular domains).
This is a good point. But not something that worries me too deeply because of that parenthetical. Or at least, rule simplicity of Proposal 2 is more appealing than the code simplicity of Proposal 1 to me.
I wouldn't argue too much for Proposal 2 over Proposal 1, though.
One addendum that occurred to me last night, which makes me less opposed to proposal 2 if I've got my facts right. I believe that code like:
const D: domain(2) = {1..n, 1..n};
would not break under proposal 2 because domain(2)
would be treated as a partial instantiation, and then unit
would be inferred from the initializer (?). So maybe it's not as bad as I thought in that "yes, it requires fully specified domain types to specify more information, but existing code likely won't break." Split initialization would even make cases like:
const D: domain(2);
D = {1..n, 1..n};
work. Where I worry most would be when there is a field:
record R {
const D: domain(2);
}
In such cases, my hope would be that split initialization-style interpretations would apply in the initializer context as well such that no changes would be required here either, but I'm not confident about how well that is working in practice these days.
This thread makes me wonder: What would the practical impact be of making none of the generic fields in range and domain have defaults? I.e., int
is obviously a good default for idxType
, but are we really gaining much from that default given that if it's unbound it'll be inferred from context? So maybe that suggests a proposal 4: Default none of idxType, stridedness, rank.
One other observation: In this thread, each time I see range(unit)
or domain(unit)
I read it as a typo as though the idxType is uint
. This makes me wonder whether unit is a poor identifier in this context and should be replaced with one
. I'll mention that on the relevant issue as well once I find my way back to it.
Notes from a subgroup discussion on 8/09.
Seven people -- almost everyone in the discussion -- liked "Proposal 4" from https://github.com/chapel-lang/chapel/issues/17122#issuecomment-1190580050, namely remove all defaults from the range and domain types.
If users need something convenient for a concrete or partially generic range/domain type, they can use a type alias, including a type alias for a partially instantiated type.
Do we want settle #18214 first, then have ranges and domains follow the same rules? Or do we want our conclusions on domains and ranges to feed into #18214 ?
stridable
is a bit unique because we have it mostly for optimization purposes. If we didn't care about efficient loop generation, we'd just have stridable ranges and the unit stride would be determined at execution time when needed.
The choice of the default depends on whether we view the defaults as "the most common case" or "the most flexible case". Today for ranges it is the former.
To keep in mind: domain
is different than our typical generic types including range
. That is, domain
can be instantiated into a rectangular domain with three (instantiated) generic fields or into an associative domain with two generic fields.
So "making domain
fully generic" means something different than saying that about ranges.
I propose consider making domain
similar to other generic types in this regard and introduce other type(s) for associative and perhaps sparse domains, rather than having it as a one-off. These other type(s) may (or may not) work in the same way w.r.t. arrays as domains do today, just be called differently.
I propose consider making domain similar to other generic types in this regard and introduce other type(s) for associative and perhaps sparse domains, rather than having it as a one-off.
I'd prefer not to do that, in part because I'm not convinced domains are broken in this regard in their current state, in part because it feels like a slippery slope where we might then feel the need to say "And there should be multiple generic array types rather than just [?D] ?t
(say), that express whether it's a sparse array or associative array, or ...
In today's module review meeting:
A viewpoint: what trips users the most is stridability. Proposal: remove the default from range.stridable
; keep the defaults on range.idxType
and range.bounded
.
A viewpoint: having some fields defaulted and some others not will be confusing to users, they will keep coming back to the documentation to check which is which. Defaulted generics are "sort-of generics." Proposal: remove all defaults; lean heavily on type aliases for common cases / to avoid exposing users to complexity.
type boundedRange = range(bounds = boundedKind.both, ?);
type simpleRange = range(int, boundedKind.both, strideKind.unit);
Related: user-defined type constructors #18817; meaning of writing a generic type's name #18214 #19120.
A viewpoint: the most important / most likely encountered by users are range literals. They should "just work." Handling range variables etc. is a secondary issue. So, let us make range
generic at least over stridability.
A concern: when a function is written to accept a range
formal and, upon an upgrade to the next Chapel version, starts to accept unbounded ranges, it will be surprising to the user.
A proposal: remove all defaults, however instantiating range
with just one parameter gives a fully-concrete type. For domains, the first parameter must be given to obtain a concrete type, so their behavior does not change. Pro: saves typing, ex. range(int)
is a concrete type. Con: gives ranges an unusual generic behavior, unlike any other generic type.
A suggestion for improvement to help with migration of range
to generic, as well as in other contexts: when the user defines a record with field(s) of generic type(s), they cannot declare variables of that record's type. If that causes a compilation error, the compiler should trace it back to generic fields of the record.
record R {
var rng: range; // assume this is generic now
var c: MyClass; // generic because no memory management
}
var r: R; // error: "did you mean to use a concrete range type for 'R.rng'?
// did you mean to specify the memory management for 'R.c'?"
The audience was split between keeping the status quo vs. removing some generic defaults. There was some support for removing all generic defaults and more support for introducing aliases.
The latest proposal is a variation on the above: remove (some|all) defaults from range; the name of the type itself should be a more extensive name and the basic case should be just "range", ex.
generalRange
- the generic typerange
- the concrete & simple type, same as today's range
, an instantiation of generalRange
This, however, will make explicit instantiations more verbose.
The latest proposal is a variation on the above: remove (some|all) defaults from range; the name of the type itself should be a more extensive name and the basic case should be just "range", ex.
- generalRange - the generic type
- range - the concrete & simple type, same as today's range, an instantiation of generalRange This, however, will make explicit instantiations more verbose.
I'm not a fan of this proposal. I think that 1..n by 2
should be able to be described simply a "range(...)" and not a type with a different / more verbose name—particularly given that when using generalRange()
, additional arguments are likely to be needed making the type name even longer. Overall, this proposal makes me feel like "generics have failed—we need two types to describe one thing."
An observation about today's genericity of domains by @benharsh :
proc foo(dom: domain(2)) // any 2D domain with ``idxType=int`` and ``stridable=false``
var D : domain(2); // initializes with DefaultDist by default
// warning: initializing local domain from distributed domain
// I suspect this is a carry-over from the "default-init, then assign" model, perhaps necessary in production
var S : domain(2) = D dmapped Block(D);
In short, a domain type expression in a formal's type expression appears to behave generically, while in other contexts it seems to behave as though it has a default for the hypothetical "distribution field". isGenericType(domain(2))
appears to return false today.
We need to account for domain's genericity over its domain map, with the natural default of default rectangular.
Another scenario to keep in mind, brought up by @benharsh : the following type expression is ambiguous because it can indicate either rectangular or associative domains (or both?).
domain(idxType=int, ?)
Here is the leading proposal crystalized from the discussions at the subteam meeting on 4/18. It relies on the proposed general rule that a generic type requires ?
[somewhere] within parens unless fully-defaulted. This rule is popular at the moment, however has not been formally accepted.
Keep the status quo with range
, in particular, keep it fully-defaulted. Generic range types are to be written using ?
, ex. range(?)
, range(stridable=?)
, etc. Improve compiler error messages to guide users in situations like var r: range = 1..3 by 2;
.
Keep the status quo in how we think about the genericity of domain
, plus disallow standalone occurrences of domain
. The fully-generic domain type is to be written domain(?)
, consistently with the proposed general rule mentioned above.
For example:
the rectangular domain type acts as a record with four generic fields, all of which except rank
have defaults: rank, idxType=int, stridable=false, domainMap=defaultDist
the associative domain type acts as a record with three generic fields, all of which except for idxType
have defaults: idxType, parSafe=true, domainMap=defaultDist
the sparse domain type acts as a record with one generic field, the parent domain, which has no default
therefore domain(2)
and domain(int)
are concrete types, distributed with the default distribution
Even though the proposal is to keep ranges and domains mostly as-is w.r.t. genericity, we now have a stronger rationale for their behavior.
We want to provide a concise way to express the default range
type because it is needed, or at least wanted, often.
We will no longer have the confusion between the naked range
being concrete vs. the naked domain
being generic - because the latter will be disallowed.
Note that currently the implementation does not allow us to write partially-generic domain types like domain(2,?)
or domain(int,parSafe=?)
. The initial plan is to hold off addressing this until we can do it in the dyno framework.
The following add-ons are more of Vass's thinking. They are here because they blend well with the subteam discussions.
Make the domain map the 4th generic parameter of rectangular domain types and the 3rd generic parameter of associative domain types, as a replacement for the dmapped
clause. In both cases, this parameter has the default of defaultDist
. These parameters can be "set" to a value / type / ?
. They can also be queried. https://github.com/chapel-lang/chapel/issues/17908#issuecomment-1513929470
(Base + Add-on 1) imply the following three de-facto type constructors for domains:
proc type domain.init(rank, idxType=int, stridable=false, domainMap=defaultDist) // rectangular
proc type domain.init(idxType, parSafe=true, domainMap=defaultDist) // associative
proc type domain.init(parentDomain) // sparse
Add a param property domain.kind
of the type enum domainKind { rectangular, associative, sparse }
.
Add one more type constructor that provides the generic types for "all rectangular domains', ditto associative and sparse:
proc type domain.init(param kind: domainKind)
This property can be queried. We might also allow types like domain(.... kind=xyz ....)
.
Uphold the special case we have today where a default rectangular domain value represents a type that is generic over the domain map when:
[domainValue] elementTypeOr?
, andRe add-on 1: the 4th argument may be a type or a value. If the latter, it is a sugar for the 4th argument being the type of that value. We also want to preserve the sugar we have right now for the dmapped clause.
Re add-on 2: domain(idxType=int)
and domain(idxType=int, ?)
may (or not) be ambiguous between rectangular and associative constructors. That sounds OK to me.
In order to represent the generic domain types discussed here, we need dedicated representation, rather than our _domain record whose type merely redirects to the implementation class. Perhaps it is already in the works in dyno?
Notes from the subteam meeting on 4/18.
The utility and frequent use of simpleRange
in #22108 made us feel that we need two ways to talk about the same thing (i.e., range(int,both,false)). This, in turn, made us think that we did not get it right [when we made range
am undefaulted generic]. Which nudged us back to considering the status quo, i.e., keep range
fully defaulted. We will improve compiler messaging to help users in cases like var r: range = 1..3 by 2;
If range
is fully defaulted, should we make domain
be fully defaulted as well? We have considered this option before and did not like it, primarily because there was no default domain type that was as clear a "winner" as the default range type. Here are our thoughts in this meeting. Should the fully-defaulted domain be rectangular or associative? Perhaps it should be associative. Then, there is no natural default index type. Choosing the default rectangular domain did not result in success.
One thought: having the naked domain
be a concrete type of 1-d rectangular domains could be justified by its being the simplest domain, not necessarily the most common. Chapel users will learn very quickly that they need to write more to get to other domains concrete or generic domains. Benefit: same look and feel as range. OTOH this default hardly serves a purpose as we do not expect 1-d domains to be common.
Another thought: domains are already special, for example by having the rectangular / associative / sparse distinction and the dmapped clause. So it makes sense for them to be different w.r.t. genericity, ex., have no generic defaults.
Proposal: the fully-generic domain type must be written domain(?)
, just like user-defined generic types. Note: we have not decided to introduce this general requirement [without which requiring ? on domain does not make sense], however we are leaning in this direction. Question: any surprises when we introduce this requirement?
Answer: not expecting surprises. When looking through Chapel code in:
modules/packages
test/release/examples
test/studies
CHAMPS
Arkouda
the only uses of domain
without parentheses are:
domain
as the type of a formalAnother thought: requiring domain(?)
is more consistent / less surprising than, for example, the error when passing a stridable range to a naked range
formal.
We still need to decide whether domain(2)
, for example, is generic over its domain map. Currently it is not, except in one special case (see "Add-on 4" in https://github.com/chapel-lang/chapel/issues/17122#issuecomment-1516693176).
Question: to help decide, how are ranges and domains used in Arkouda and CHAMPS? Answer: see the next comment.
Question: how do CHAMPS and Arkouda use ranges and domains? Answer: see the tables below. The left column contains the occurrence counts.
Arkouda ranges | |
---|---|
7 | range(stridable=true) |
4 | range(stridable=false) |
3 | range(?) as a formal's type |
1 | var segRange: range = 1..0; |
1 | (bool, range) as a return type |
Arkouda domains | |
---|---|
300 | xyz.domain |
3 | domain(1) as a formal's type |
9 | domain(1) as an array element type |
2 | domain(1) dmapped Block(...) as a var's type |
1 | domain(1) as a const's type |
1 | domain(t, parSafe=false) |
1 | domain(string) as a var's type |
CHAMPS domains | |
---|---|
240 | var xyz : domain(1) = a domain literal |
328 | const xyz : domain(1) = a domain literal |
8 | var xyz : domain(1, stridable=true) = a strided 1-d domain literal (*) |
43 | var / const xyz : domain(2) = a 2-d domain literal |
2 | const xyz: domain(3) = a 3-d domain literal |
(**) | |
120 | xyz.domain |
26 | domain(1) as the formal type |
2 | domain(1) as the return type |
18 | const .*: domain(1) dmapped ... |
10 | var xyz: domain(1); |
2 | domain(string) as a var's type |
(*) in one case out of 8, it is a by
expression
(**): in the cases above this line in the table the type annotation is unnecessary, i.e., if it were dropped, the inferred type of the variable would be the same
CHAMPS ranges | |
---|---|
86 | const xyz:range = [low]..#[count]; |
23 | const xyz:range = abc.variablesDom_.dim(0); |
9 | var xyz:range = [low]..#[count]; |
3 | const xyz:range = [low]..[high]; |
2 | new map(int,range) |
2 | range as a formal's type |
1 | var elements_ : [sectionDomain_] range; |
The conclusion from the design sub-team discussions last week is to make almost no changes to ranges and domains w.r.t. their genericity. The only change is to require the generic domain type to be written as domain(?)
and any partially-instantiated domain type to be written as domain(..., ?)
as in #21410
This is discussed under "The proposal" above in https://github.com/chapel-lang/chapel/issues/17122#issuecomment-1516693176 . It is contingent on accepting the general rule requiring (?)
after any non-fully-defaulted generic type. If we end up not accepting this general rule or accept another rule, we will apply that to domains as well.
We are keeping range
fully-defaulted and domain
fully-generic for 2.0, as discussed on this issue and elsewhere. I removed the 2.0 tag.
I think we could go so far as to close this issue. I can't imagine we'll change the decision given the time we've invested in it recently, barring some user complaining about something strenuously and persuasively (in which case I think they should open their own issue to do so).
I agree. Closing as resolved.
This is perhaps a different angle at #13566
Currently
range
is a concrete type vs.domain
is a generic type. This is non-intuitive and a newcomer is likely to bump into it at some point.This issue proposes making them either both concrete or both generic.
Pro: makes Chapel easier to learn (and easier to use??) Con: makes existing Chapel users unhappier