Design: better naming/arrangement of `dist` method and the `dmapped` keyword

chapel-lang / chapel

a Productive Parallel Programming Language

https://chapel-lang.org

Other

1.8k stars 423 forks source link

Design: better naming/arrangement of `dist` method and the `dmapped` keyword #17908

Open vasslitvinov opened 3 years ago

vasslitvinov commented 3 years ago

Background: the domain keyword describes the generic domain type; it can also be used to query the domain value of an array expression.

This issue seeks elegant counterparts of these -- for the domain-map type and query.

Right now a domain-map value can be created:

implicitly in a dmapped clause of a domain type or value, or
explicitly with a new dmap(....)

The domain-map value can be queried using:

the dist method on domain expressions

Considerations about the dist method:

it is an abbreviation, while domain is spelled out; however, "distribution", when spelled out, looks too long, so we may not want that
it should refer to "domain maps"
- whereas "distributions" are a subset of "domain maps" that refer to multi-locale domain maps, vs. "layouts" are domain maps that do not address multi-locality
- we could also redefine a "layout" to be a special case of a "distribution" ("how are these indices distributed (stored) into this single memory?"), rather than a disjoint concept

mppf commented 3 years ago

we could also redefine a "layout" to be a special case of a "distribution" ("how are these indices distributed (stored) into this single memory?"), rather than a disjoint concept

This is appealing to me partly because I think dmapped doesn't flow well in English descriptions of code - I'd rather say "The array declaration specifies a distribution" than "The array declaration specifies a domain map". I think users with some HPC background know what "distributed data" or a "distributed array" are in a general sense but "domain map" is Chapel-specific jargon.

Additionally we have relatively few layouts and so they seem to be more of a corner case than something we need to create the syntax around. IIRC before dmapped we had a distributed keyword and I guess I find that more intuitive but I know it's not great to go backwards on these decisions.

Regarding new dmap it seems to me that it should just be what happens when you create a Block distribution, say. If we need to have newBlockArr we could just as well have newBlockDist that returns the dmap-wrapped class. Along those same lines, the fact that we need newBlockArr at all is arguably pointing at some insufficiency in the dmapped syntax. Could we achieve the same idea with something more like an explicit array initialization? E.g. what if you could use new to construct Chapel arrays/domains more broadly - could we use that strategy to solve this same problem?

  const D: domain(2) dmapped Block(boundingBox=Space) = Space;
  // could it be this?
  const D = new BlockDom(boundingBox=Space, initFrom=Space)

(I know that in the implementation, BlockDom and _domain(BlockDom) are different types; but this seems to be an implementation detail that we could paper over in some way for such a new expression).

e-kayrakli commented 3 years ago

we could also redefine a "layout" to be a special case of a "distribution" ("how are these indices distributed (stored) into this single memory?"), rather than a disjoint concept

I like this direction.

There are few things that we may have to wrestle with, but I don't see them as blockers:

We may still need a compile-time user-facing way of separating a multi-locale distribution from a single-locale one. My reflex is to have distribution.isDistributed or distribution.distributed but they seem.... tautological?
Could this enable a different way of creating distributed sparse domains? Today's way of creating a distribution that knows about the sparse domains' layouts feel clunky to me. We can probably adjust that separately from this, but if we can make today's layouts and distributions interact more closely, we might be able to say (1) create a distributed dense domain, (2) now create a sparse subdomain on that with this-and-that layout.

Along those same lines, the fact that we need newBlockArr at all is arguably pointing at some insufficiency in the dmapped syntax. Could we achieve the same idea with something more like an explicit array initialization? E.g. what if you could use new to construct Chapel arrays/domains more broadly - could we use that strategy to solve this same problem?

To me, the annoyance that those helpers work around is the fact that you need to specify the boundingBox/startIdx etc and it is not about the dmapped syntax per se. If we were to have mapped for example that read better in English, the following seems pretty sweet to me:

var arr: [{1..10} mapped Block] int;

whereas

var arr: [{1..10} mapped Block({1..10})] int;

is significantly worse to me.

In general, I use newBlock* et al. a lot, but if we had a cleaner way of declaring arrays like the above, I would use them.

bradcray commented 3 years ago

The directions that @mppf and @e-kayrakli are discussing here definitely align with my own thinking about the problems with dmapped and some directions forward (though, left on an island, I might come up with slightly different details). In particular:

I'm not fond of any codes that use dmap
I find dmapped clunky as a keyword relative to our other keywords

At some point in the distant past, @vasslitvinov proposed replacing dmapped with something more like implemented or stored and supporting it on other types as well, though I'm having trouble remembering what types (syncs / singles / atomics come to mind?)

w.r.t. {1..10} mapped Block, it's been proposed several times that we have a version of Block that infers its bounding box from the first non-empty domain created over it (and similarly Cyclic would infer its starting index in that way). I think we've even prototyped this a few times, but never successfully gotten it on master, though I can't recall whether that was for good reasons or bad.

I find this convenience feature attractive (particularly to not have to explain the bounding box argument in every Chapel tutorial I give), but it does potentially suggest that if I later re-size my domain from 1..10 to 1..20 it will be re-blocked (which we don't really support today). So there's either a big user education problem looming here ("Why is 99% of my data on the last locale?", or we'd need to decide that an inferred bounding box implied re-blocking on every re-assignment to the domain (which I think some users requested recently anyway—"give us that productivity, execution time cost be damned!").

e-kayrakli commented 3 years ago

but it does potentially suggest that if I later re-size my domain from 1..10 to 1..20 it will be re-blocked (which we don't really support today). So there's either a big user education problem looming here ("Why is 99% of my data on the last locale?", or we'd need to decide that an inferred bounding box implied re-blocking on every re-assignment to the domain (which I think some users requested recently anyway—"give us that productivity, execution time cost be damned!").

This sounds reasonable to me. Do we have a case today where we want to resize a distributed domain but keep its distribution's bounding box the same (i.e. the status quo works in favor of the user) ?

bradcray commented 3 years ago

Do we have a case today where we want to resize a distributed domain but keep its distribution's bounding box the same (i.e. the status quo works in favor of the user) ?

I think that, in practice, it's pretty rare to re-assign distributed domains at all once they're initialized. More often a number of domains are interacting with a single distribution—say a number of domains represent rows/columns/blocks of a 2D distributed domain where we all want them to share the same distribution, not each individually get its own block distribution based on its indices. So you could imagine a situation where, if I were iterating over rows or columns, I might reassign a single distributed domain representing the current row / column rather than creating it from scratch each time. But in such cases, the domains will typically be created either be created by explicitly sharing a single distribution, or by operations on the original domain. I'm not aware of any key current cases where a single domain has its own distribution and is reassigned multiple times.

vasslitvinov commented 1 year ago

While discussing range vs. domain genericity in #17122, @benharsh suggested a view of the generic rectangular domain type as having four generic arguments:

domain(rank, idxType, stridable, domainMap)

We can use this direction to get rid of the dmapped keyword. First, assume that three out of these four arguments have defaults, like:

domain(param rank, type idxType = int, param stridable = false, domainMap = defaultDist)

Then, allow the domainMap argument to be a type. An instance of that type may be obtained implicitly when needed to create a domain value.

User code can look like this:

// types
domain(2)                   // default 2-d domains
domain(2, domainMap=Block)  // Block 2-d domains
domain(2, domainMap=?)      // 2-d domains with any domain map
domain(2, ?)                // all 2-d domains, including different idxTypes and stridability

// variables/fields
var d1: domain(2);          // a default 2-d domain
var d2: domain(2, domainMap = Block(...));  // Block-distributed

// domain literals
{1..n}                      // a default domain
{1..n, Block(...)}          // a Block-distributed domain
{1..n, 1..m, Block(...)}    // a Block-distributed domain

lydia-duncan commented 1 year ago

@vasslitvinov - based on a message you sent to Michelle, Brad and I, it sounds like these have been marked unstable and the issue can have the 2.0 tag removed. Is that accurate?

vasslitvinov commented 1 year ago

We have renamed .dist to .distribution. For 2.0, the dmapped keyword will be unstable. Factory functions #22427 will be a stable replacement for dmapped.

Nothing else needs to be done for 2.0. I removed the 2.0 label.

bradcray commented 1 year ago

Personally, I would still like to have a stable replacement for the dmapped keyword for the fall release and have been wrestling with the question while working on getting rid of dmap cases. But I'm fine not having the label here for the time being.

bradcray commented 1 year ago

I broke the specific question of dmapped name replacements into #23128 to give it some new, top-level attention and capture and summarize some (hopefully most/all?) of the proposals given on this issue.