chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Should map provide references to values? #14474

Open mppf opened 4 years ago

mppf commented 4 years ago

The current map accessor method returns a ref to the element.

    proc this(k: keyType) ref { ... }

See also

https://chapel-lang.org/docs/master/modules/standard/Map.html#Map.map.this

There are a number of other map methods that have this property.

Issue #8339 points out that this leads to a race condition in programs like the following:

forall key in input {
    map[key]++;
}

because map[key] returns a reference that might be invalidated in another task when a new element is added to the map.

Is that what we want for the built-in map type given Chapel's emphasis on parallelism?

What options do we have to solve this problem?

  1. The map data type does not provide references to elements. Instead users would use things like a set method. Note that this strategy presents challenges with maps containing owned or atomic elements that will need solutions if this strategy is adopted.
  2. The map data type could be written in a way so as not to move elements ever. This presents some implementation challenges, however:

    • it really limits how the map type can be implemented, or

    • the map type would be implemented out of an integer map and also a list, as described in #8339. But this adds overhead and prevents the map type from being as simple a building block as it could be. While this might make sense for associative arrays, for the map we might be better off not doing it.

  3. Create different APIs for a parallel-safe map and a parallel-unsafe map
  4. Users of the map type need to write loops like this in a special way to avoid the race condition:

    • Expose synchronization primitive

      forall key in input {
        map.acquire(key);
        map[key]++;
        map.release(key);
      }

      or (in a style more like #12306)

      forall key in input {
        atomic on map[key] {
          map[key]++;
        }
      }

    • Use a transactional approach

      forall key in input {
        map.computeKey(key, (key, value) -> value + 1);
      }
LouisJenkinsCS commented 4 years ago

Option 1: Expose synchronization primitive

forall key in input {
   map.acquire(key);
   map[key]++;
   map.release(key);
}

Option 2: Use a transactional approach

forall key in input {
   map.computeKey(key, (key, value)  -> value + 1);
}
LouisJenkinsCS commented 4 years ago

Note that concurrent increments need to be protected by some atomic, which you cannot actually return by value.

dlongnecke-cray commented 4 years ago

You know, earlier today I was thinking much the same thing as @LouisJenkinsCS has suggested here, which is that containers could expose acquire and release routines to the compiler which forall loops and other data parallel constructs could call automagically...

LouisJenkinsCS commented 4 years ago

Not automatically, please. Deadlock is a problem if multiple keys need to be grabbed; deadlock prevention such as imposing an ordering should be up to the user, as well as livelock avoidance.

dlongnecke-cray commented 4 years ago

No, it's definitely not a feasible idea, just rolling ideas around.

e-kayrakli commented 4 years ago

I wrote a longish answer but then got confused.. The specific case above should be covered by the parSafe flag to the map, shouldn't it?

In case I am missing something, as an obtuse answer, I like map[key] = value idiom. So I don't like switching to set/get methods that much. If there is a complication that I am not seeing right now, extending the map API to cover for these complications seems like a reasonable approach to me. I see @LouisJenkinsCS "exposing synchronization primitives" above as a good example of that.

Edit: Maybe we are talking about reference invalidation issues in general. e.g.

forall key in input {
  ref myRef = map[key];
  myRef++;
  // potentially bunch of other stuff with `myRef`
}

In which case, I think the above paragraph still applies as my answer.

LouisJenkinsCS commented 4 years ago
forall key in input {
   // What happens if `key` is not in the map? 
   map[key]++; // Assumes if `key` is not present it will insert it first
}

Even in the case where key must be added to the map via some insertIfNotPresent call, that by itself will invalidate references held by other tasks running in that same loop. map[key]++ retrieves the value corresponding to key via the returned ref, and then increments the value and writes it back through the same ref. This is a classic read-modify-write problem, except the issue isn't just atomicity; it's the fact that the ref can become 'invalid' after it is returned. Let me know if this is unclear or if I misunderstood your question.
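To make the lost-update half of this concrete, here is the classic read-modify-write interleaving written out by hand (a Python sketch for illustration only; it shows the atomicity problem, not the separate ref-invalidation problem):

```python
# Two "tasks" each execute the equivalent of `map[key]++` as a read
# followed by a write-back, with the steps interleaved manually.
counts = {"a": 0}

t1_read = counts["a"]       # task 1 reads 0
t2_read = counts["a"]       # task 2 reads 0 before task 1 writes back
counts["a"] = t1_read + 1   # task 1 writes 1
counts["a"] = t2_read + 1   # task 2 also writes 1; one increment is lost

print(counts["a"])  # 1, not 2
```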

e-kayrakli commented 4 years ago

@LouisJenkinsCS -- thanks that was a rookie confusion on my part.

Without too much familiarity with the Map implementation we have today, I hope that your assumption that "if key is not present it will insert it first" is wrong. And a quick test showed that I got an OOB in that case.

I say this partly because I hope that we can get something more coarse grained than this:

forall key in input {
   map.acquire(key);
   map[key]++;
   map.release(key);
}

I can see the same notion applied to the map as a whole, like a map.lock and map.unlock called outside the loop, as a better option.

But anyways, stepping back a bit, I still think that refs can be provided via proc this or iterators etc. We can extend the API one way or another to protect for some of these cases.

LouisJenkinsCS commented 4 years ago

That's fine if it provides coarser grained synchronization over the map, it just needs to have much, much finer grained synchronization options as well, especially in distributed contexts.

For example, if this was enforced as part of an interface, it would be far too restrictive, and I think that whatever we use should not become standard, just specific to the implementation of the map. For example, the map that Garvit and I have been working on could not support such a coarse grained lock, nor would it be necessary given its design; if the map was rejected as an addition to the language because it did not fit the standard to a tee, it would be a great shame.

To reiterate, I think the Map interface should only contain what is important and should be minimal; the map available in the module should be an implementation of this specification, and not represent the specification itself.

mppf commented 4 years ago

Without too much familiarity with the Map implementation we have today, I hope that your assumption that "if key is not present it will insert it first" is wrong. And a quick test showed that I got an OOB in that case.

That doesn't seem right. The map documentation for proc this says

Get the value mapped to the given key, or add the mapping if key does not exist.

In any case, it doesn't actually matter for the race condition that this issue is talking about.

forall key in input {
  map.add(key);
  map[key]++;
}

Note that in the above I have been assuming that the map held atomics, and that ++ was doing an atomic increment.

My personal expectation here is that when a user writes a loop like this in a naive manner using a map of atomics, they should not get race conditions and core dumps.

@e-kayrakli wrote:

If there is a complication that I am not seeing right now, extending the map API to cover for these complications seems like a reasonable approach to me.

I think, since Chapel is focused on parallel computing, the parallel-safe map type provided in the standard module should not lead users writing the "obvious thing" to core dumps. Extending the API to have a different way that doesn't core dump will not address this problem. (But it could be part of a solution).

@LouisJenkinsCS - thanks for the suggestions I will put them in the issue description as options.

Option 1: Expose synchronization primitive

forall key in input {
   map.acquire(key);
   map[key]++;
   map.release(key);
}

This is actually pretty similar to #12306 in spirit, and generally speaking I'd be happier with it as a strategy if it used some kind of atomic on or something as 12306 describes.

Option 2: Use a transactional approach

forall key in input {
   map.computeKey(key, (key, value)  -> value + 1);
}

If we had set methods only and didn't expose references, we could make this be the only way of updating an atomic element (otherwise compiler error). That would meet my desire for a user writing the obvious loop to not have it fail as a race condition / heisenbug core dump.
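The transactional idea can be sketched outside Chapel as well. Here is a minimal Python analogue (LockedMap and compute are invented names, not from any Chapel module) where the user's update function runs entirely under the map's own lock, so no reference to the element ever escapes:

```python
import threading

class LockedMap:
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def compute(self, key, fn, default=0):
        # the read-modify-write happens atomically under the map's lock;
        # the caller never holds a reference into the map's storage
        with self._lock:
            self._data[key] = fn(self._data.get(key, default))

m = LockedMap()

def worker():
    for _ in range(1000):
        m.compute("k", lambda v: v + 1)

tasks = [threading.Thread(target=worker) for _ in range(4)]
for t in tasks: t.start()
for t in tasks: t.join()
print(m._data["k"])   # 4000: no lost updates
```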

mppf commented 3 years ago

@dlongnecke-cray - I would think we could close this issue now, since we have at least a "for now" solution of prohibiting the references returned with a parSafe=true map?

vasslitvinov commented 2 years ago

How about reference invalidation in purely-serial code, when parSafe=false?

ref element = map[key];
map.add(....); // shuffles the locations of existing elements
writeln(element); // 'element' is an invalid reference now
vasslitvinov commented 2 years ago

For the cases like the above, we can take the position:

For example, we already have this issue with Chapel arrays:

var D = {0..0};
var A: [D] real;
ref r = A[0];
D = {-1..1000}; // likely reallocates A's elements
writeln(r); // likely invalid ref

We have not been concerned with this because this scenario is unlikely.

bradcray commented 2 years ago

How about reference invalidation in purely-serial code, when parSafe=false?

I'm personally comfortable leaving such cases as user errors. Another similar case would be:

var myC = new unmanaged C();
ref x = myC.x;
delete myC;
lydia-duncan commented 1 year ago

After the recent deep dive on the distributed map implementation (see #21039 for the draft code), it seemed like a good idea to talk more deeply about reference invalidation (accessing memory that is no longer valid). Michael and I have come up with the following list of pros and cons to seed discussion:

Pros and cons of requiring the impl to not do reference invalidation

Pros:

Cons:

Note:

Questions to answer:

dlongnecke-cray commented 1 year ago

Incomplete guarantee, deletion and mutation are still subject to race conditions

Can you explain this a bit more for the uneducated like myself 😄?

It seems like if we have no way to get an uncontrolled reference (where by controlled, I mean as something like a formal in an update method) to a collection element, then reference invalidation should not matter. We could still have race conditions regardless of this, but I don't understand how those relate to reference invalidation.

On the whole I'd say that if we are going to make the effort to require that collections ensure referential integrity for common operations, then it is almost certainly to enable support for [] or proc get(...) ref or some way of getting an uncontrolled reference. If we've decided (perhaps for other reasons) not to support that operator, then I don't see any reason why we should require referential integrity.

vasslitvinov commented 1 year ago

requiring the impl to not do reference invalidation

@lydia-duncan I am blanking on this... What does it mean?

dlongnecke-cray commented 1 year ago

I'm not Lydia, but maybe I can help move the conversation forward...

If we used a classic hashtable implementation, certain operations would trigger the table to do a resize + rehash, which shuffles all the elements around in memory. This would be OK except that we might like to offer proc this(k: keyType) ref : valType, i.e., some way to get a reference to a value out of a map. And the moment the map resizes, or even if that element is removed from the map, any reference to that element is invalidated.

That much is probably obvious to you, but requiring the map to not do reference invalidation basically means adopting an implementation where elements are not moved after being inserted into the map. [We may not require this guarantee for every operation, e.g., map.clear() would always invalidate references - because how could it not?]
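To make the mechanics concrete, here is a toy open-addressing table in Python (used instead of Chapel purely so the example is easy to run; ToyTable and its 50% load threshold are invented for illustration, not Chapel's actual ChapelHashtable). Holding a slot index across an insert plays the role of holding a ref:

```python
class ToyTable:
    def __init__(self, cap=4):
        self.slots = [None] * cap   # each slot holds (key, value) or None

    def _probe(self, key, slots):
        # linear probing over integer keys
        i = key % len(slots)
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % len(slots)
        return i

    def set(self, key, value):
        # grow and rehash at 50% load: every entry moves to a new array
        if sum(s is not None for s in self.slots) * 2 >= len(self.slots):
            bigger = [None] * (2 * len(self.slots))
            for s in self.slots:
                if s is not None:
                    bigger[self._probe(s[0], bigger)] = s
            self.slots = bigger     # old slot indices now mean nothing
        self.slots[self._probe(key, self.slots)] = (key, value)

    def slot_of(self, key):
        # analogous to returning a `ref`: an index into the current array
        return self._probe(key, self.slots)

t = ToyTable()
t.set(1, 'a')
t.set(5, 'b')            # 5 % 4 == 1 collides with key 1, lands in slot 2
held = t.slot_of(5)      # like `ref r = m[5]`; held == 2
t.set(2, 'c')            # triggers the resize; key 5 moves to slot 5
print(t.slots[held])     # (2, 'c'): the stale "ref" now aliases key 2
```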

lydia-duncan commented 1 year ago

requiring the impl to not do reference invalidation

@lydia-duncan I am blanking on this... What does it mean?

@vasslitvinov I've updated my previous comment to include the reminder I provided in the email drawing the team's attention to this post. Between that and David's comment, does that help?

Incomplete guarantee, deletion and mutation are still subject to race conditions

Can you explain this a bit more for the uneducated like myself 😄?

It seems like if we have no way to get an uncontrolled reference (where by controlled, I mean as something like a formal in an update method) to a collection element, then reference invalidation should not matter. We could still have race conditions regardless of this, but I don't understand how those relate to reference invalidation.

Sure! I think this disconnect is due to how I've presented the list, apologies. The pros and cons here are solely about what it would take to enable [] while making it as safe as possible - it's starting from the position of "it seems like people want [], are the tradeoffs worth it?" This con in particular is saying that we can't guarantee the avoidance of reference invalidation completely while still providing references. Even in the case where we provide a lock for an individual element for modification (which is probably the safest way), we can still end up in a race when one task removes the element and another tries to give it a different value. That's obviously not something the user should have tried to do, but it's not something we can really protect against either, so our guarantee would always be at least a little incomplete.

On the whole I'd say that if we are going to make the effort to require that collections ensure referential integrity for common operations, then it is almost certainly to enable support for [] or proc get(...) ref or some way of getting an uncontrolled reference. If we've decided (perhaps for other reasons) not to support that operator, then I don't see any reason why we should require referential integrity.

I believe our decision not to support that operator was purely due to this concern about reference invalidation. At least, I haven't heard anyone bring up a different objection to providing it and it seems like there is a desire to have it. I guess part of what I'm wondering as part of this discussion is if our desire to provide it is sufficiently strong that we should ensure our implementation has that property, even if it later comes with performance penalties or makes the implementation more complex or something along those lines. Before the meeting last week, we'd been exploring the implementation from the perspective that we wouldn't provide [], but maybe that's the wrong choice and we want to be certain we've fully thought through our decision.

jeremiah-corrado commented 1 year ago

If we decide that we want to be able to provide references into the map, here is another approach to consider (similar to the synchronization primitive idea from above):

The user facing interface for the context manager might look like this:

use SemiStaticMap;

var m = new semiStaticMap(int, int);
var srm = new staticRefsManager(m);

forall i in 0..100 with (ref srm) do
  manage srm as sm do
    sm[i % 10] += 1;

Under the hood, the context manager would increment an atomic counter in the map upon entry, and decrement it upon exit. Whenever this counter is decremented to zero, the map would check its load factor and rehash if needed. (This would still invalidate references created outside of a managed-context).


record staticRefsManager {
    var managedMap;

    proc init(map) {
        this.managedMap = map;
    }

    proc ref enterThis() ref: semiStaticMap {
        this.managedMap.incrementStaticRefsCount();
        return this.managedMap;
    }

    proc ref leaveThis(in err: owned Error?) {
        this.managedMap.decrementStaticRefsCount();
        this.managedMap.maybeRehash();
        if err then try! { throw err; }
    }
}

record semiStaticMap {
...

  proc maybeRehash() {
    if this.staticCount.read() == 0 {
      if this.loadFactor() > LoadFactorThreshold {
        this.lock();
        this.rehash(this.size * 2);
        this.unlock();
      }
    }
  }

...
}

This approach might be useful for addressing the reference invalidation problem (at least in the context of parallel loops) for the distributed and parallel maps.

It provides the additional benefit that the map can still rehash if it becomes much larger than the default size.

mppf commented 1 year ago

one additional Pro of providing the square bracket / preventing the implementation from doing reference invalidation is that it smooths over some special stuff that would have to happen with owned. For owned values in a map, if you have something returning a reference to an owned value, that is easy enough to work with. But if, for example, you just sought to have myMap[i] return by value (since you can't return by reference), then that would represent ownership transfer for owned, but that is probably not what is usually desired when accessing elements in a map of owned values. As a result, today we have getBorrowed, but that method has the problem that it doesn't work for all element types (it only makes sense for class types). If I am writing generic code using a map, should I use getBorrowed because the element type might potentially be an owned class? That seems like something that could trip up developers, and the whole thing is avoided if we allow returning references with myMap[i] (and the references are not invalidated when adding to the map).

lydia-duncan commented 1 year ago

One other thing we noticed while Jeremiah was working on his alternative implementation is that the internal ChapelHashtable implementation today tracks when an item was deleted instead of immediately deleting the item. We believe this strategy can be used to avoid the potential for races between an update and a deletion to the same pair.
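The tombstone strategy described here can be sketched as follows (a Python toy with invented names; the real ChapelHashtable details may differ). The key point is that remove marks the slot instead of erasing it, so later entries in the same probe chain stay reachable and in place:

```python
TOMBSTONE = ("<deleted>",)   # sentinel marking a deleted slot

class TombstoneTable:
    def __init__(self, cap=8):
        self.slots = [None] * cap  # (key, value), TOMBSTONE, or None

    def _find(self, key):
        # linear probing over integer keys; for simplicity this sketch
        # does not reuse tombstone slots on insert, though a real table would
        i = key % len(self.slots)
        while self.slots[i] is not None:
            if self.slots[i] is not TOMBSTONE and self.slots[i][0] == key:
                return i
            i = (i + 1) % len(self.slots)
        return i  # first empty slot (insert position)

    def set(self, key, value):
        self.slots[self._find(key)] = (key, value)

    def remove(self, key):
        # mark, don't erase: nothing else in the table moves
        self.slots[self._find(key)] = TOMBSTONE

t = TombstoneTable()
t.set(0, 'a'); t.set(8, 'b')    # 8 collides with 0, lands in slot 1
t.remove(0)                     # slot 0 becomes a tombstone; slot 1 untouched
print(t.slots[1])               # (8, 'b'): still in place, still findable
print(t._find(8))               # 1
```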

mppf commented 1 year ago

@jeremiah-corrado - I think the idea of decoupling the rehashing from the usual updates is interesting. I am worried it adds some complexity that users have to be aware of (at least to get good performance) and it's complexity they might not be familiar with from other languages / maps. And of course, there are other approaches that just never move elements -- but I don't know if we can estimate the performance difference between a tell-me-when-to-rehash approach and something that stores all values in separate heap elements all the time.

I can think of a few related ideas:

  1. In the associative array / associative domain implementation, IIRC, it actually does use this kind of idea (maybe just putting off deletion?) in order to implement resizing several arrays when their domain changes.
  2. Potentially, we could use a "smart pointer" idea simply to cause the map not to rehash yet. I think we have a number of challenges to get "smart pointers" to work at all, though. (Mainly: they should behave like references, but references are built-in types, and we don't currently have user-defined implicit conversion.) At least it would be a pretty simple use case, though.
  3. Some languages have an idea of "freeze"ing a map once the set of keys are known. This idea is interesting algorithmically because, IIRC, it's technically possible to construct a perfect hash at that point. I don't know if any implementations actually do that, though. Nonetheless, they might get performance benefit from the concept in other ways.
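The "freeze" concept in item 3 might look roughly like this (a hypothetical Python sketch; no perfect hashing is attempted here, only the fixed-key-set guarantee that would make references safe):

```python
class FreezableMap:
    def __init__(self):
        self._data = {}
        self._frozen = False

    def set(self, key, value):
        # once frozen, the key set is fixed, so elements never move again;
        # a real implementation could build a perfect hash at freeze time
        if self._frozen and key not in self._data:
            raise RuntimeError("cannot add keys to a frozen map")
        self._data[key] = value

    def freeze(self):
        self._frozen = True

m = FreezableMap()
m.set("a", 1)
m.freeze()
m.set("a", 2)          # updating an existing key is still allowed
try:
    m.set("b", 3)      # adding a new key is not
except RuntimeError as e:
    print("rejected:", e)
print(m._data["a"])
```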
jeremiah-corrado commented 1 year ago

I am worried it adds some complexity that users have to be aware of (at least to get good performance) and it's complexity they might not be familiar with from other languages / maps

I have the same concern about introducing an unfamiliar interface to a familiar data structure. However, I do wonder if this is an area where we could potentially afford to innovate rather than relying on precedent.

w.r.t (at least to get good performance), I think that is only true if the initial capacity of the map is much smaller than the required capacity. See the following point...

but I don't know if we can estimate the performance difference between a tell-me-when-to-rehash approach and something that stores all values in separate heap elements all the time

(w.r.t something that stores all values in separate heap elements all the time, you might be referring to a different implementation that I'm unaware of, so the following might be irrelevant):

In my prototype implementation, I built the hash-table as an array of linked lists. Each array entry corresponds to some hash value, and new keys with that value are appended to the end of the linked list. As the map's load factor gets much larger than 1, access times necessarily get pretty bad because a linear search across the relevant list is required.

This is avoided by either (a) correctly predicting the size of the map ahead of time, s.t. the array is long enough to keep the load-factor low, or (b) allowing the map to grow the array and rehash when load factor gets too large.

So, in scenarios where users either choose a bad capacity to begin with, or don't have a good size-estimate when the map is constructed, it seems like the rehashing map would be able to provide better access-time-performance. (compared to a map with the same implementation that is never allowed to rehash).
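A minimal sketch of this chained design (in Python for convenience; the names and the load-factor threshold of 2 are illustrative, not taken from the prototype). Lookups scan the relevant bucket, so access cost grows with load factor unless the table is allowed to grow and rehash:

```python
class ChainedMap:
    def __init__(self, nbuckets=4):
        self.buckets = [[] for _ in range(nbuckets)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        b = self._bucket(key)
        for pair in b:
            if pair[0] == key:
                pair[1] = value
                return
        b.append([key, value])
        self.count += 1
        if self.count > 2 * len(self.buckets):  # load factor > 2: rehash
            self._grow()

    def _grow(self):
        # double the bucket array and redistribute every entry
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]
        for b in old:
            for pair in b:
                self._bucket(pair[0]).append(pair)

    def get(self, key):
        # linear search across the bucket: cheap only at low load factor
        for pair in self._bucket(key):
            if pair[0] == key:
                return pair[1]
        raise KeyError(key)

m = ChainedMap()
for i in range(100):
    m.set(i, i * i)
print(m.get(7), len(m.buckets))   # grew from 4 to 64 buckets along the way
```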

Some languages have an idea of "freeze"ing a map once the set of keys are known.

This idea is also really interesting to me. It sounds especially useful for the histogram-ish pattern, where there is an initial construct-the-histogram phase, and then an analyze-the-histogram phase. (where, during the 2nd phase, perfect hashing would potentially improve performance and the frozenness would obviate the need to worry about reference invalidation).

mppf commented 1 year ago

In my prototype implementation, I built the hash-table as an array of linked lists. Each array entry corresponds to some hash value, and new keys with that value are appended to the end of the linked list. As the map's load factor gets much larger than 1, access times necessarily get pretty bad because a linear search across the relevant list is required.

Right, so at best, accessing a hashtable element needs to access 2 cache lines: the hashtable entry and then the value stored off of a pointer there. In fact I think it is similar to the simpler approach of storing classes containing the values in the existing hashtable (sort of storing Box(valueType)). Compare with our current implementation, which just needs to access 1 cache line in the best case. I would imagine there is another way to have better cache efficiency while not moving elements all of the time.

I was imagining that an implementation could have a sort of overflow buffer where it would store elements added since the last rehash-point; this itself could be implemented with a linked-list hashtable like you have. Then on a rehash-point, it could migrate these values to be stored directly in usual open addressing way. (I am not trying to say that is necessarily better; just something I was thinking about).
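A very rough sketch of this overflow-buffer idea (Python, with ordinary dicts standing in for the open-addressed main table and the linked-list side structure; all names are hypothetical). New keys land in the side structure between rehash-points, so entries in the main table never move except at an explicit rehash-point:

```python
class OverflowMap:
    def __init__(self):
        self.main = {}      # stands in for the open-addressed table
        self.overflow = {}  # stands in for the linked-list side table

    def set(self, key, value):
        if key in self.main:
            self.main[key] = value   # main-table entries never move
        else:
            self.overflow[key] = value

    def get(self, key):
        return self.main[key] if key in self.main else self.overflow[key]

    def rehash_point(self):
        # the only moment at which existing entries may migrate/move
        self.main.update(self.overflow)
        self.overflow.clear()

m = OverflowMap()
m.set(1, 'a')
m.rehash_point()
m.set(2, 'b')          # goes to overflow; the entry for key 1 does not move
print(m.get(1), m.get(2), len(m.overflow))
```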

jeremiah-corrado commented 1 year ago

Ok, that all makes sense, and I like both of those ideas. I think I'll leave my prototype as is for now, and then potentially switch to one of those implementations (boxed open addressing, or open addressing w/ overflow) if we decide to pursue this implementation direction further.

lydia-duncan commented 1 year ago

Tying this back to the main drive of the discussion, it seems like:

With these additions in mind, and the previously described pros and cons, plus the precedent from other solutions, how do folks feel about whether or not we should support [] and potentially getReference with the guarantee that they will not invalidate references?

mppf commented 1 year ago

how do folks feel about whether or not we should support [] and potentially getReference with the guarantee that they will not invalidate references?

My 2¢: I think we should support [] and add the guarantee that we won't invalidate references until certain events occur. I think that there are many ways that we can potentially implement this & we should (eventually) study those with an eye towards high performance. (To reach API stability, we can use a simple approach, like creating a class instance to hold each hashtable value). I think it is likely we will still need a slightly different API to do aggregation. That interferes a bit with gradually migrating from sequential/local code to parallel/distributed code, but I think that is OK.

vasslitvinov commented 1 year ago

At a high level, I would like Chapel to provide as many safety guarantees as possible. In this case, it would be either no [] or invalidation-safe [], for example via the use of smart pointers under the hood. This is because Chapel is a high-productivity language. Each safety hole takes us farther away from that title.

My understanding is that this is where the discussion is going. If we allow occasionally-invalidating [] today, we can plan for and gradually work towards tightening the safety guarantees and improving performance while remaining backwards-compatible.

Other thoughts here:

We could enact more costly safety guarantees upon --checks and fewer guarantees upon --no-checks.

I would like us to consider the value-returning overload of [] to return a borrow in the case of owned elements. The ownership transfer will happen upon a call to a different method -- or upon a [] with an additional enum argument saying "yes, give me the ownership, if the element type is 'owned'". Alternatively, each map instance can have a param flag indicating which behavior to choose upon []. In any case, a map with non-'owned' elements will accept owned-specific arguments and ignore them, for the sake of clean generic code.

The design decisions we accept here need to apply to all our standard collection types. Obviously, we do not want maps to be one way and vectors or sets the other way.

mppf commented 1 year ago

I would like us to consider the value-returning overload of [] to return a borrow in the case of owned elements. The ownership transfer will happen upon a call to a different method -- or upon a [] with an additional enum argument saying "yes, give me the ownership, if the element type is 'owned'". Alternatively, each map instance can have a param flag indicating which behavior to choose upon []. In any case, a map with non-'owned' elements will accept owned-specific arguments and ignore them, for the sake of clean generic code.

Why is having [] return a borrow in the case of owned elements important? It seems to me that the regular behavior will be to return a const ref or possibly a ref and I think this leads to the capabilities I would expect for owned. For example, myList[i] = returnsANewOwned();. Making a special case here would add to complexity of each data structure implementation that follows the rule. In my opinion, the generic code works better when we have owned working as a ref.

dlongnecke-cray commented 1 year ago

I think my stance is still "not right now" on the []. Maybe we could support it in the future. Michael seems confident about it and he's helped a ton with the current list implementation, which avoids invalidation on append().

As much as I want to agree, I think it would spare us pain if we just don't worry about this until after other things are stabilized.

If we do have to support it in the initial API, I think we should support it without any task-safety guarantees right now and document that this is the case. I think that's backwards compatible with future uses in non-serial code (or the user is already doing their own locking).


In classic me fashion, I had not considered getBorrowed. I'd much prefer eating the cost of having [] over keeping API workarounds.

mppf commented 1 year ago

@dlongnecke-cray - I think my main objection to that approach is that it means we will have to stabilize things like map getBorrowed which I don't like (since it seems like it'd be super awkward to use it in generic code if you don't know if you are working with a class type or not). Vs if we allow [] then the straightforward code can work. I also don't think there's any particular challenge to supporting [] in a way that avoids the reference invalidation since we have the approach of adjusting the data structure implementation to store each map element in its own class instance. (That will add overhead and reduce performance in some cases but it would allow us to stabilize the API for Chapel 2.0 and leave improving the performance for the future).
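The "each element in its own class instance" approach can be sketched like this (Python for illustration; Box and BoxedMap are invented names). The table maps keys to box pointers, so a rehash moves the pointers, but a previously obtained box, the analogue of a ref, stays valid:

```python
class Box:
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

class BoxedMap:
    def __init__(self):
        self.table = {}  # key -> Box; rehashing only moves Box pointers

    def set(self, key, value):
        if key in self.table:
            self.table[key].value = value
        else:
            self.table[key] = Box(value)

    def ref(self, key):
        # analogous to `proc this(k) ref`: hand out the box, not a table slot
        return self.table[key]

m = BoxedMap()
m.set("x", 1)
r = m.ref("x")
for i in range(1000):   # plenty of inserts; the dict resizes internally
    m.set(i, i)
r.value += 1            # still a valid handle to "x"'s value
print(m.ref("x").value) # 2
```

As noted, this buys reference stability at the cost of an extra pointer hop per access, which is the performance trade-off discussed above.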

dlongnecke-cray commented 1 year ago

Thanks, Michael. I agree that stabilizing getBorrowed is a more ugly workaround.

I'm on-board the [] train.

I'm sure we'll figure out a super cool, Chapeltastic implementation in the future when given enough time, so as long as the initial approach doesn't cost much developer pain I'm all for it.

lydia-duncan commented 1 year ago

We've decided we'll continue to support [], closing

mppf commented 1 year ago

Is there an issue somewhere to track the TODO of making map not have reference invalidation when adding keys?

lydia-duncan commented 1 year ago

I don't really feel like I have time to look for that, so I'll reopen this until I have that time