Design for getting an unmanaged handle to a managed class

chapel-lang / chapel

a Productive Parallel Programming Language

https://chapel-lang.org

Other

1.78k stars 420 forks source link

Design for getting an unmanaged handle to a managed class #23731

Closed jabraham17 closed 8 months ago

jabraham17 commented 11 months ago

As of Chapel 1.32, we do not support a public way to get an unmanaged handle to a managed class that doesn't affect the lifetime of the object. Prior to Chapel 1.31, the way to do this was with a cast from a borrowed to unmanaged.

class MyClass {}

var a = new owned MyClass();
var b = a : unmanaged; //implicit borrow of a; a's lifetime is unaffected
//var b = a.borrow() : unmanaged

This allows a user to write that majority of code using a managed class and then opt out of the lifetime checker for some cases.

To restore this ability to Chapel, I propose two options

Restore the cast from borrowed to unmanaged
Add a new type method on borrowed which does the same thing
- This PR has an implementation of a borrowed.release method, although this is just a prototype name
  - other names: borrowed.get, borrowed.toUnmanaged, borrowed.getUnmanaged
  - Another option would be a type method on unmanaged: e.g. unmanaged.get, unmanaged.fromBorrowed

I would like to make the case for option 2, adding a type method. Currently, we only support .adopt() and .release() as ways of changing the management style for objects. I think that a cast to unmanaged doesn't quite fit with this style of code. More importantly, .adopt() and .release() have a strong advantage over casts; they are very explicit and stand out in the code. I think a cast is sometimes too subtle.

Consider this snippet, using a cast to unmanaged

class MyClass {}

{
  var a = new MyClass();
  var b = a : unmanaged;
  // do something with b
  delete a;
} // a will be deallocated at end of scope, causing a double free

This is a programmer introduced bug and probably not what was intended. Using a method call, this could be var b = borrowed.toUnmanaged(b);. When reading code, this sticks out to me more than just a cast.

mppf commented 11 months ago

Add a new type method on borrowed which does the same thing

Casting to unmanaged seems simple enough to me, but if we add a type method, IMO it would make more sense for it to be a type method on unmanaged. E.g. unmanaged.get(a.borrow()).

bradcray commented 11 months ago

For some reason, and it may just be the nature of unmanaged being...well, unmanaged... a type method seems a bit overwrought to me for the borrow->unmanaged use case. But I suppose the main downside of using a cast is that it'd be an outlier amongst conversions between management styles?

mppf commented 11 months ago

But we can still cast to borrowed, right? IMO casting to unmanaged is more similar to that than a something like cast from owned to shared.

jabraham17 commented 11 months ago

But we can still cast to borrowed, right? IMO casting to unmanaged is more similar to that than a something like cast from owned to shared.

Yes we can still cast to borrowed, since we can implicitly convert to borrowed we must be able to cast.

Casting from owned to shared is explicitly changing the lifetime of the object, the owned object must be now dead if the shared is to take ownership. Whereas the cast to borrowed is essentially just an implicit borrow written explicitly. The name borrowed strongly implies that this object is not ours and we are not responsible for cleaning it up, we are just using it.

The cast to unmanaged is very similar to the cast to borrowed since it doesn't change the lifetime of the object, however when reading the code there is a big difference. The name unmanaged implies that we own the object and the compiler will not manage its lifetime, we must clean it up (or leak). But for an unmanaged object resulting from a cast to unmanaged, this is not the case.

mppf commented 11 months ago

The name unmanaged implies that we own the object and the compiler will not manage its lifetime, we must clean it up (or leak). But for an unmanaged object resulting from a cast to unmanaged, this is not the case.

I don't feel that way about unmanaged. I view it more as an indication that the compiler should be "hands off" and that the user should know what they are doing. Programs written to use unmanaged can and do create variables storing a pointer that should never be deleted, because some other variable storing the pointer will do that. It just means it is up to the user to figure it out.

jabraham17 commented 10 months ago

While discussing this issue in the module stabilization meeting today, @jeremiah-corrado gave a third approach. Instead of a cast to unmanaged or a method to convert to unmanaged, we could use an attribute to disable lifetime checking.

A primary use case for getting an unmanaged handle to a managed class is to sidestep the lifetime checker, but rather than giving users an unmanaged instance we could just directly given them control over where the lifetime checker is used. We already have an internal feature for this, pragma "unsafe", which disables lifetime and nilability checking. This pragma can be applied to whole functions or aggregate types, or to individual variable definitions.

I think a really good use case example for this is in our testing system. test/classes/nilability/while-var-2.chpl and test/classes/nilability/while-var-3.chpl both have a loop that looks like this.

  pragma "unsafe"
  var curr = list: borrowed class?;
  while const cur = curr {
    write(" ", cur.elm);
    curr = cur.next;
  }

Since our lifetime checker incorrectly fires for this loop, its necessary to have pragma "unsafe".

Another solution in previous versions of Chapel was to write the following, with a cast to unmanaged

var curr = list: unmanaged class?;
  while const cur = curr {
    write(" ", cur.elm);
    curr = cur.next: unmanaged class?;
  }

Of the two, I think the pragma "unsafe" is a lot cleaner.

One way to make this a user-facing feature would be to make have a stable attribute that users can apply to a borrowed object. A few proposals

@unsafe
@disableLifetime
@noLifetime

I think another advantage to this is that it makes the programmers intention clearer to someone just reading the code and to us as developers of the language. For example a static analysis could identify uses of a @unsafe. This allows users to quickly identify places in their code where they opt out of the lifetime checker explicitly (rather than implicitly with a conversion to unmanaged) and allows us to identify pain points with the lifetime checker for future improvement.

mppf commented 10 months ago

IMO, having a user-facing version of pragma "unsafe" is not a replacement for having a way to convert a borrow to an unmanaged. (but, a user-facing pragma "unsafe" is probably a good idea for other reasons).

Here is my thinking.

Sometimes, we have code that really doesn't work well with owned/shared/borrowed/the lifetime checker. And, sometimes, we have developers that have a C mindset and they just really want to manage their own memory. Of course, we can simply turn off the lifetime checker for part of this code (either with an equivalent to pragma "unsafe" or just by using unmanaged). The trouble comes up when we need to have an object transition between the managed memory world and this code.

For example, suppose we have a graph-like data structure that uses unmanaged to store nodes, but it is documented as not owning any of the pointers it is given. In other words, it never calls delete on these unmanaged objects.

Now, suppose that a user wishes to use this graph-like data structure. Does the unmanaged-ness need to infect all of their code? Well, since the graph-like data structure is effectively using borrowed by another name, it would be safe for an application using the data structure to (for example) create an array of owned Nodes and then give these to the data structure.

Taking it a step further -- the graph-like data structure might be written with an API that doesn't use unmanaged at all. Maybe it accepts a borrowed. That would be good for encapsulation, but to work, it will need to convert the borrowed to an unmanaged internally.

IMO the most Rust-like strategy here would be to:

have some kind of "unsafe" code region decorator
only allow unmanaged within the unsafe code
expect libraries to hide their unsafe code regions within safe APIs

I don't think we can implement this Rust-like strategy without some way of converting from borrowed to unmanaged. Of course, we could make that thing illegal outside of some kind of "unsafe" block if we wanted to.

jabraham17 commented 8 months ago

Noting in an offline discussion, we decided to restore the cast to unmanaged from borrowed. However, we are going to disallow the implicit borrows from owned and shared for only this case. See the example code below.

var myOwned = new MyClass();

myOwned: unmanaged; // not allowed
myOwned.borrow(): unmanaged; // allowed

var myBorrowed: borrowed = myOwned.borrow();
myBorrowed: unmanaged; // allowed