Feature request: a Chapel equivalent to Fortran's select type statement #10656

chapel-lang / chapel

Open ghost opened 6 years ago

ghost commented 6 years ago

Dear Chapel team,

This is a request for a feature equivalent to Fortran's "select type" statement for Chapel, resulting from a question of mine over at Stack Overflow, which Brad was so kind to answer. The original question is here: https://stackoverflow.com/questions/51661711/is-there-an-equivalent-to-fortrans-select-type-statement-in-chapel

I have reworked the original code posted there to make the advantages of such a feature somewhat more apparent. Consider the following OOP code in Chapel:

module SomeAnimals {

  class Animal {
  }

  class Fish: Animal {
    proc swim() {
      writeln("Swimming ...!");
    }
  }

  class Bird: Animal {
    proc fly() {
      writeln("Flying ...!");
    }
  }

  class Hawk: Bird {
    proc fly() {
      writeln("Flying like a Hawk ...!");
    }
  }

  class Duck: Bird {
    proc fly() {
      writeln("Flying like a Duck ...!");
    }
  }

  class Raven: Bird {
    proc fly() {
      writeln("Flying like a Raven ...!");
    }
  }

} // module SomeAnimals

proc main() {

  use SomeAnimals;

  var anim: Animal;

  anim = new Fish();

  var aFish = anim:Fish;
  if aFish then
    aFish.swim();

  delete anim;  
  anim = new Hawk();

  var aBird  = anim:Bird;
  var aHawk  = anim:Hawk;
  var aDuck  = anim:Duck;
  var aRaven = anim:Raven;

  if aBird then
    aBird.fly();
  else if aHawk then
    aHawk.fly();
  else if aDuck then
    aDuck.fly();
  else if aRaven then
    aRaven.fly();    

  delete anim;

} // proc main

Focus in particular on the statements in main() that appear after the first delete anim. These result in a call either to Bird.fly() (if the dynamic type of anim is the generic Bird type) or to one of the overriding versions defined by Bird's subclasses, depending on the (sub)class of Bird from which anim was instantiated. In the particular case given here, the call goes to Hawk.fly(), which prints 'Flying like a Hawk ...!'. Since anim is a polymorphic variable, however, it could acquire different types at runtime in more general code.

To cover all the aforementioned cases in this Chapel code it is presently required to declare four temporary variables (i.e. aBird, aHawk, aDuck, aRaven) and to write a rather longish if-elseif conditional. This could be avoided if Chapel had an equivalent to Fortran's select type statement, as illustrated by the following version of main, as it would be written in Fortran 2003/2008:

program main

   use SomeAnimals

   class(Animal), allocatable :: anim

   allocate(anim, source = Fish())

   select type (anim)
   type is (Fish)
      call anim%swim()
   end select

   deallocate(anim)
   allocate(anim, source = Hawk())

   select type (anim)
   class is (Bird)
      call anim%fly()
   end select

   deallocate(anim)

end program main

Notice that the Fortran version requires neither temporary variables nor casting operators. Moreover, a single "class is (Bird)" clause (encapsulated in a select type statement) is enough to cover all the cases of anim being of type Bird or any of its subclasses. By using a "type is" clause instead of a "class is" clause, Fortran's select type statement can also deal with the non-polymorphic case, where one wishes to test for one specific type only, as illustrated above for the Fish type. Finally, "type is" and "class is" clauses can appear in the same select type statement.

A Chapel statement equivalent to Fortran's "select type" would make Chapel code equally concise.

bradcray commented 6 years ago

Hi @cfdtech -- Thanks for taking the time to file this.

Noting one thing that I may not have made clear in our SO exchange: You can write the code example you had above far more simply as:

proc main() {

  use SomeAnimals;

  var anim: Animal;

  anim = new Fish();

  var aFish = anim:Fish;
  if aFish then
    aFish.swim();

  delete anim;  
  anim = new Hawk();

  var aBird  = anim:Bird;
  // NOTE: simplified here

  if aBird then
    aBird.fly();
  // NOTE: simplified here

  delete anim;

} // proc main

That is, once you've identified that something is a bird, there's no need to figure out which specific subtype of bird it is to make it fly; dynamic dispatch will do that for you. The only reason you need to cast it to a Bird at all is that Animal.fly() doesn't exist. If it did, you wouldn't even need the first cast.
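A minimal sketch of that dynamic dispatch on its own, independent of any cast. One assumption to flag loudly: this is written in the style of more recent Chapel releases, so it uses an owned variable instead of delete and marks the overriding method with override, neither of which appears in the code elsewhere in this thread. The class bodies are trimmed-down copies of the ones in SomeAnimals.

```chapel
class Bird {
  proc fly() {
    writeln("Flying ...!");
  }
}

class Hawk: Bird {
  // Assumption: recent Chapel releases require 'override' here.
  override proc fly() {
    writeln("Flying like a Hawk ...!");
  }
}

proc main() {
  // The static type is Bird, but the object itself is a Hawk.
  // 'owned' (an assumption, see above) frees it at end of scope.
  var b: owned Bird = new Hawk();

  // The call is resolved against the dynamic type, so Hawk's
  // version of fly() runs without any cast to Hawk.
  b.fly();
}
```

The variable is only ever declared as a Bird, yet the Hawk method is the one that runs, which is why the aHawk, aDuck and aRaven checks are unnecessary.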

With this simplification/correction, the Chapel version is about as compact as the Fortran code (14 SLOC in Chapel vs. 16 in Fortran), though the temporary variables are still a little unfortunate. Note that you don't actually need the temporary variables if you don't plan to use them again. For example, I could just write:

proc main() {

  use SomeAnimals;

  var anim: Animal;

  anim = new Fish();

  if anim:Fish then
    (anim:Fish).swim();

  delete anim;
  anim = new Hawk();

  if anim:Bird then
    (anim:Bird).fly();

  delete anim;

} // proc main

which is now down to 12 SLOC, but has some unfortunate duplicate casts. For that reason, I think your feature request still has merit, since it removes some redundancy / pointless code in exchange for a bit more structure (arguably a bit like select statements themselves):

proc main() {

  use SomeAnimals;

  var anim: Animal;

  anim = new Fish();

  typeselect(anim) {  // invented syntax
    when Fish do
      anim.swim(); 
  }

  delete anim;  
  anim = new Hawk();

  typeselect(anim) {  // invented syntax
    when Bird do
      anim.fly(); 
  }

  delete anim;

} // proc main

Now we're back to Fortran's 16 lines, but the sugar gives us the benefit of avoiding the temporary variable and the explicit dynamic cast; and the overhead of the statement itself is reduced when we want to sort an expression into multiple types:

  typeselect(anim) {  // invented syntax
    when Fish do
      anim.swim();
    when Bird do
      anim.fly();
  }

One thing that's a little weird about this proposal is that if you looked at one of the statements like anim.swim();, and then at anim's definition, it appears that you're asking a general animal to swim when it may not be able to. So it has to be understood that, within a when clause of a typeselect statement, anim is a shadow variable whose type is the one named in that when clause.

ghost commented 6 years ago

Hi Brad, and thanks for pointing out/clarifying that dynamic dispatch can be used already now to simplify the example above.

Regarding the temporary variables, I had noticed myself that it is, in principle, possible to get rid of them at the cost of introducing multiple type casts. But I was more concerned about the impact that such a replacement might have on performance than about the temporaries themselves.

I concur with your observation that the case above may look a bit weird in that it asks a general animal to swim (though only after some query). However, I prefer to think of anim as a placeholder (i.e. as a variable in the mathematical sense, or a shadow variable as you say) that can accept any type of animal regardless of its special abilities, and to have any such special functionality (i.e. methods) appear only in subclasses as a means of clear separation of concerns.

bradcray commented 6 years ago

Thinking about this a bit more this morning, I remembered that we've discussed, at times, having a way to query the dynamic type of something at execution time rather than its static type. If we had such a query, it could permit us to use the normal select statement to sort objects into cases, which excited me (no need to add a new statement to the language if we add a capability we already wanted!). However, then I realized that we'd still need a cast to make the call, since nothing about a normal select would cause a type conversion to take place. For example, this might end up looking like the following:

  select(anim.dynType) {  // invented query
    when Fish do
      (anim:Fish).swim();   // cast still required... bummer!
  }

I think this means that the proposal for a new "dynamic type selection statement" (my name for it) is arguably still reasonable and valuable. Just capturing this in case others went down a similar mental path.

ghost commented 6 years ago

You beat me to it, lol!

I was about to ask whether, for reasons of orthogonality in the language, it might not be better to extend the normal select statement in a way similar to what you proposed above, given that it already has the capability to check for the static type.

bradcray commented 6 years ago

It's funny that you ask. In the earliest days of Chapel, we had both select and typeselect statements. Then, at some point, we realized that the things we wanted to do at that time (w.r.t. static types) could simply use select (or if ... then ... else and .type) to get what we wanted, so we got rid of typeselect. Now I'm wondering if the person who suggested typeselect had anticipated this use case (which is why I re-used that keyword here).
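For reference, the static-type use of select with .type mentioned above can be sketched as follows. The generic procedure describe and its messages are hypothetical names chosen for illustration, not something taken from Chapel's libraries.

```chapel
// Hypothetical generic helper: x's static type is known when each
// instantiation of describe() is compiled, so the select below
// sorts on that static type.
proc describe(x) {
  select x.type {
    when int do
      writeln("an int");
    when real do
      writeln("a real");
    otherwise {
      writeln("something else");
    }
  }
}

describe(1);        // argument is an int literal
describe(2.5);      // argument is a real literal
describe("three");  // argument is a string literal
```

Since x.type reports the declared (static) type, a variable declared as Animal but holding a Fish would not select a Fish branch, which is why this form alone does not cover the dynamic case discussed in this issue.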

If you'd be willing to do the cast within the when clause, as before, then I think pursuing the above approach would definitely be the quickest and most minimal way to get this feature.

But even if you are, it seems a little lame, since the when and the dynamic cast are a bit redundant (they're both effectively doing the same thing). And I'm not convinced we'd want to have some sort of special case like "When a select statement is doing a dynType check, a new shadow variable of that type is created..." because it feels too fiddly and fragile. So unless someone comes up with some other cleverness that I haven't thought of, it seems to me that either the .dynType + "cast-within-when" approach or a new statement type is the most promising way to proceed.

ghost commented 6 years ago

I believe you should implement whatever you think is of the greatest long-term benefit for the language.

For the moment (i.e. for the purpose of simply porting some small application to Chapel), I think I could get along with what the language already provides in this respect, although I have no idea how the repeated use of such type casting might affect performance in more complex, real world applications.

ghost commented 6 years ago

This may be slightly off-topic and a rather wild idea, so bear with me, but upon thinking some more about what I'd love to see in the language over the long term I wondered about the following. Why not go a step further and get rid of both the casts and the type guards altogether? Why should a user not be able to code the above example simply as follows?

proc main() {

  use SomeAnimals;

  var anim: Animal;

  anim = new Fish();
  anim.swim();
  delete anim; 

  anim = new Hawk();
  anim.fly();
  delete anim;

} // proc main

This is (in a somewhat paraphrased way) how it would be done in Python. Of course, Python makes this possible using duck-typing and its dynamic type system. Such dynamic duck-typing would be unattractive for Chapel due to the runtime overhead that it incurs. But what about having some form of compile-time, "static duck-typing" in Chapel (or "structural typing", as it is called here: https://en.wikipedia.org/wiki/Structural_type_system)? Of course, this would be more involved to implement, but one might perhaps be able to borrow some ideas from the Go or OCaml languages in this respect.

I think some form of "static duck-typing" would fit wonderfully into the general theme that Chapel strives to be as easy to use as Python, and as fast as (and in this case actually faster than) Fortran (due to the lack of any type guard statements in the above example). Moreover, as far as I can see, this would also fit nicely into the general theme of compile-time type inference used in Chapel.

bradcray commented 6 years ago

Personally, I don't think we'd want to go so far as to let any variable of type Animal fly or swim if it were unable to. For a case as simple as yours above, it is conceivable that the compiler could figure out the types of the variables. But once you start putting in some control flow that can't be resolved statically, the compiler would need to be conservative, and the choices are either (a) to let things pass and get execution-time errors ("This fish can't fly!") or (b) to have things fail and frustrate the programmer who "knows better". The other problem with such an approach is that when a language definition starts trying to say "this is legal when the compiler can figure it out", it causes problems when the compiler is inconsistent or multiple compilers are not consistent with one another.

A related thing to know about that maybe hasn't been clear from the examples we've been talking about is that if you don't declare anim as an Animal but just start initializing it with a specific animal type, the Chapel compiler will infer it to be that type (or any of its subclasses). So for example, a different pattern that does give you exactly what you're hoping for would be this (where I've introduced two scopes just so that I can have two different variables named anim):

proc main() {

  use SomeAnimals;
  {  
    var anim = new Fish();  // inferred to be a Fish
    anim.swim();
    writeln(anim.type:string);
    delete anim; 
  }

  {
    var anim = new Hawk();  // inferred to be a Hawk
    anim.fly();
    writeln(anim.type:string);
    delete anim;
  }

} // proc main

The key here is that I haven't said var anim: Animal = new Hawk(); but have relied on type inference to say "You're initializing this with a Hawk, so it must be a Hawk."

This approach can help with the pattern you're looking for with a lot of programs, though with others (e.g., ones where you need a truly arbitrary animal) it would not.