vapourlang / vapour

Typed superset of R
http://vapour.run
Apache License 2.0
190 stars 3 forks

Vapourize from R syntax #65

Closed b-rodrigues closed 1 month ago

b-rodrigues commented 1 month ago

Would/Should it be possible to write the following vapour code:

create = function(name: char): person {
  return(person(name = name))
}

So essentially, start from the usual R syntax and "vapourize" just enough (only add types to begin with, then "vapourize" more if needed). This could speed up adoption by R users. What do you think?

JohnCoene commented 1 month ago

Your example changes two things relative to Vapour's current syntax:

  1. The function declaration
  2. return() as a function

Return

Taking the points in reverse order, changing return from a function call to a keyword seems pesky, but I think it's valid. Another change this brings up is that in Vapour the return is mandatory (except if the function returns a null type).

In passing, the reason for a mandatory return keyword is that 1) it makes code far more readable (many languages enforce it), and 2) it makes it clearer what the function returns.

What does the function below return?

foo <- function(){
  x <- list()
  x$name <- 2
}

Now, an experienced R user like you may not write a function like that, but it feels weird that the language permits it. Mandatory return removes that.

Then return as a function sometimes makes a mess of the code. The early-return pattern is underutilised in R, I think, yet it makes code vastly more readable, whereas R users tend to favour if - else - if - else - ... (I might be wrong, but this is the impression I get)

foo <- function(x = TRUE) {
  if(x) {
    42
  } else {
    2
  }
}

Whereas the code below is far more readable.

foo <- function(x = TRUE) {
  if(x) {
    return(42)
  } 

  return(2)
}

I'm not clever enough to come up with this stuff; it's all nicked from Go (known for its readability). Mat Ryer has a great post on that.
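For reference, here is a minimal Go sketch of the early-return style mentioned above. The function name `answer` is hypothetical and only mirrors the R `foo` example; it is not taken from Vapour or from the post.

```go
package main

import "fmt"

// answer mirrors the R example above: the special case returns early,
// and the default result stays at the left edge of the function body.
// (Hypothetical name, for illustration only.)
func answer(x bool) int {
	if x {
		return 42
	}

	return 2
}

func main() {
	fmt.Println(answer(true), answer(false))
}
```

In Go, `return` is a statement rather than a function call, so the returned expression is not wrapped in an extra pair of parentheses.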

To finally tackle the initial point, return as a keyword: wrapping the value in return() often makes the code less readable for me, depending on what is returned.

foo <- function(x = TRUE) {
  if(x) {
    return(\(x) {
      !x
    })
  } 

  return(\() {
    2
  })
}

The above is code I can no longer "glance" at, as Mat describes in his post.

Function declaration

The reason for changing the function declaration is far less opinionated though. I see nothing wrong with the way we declare functions in R.

foo <- function(x) x + 1

The above is totally fine but it becomes impossible to understand when dealing with methods.

foo.data.frame <- function(x) x + 1

Is the above a method foo on data.frame, or a method foo.data on an object of class frame? It's actually impossible to know for sure from the code alone. At run time you may be able to guess from what is present in the environment, but it's clumsy. And again, readability is not great either. I'm fairly confident it's a mistake R made: it should either have changed the way we declare methods or not allowed . in identifiers.

Idiomatic Go looks strangely similar to R. Go essentially has something very similar, if not identical, to S3 method dispatch, so to "solve" that problem I thought I would nick its syntax.

func foo(x: int): int {
  return x + 1
}

func (x: int) foo(): int {
  return x + 1
}

I think the above is far clearer: I can see at a glance what is a function, what is a method, and what it will dispatch on.
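For comparison, this is roughly how the Go declarations that inspired this syntax look. It is a sketch only: the type `Count` and both function names are hypothetical. Go does not allow methods on built-in types such as `int`, so a named type stands in for the receiver.

```go
package main

import "fmt"

// Count is a hypothetical named type; Go only permits methods on types
// defined in the same package, so a plain int cannot be a receiver.
type Count int

// addOne is a plain function: no receiver before the name.
func addOne(x int) int {
	return x + 1
}

// AddOne is a method: the receiver (c Count) names the type it belongs to.
func (c Count) AddOne() Count {
	return c + 1
}

func main() {
	fmt.Println(addOne(1))
	fmt.Println(Count(1).AddOne())
}
```

The receiver in parentheses plays the role that the class suffix plays in an R method name such as foo.data.frame, but it cannot be confused with part of the identifier.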

Again these are my thoughts, feel free to push back on them.

I should really put this in the documentation...

b-rodrigues commented 1 month ago

Your points make total sense, and agreed, they should be in the docs :)

When I opened this issue, I was thinking about how Cython allows you to write standard Python code that you then "cython-ize" step by step, making it look more and more like C and improving performance along the way. See for example this blog post: https://www.machinelearningplus.com/python/how-to-convert-python-code-to-cython-and-speed-up-100x/

I was thinking that maybe something similar could be done here, to ease the adoption of Vapour.

JohnCoene commented 1 month ago

Hah, in my rambling I managed to not answer the initial question...

The point that changing the syntax the way Vapour does will force R developers to make more effort to adopt the language is certainly valid.

Vapour is still in its infancy; it's going to take much more work to get to where I want it to be. But the aim is to arrive at something genuinely nicer to work with than R itself. I'm not trying to sound arrogant, but if the goal isn't to arrive at something better than R, then what's the point?

The reasoning is that once Vapour is at that stage, R developers will see past the, I would argue, minor changes in the syntax.

Another "rationale" behind this is that once you've changed the function declaration "in R" you've actually changed a lot already, since evidently there are a lot of functions in a functional programming language... And thus, by that point, it didn't feel wrong to change a handful of other things syntax-wise, since I believe it results in much more readable/maintainable code.

I have a feeling R users can get over the new function declaration syntax quicker than they could accept Vapour taking away the <- assignment :)

JohnCoene commented 1 month ago

I've added this to the documentation site (next version) and thus will close.