evincarofautumn / kitten

A statically typed concatenative systems programming language.
http://kittenlang.org/

Go all in on local vars #205

Open trans opened 6 years ago

trans commented 6 years ago

Thinking about stack manipulation, what you have said about it, and Kitten, I got this gut feeling that Kitten should go all in on local variables. Thinking it through a little, it would make the language cleaner and less confusing. Local vars could be in the type signature (or do a more Haskelly kind of thing), and you would no longer have to fuss with un-postfixing operators with parentheticals. And it's kind of cool, because it is still concatenative and stack-oriented, but the stack "peels off" on each word call.

evincarofautumn commented 6 years ago

What specific changes would you suggest—could you give any examples? I’m not sure what you mean by “go all in”.

Some people are interested in Kitten because they come from Forth, so they’re accustomed to thinking with stacks; others are coming from Haskell, OCaml, Rust, or C++, where they prefer to use locals most of the time. My own code in the common vocabulary and examples already prefers to use some locals over stack manipulation, except for simple dataflow like this:

instance not (Bool -> Bool):
  // -> x; if (x) { false } else { true }
  if { false } else { true }

instance ~ (Bool, Bool -> Bool):
  // -> a, b; if (a) { b not } else { b }
  swap if { not }

Some Forth people who’ve tried Kitten have avoided locals entirely, and I want to support that use case even though it’s not my personal preference.

A notation for named parameters would be acceptable to me, like:

// This:
define if_then_else<T> (x as Bool, t as T, f as T -> T):
  if (x) { t } else { f }

// Desugars to this:
define if_then_else<T> (Bool, T, T -> T):
  -> x, t, f;
  if (x) { t } else { f }

// Which is the same as this:
define if_then_else<T> (Bool, T, T -> T):
  rot not if { swap } drop
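
The equivalence of the last two definitions can be checked with a small stack model. This is a Python sketch, not Kitten, and says nothing about how the compiler works. The stack is modelled as a list with the top at the end, and the helper names (`rot`, `swap`, `drop`, `run`, `if_then_else_locals`, `if_then_else_pointfree`) are hypothetical. The behaviour of `rot` (third item moves to the top) is inferred from the snippet above, where it has to bring `x` to the top for `not`.

```python
# Toy model (an assumption, not Kitten's implementation):
# the data stack is a Python list, with the top of the stack at the end.

def rot(s):
    # ( a b c -- b c a ): move the third item to the top
    s.append(s.pop(-3))

def swap(s):
    # ( a b -- b a ): exchange the top two items
    s[-1], s[-2] = s[-2], s[-1]

def drop(s):
    # ( a -- ): discard the top item
    s.pop()

def if_then_else_locals(s):
    # Models: -> x, t, f;  if (x) { t } else { f }
    f = s.pop()
    t = s.pop()
    x = s.pop()
    s.append(t if x else f)

def if_then_else_pointfree(s):
    # Models: rot not if { swap } drop
    rot(s)                 # x t f  ->  t f x
    s.append(not s.pop())  # not
    if s.pop():            # if { swap }
        swap(s)
    drop(s)                # keep only the selected branch value

def run(word, *items):
    # Apply a word to a fresh stack and return the resulting stack.
    s = list(items)
    word(s)
    return s

# Both versions leave the same stack for either value of the condition.
agree = all(
    run(if_then_else_locals, x, "t", "f") == run(if_then_else_pointfree, x, "t", "f")
    for x in (True, False)
)
```

In this model both words leave only `t` on the stack when `x` is true and only `f` when it is false, so the named-parameter form and the point-free form describe the same dataflow.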

trans commented 6 years ago

Some Forth people who’ve tried Kitten have avoided locals entirely, and I want to support that use case

That was my initial preference, actually, coming to Kitten with a Forth mindset and thinking the optional local var support was, at best, noise and antithetical to the nature of a concatenative stack-based language. But now I know I was just being stubborn, and missing the bigger picture about Kitten.

Forth doesn't have local vars b/c the design goal for the interpreter/compiler is to be about as simple as possible (a crusade Chuck Moore is still on, check out colorForth). The lack of local vars also helps Forth avoid some of the naming problem (the hardest problem in programming), which it really needs, because it otherwise fares badly in this regard: it uses global vars for heap allocations, and it is untyped, so it has many more word names, e.g. + and f+.

Kitten doesn't have these issues. Since Kitten is typed (and has no global vars), it's already in a better position naming-wise than Forth thanks to polymorphism. And it is already "in the business" of naming things in argument signatures (whereas it is purely documentation in Forth). So, the complexity sacrifice in the compiler has already been made, and with good result -- definitions are much more clear and readable -- which, I'm sure, is the reason you provided local var support too.

So by "all in", I literally mean that all signatures should go ahead and have argument names. The cases where the lack of them seems better are few and differ only slightly, compared to the number of cases where having them greatly improves code clarity/readability.

And notice, your examples miss some nice potential simplifications.

instance not (x as Bool -> Bool):
  x if { false } else { true }

instance ~ (a, b as Bool -> Bool):
  b a if { not }

define if_then_else<T> (x as Bool, t, f as T -> T):
  x if { t } else { f }

It's all still 100% stack-based, but it makes swap, rot, etc. rarely, if ever, needed. It also could possibly make converting infix operators to postfix, e.g. (+), unnecessary (not completely sure about this last part, but I haven't come up with a contrary case yet). Speaking of which, infix notation is probably a hard pill for most Forth comers anyway.

(Side note: To further address the naming problem, Kitten could encourage as "good practice" the use of small, typically single letter, mnemonic if possible, variable names.)

Oh, one other thing, the use of -> for both separating return type in signatures and assigning locals is kind of a bad juxtaposition, as it makes one think they are somehow related when first learning about the language.

evincarofautumn commented 6 years ago

I’m definitely open to allowing this style with named parameters, I just don’t really want to enforce it. Essentially everything either is, or could be, implemented as sugar for point-free code. I still lean toward point-free as a good default for small definitions, and pointful for longer definitions where the dataflow is more complex.

infix notation is probably a hard pill for most Forth comers anyway

Yeah, I know. Infix operators represent a set of tradeoffs, and I’ve come down squarely on the side of allowing them. They’re less familiar to Forth people and infringe on concatenativity, but in exchange they’re more familiar to the non-Forth majority, and help get them in the door. They also allow literal transcription (or even copying & pasting) of math notation and code from other languages, and (I hope) nudge people toward dataflow-oriented thinking rather than strictly stack-oriented thinking. It is unfortunate that the (+) notation for using an infix operator as a postfix word is somewhat noisy, but this syntax has precedent in a few other languages, notably Haskell.

One design I considered in the past was to have a word’s definition determine whether it’s infix or postfix by default, and use parentheses to switch between these two:

// Infix operator
define (+) (…) { … }

1 + 2    // call infix
1 2 (+)  // call postfix

// Postfix operator
define @ (…) { … }

2 3 @    // call postfix
2 (@) 3  // call infix

// Postfix word
define func (…) { … }

3 4 func    // call postfix
3 (func) 4  // call infix

// Infix word
define (beside) (…) { … }

4 beside 5    // call infix
4 5 (beside)  // call postfix

However, that makes it much harder to read unfamiliar code, because how it’s parsed depends on definitions that are far away from the use sites. So I think the current “named = postfix, symbolic = infix, parentheses = override” is a good tradeoff.
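
The point of that rule is that a reader (or parser) can tell how a token is called from its spelling alone, without looking up its definition. A rough Python sketch of the idea follows; it is not Kitten's parser. The function name `call_style` and the identifier pattern are hypothetical, and treating a parenthesised name such as `(func)` as infix is an assumption carried over by symmetry from the alternative design above.

```python
import re

# Hypothetical identifier shape; Kitten's real lexical rules may differ.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def call_style(token):
    """Return 'infix' or 'postfix' for one token, using only its spelling."""
    wrapped = len(token) > 2 and token.startswith("(") and token.endswith(")")
    inner = token[1:-1] if wrapped else token

    # named = postfix, symbolic = infix
    default = "postfix" if NAME.fullmatch(inner) else "infix"
    if not wrapped:
        return default

    # parentheses = override (the (name) case is an assumption, see above)
    return "infix" if default == "postfix" else "postfix"
```

Under this sketch `+` is infix, `(+)` is postfix, and `func` is postfix, all decided at the use site. The rejected design would instead need a table of definitions to classify `beside` or `@`.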

the use of -> for both separating return type in signatures and assigning locals is kind of a bad juxtaposition

I agree, this is counter to Kitten’s rule of thumb that each symbol should have the same (or similar) meaning regardless of where it appears. I should probably change one of them (most likely function arrows) to =>/.

trans commented 6 years ago

Two thoughts about local assignment. First, have you considered reversing the arrow and making it postfix? But I suppose in that case it would require quotation of some sort, i.e.

define if_then_else<T> (Bool, T, T -> T):
  { x t f } <-
  if (x) { t } else { f }

And that wouldn't fit in with the parentheses needed for postfix operators -- it would be (<-) instead, which sucks.

My second thought was that you might not need a notation at all. Just giving an undefined var name could be enough.

define if_then_else<T> (Bool, T, T -> T):
  x t f
  if (x) { t } else { f }

That would be nice, but then it has potential name conflicts with words. So I suppose that can't work either.

Just thinking out loud here a bit.

Camto commented 6 years ago

I'd just like to add that even Haskell allows for a point-free style (kinda). You can define a = b . c even though it actually is a x = b (c x).

trans commented 6 years ago

Local vars go on the return stack, is that right?

evincarofautumn commented 6 years ago

Local vars go on the return stack, is that right?

In the latest version of the compiler that I’ve been working on (nothing pushed publicly yet), both locals and the data stack generally live in registers, or spill to memory on the return stack, just like in a C compiler.