Generate a Big Picture through General Project Questions

groundwater commented 11 years ago

Whenever I have a new project, I find that communicating the big vision is important for engaging other people. I thought I would ask questions from an outside perspective, and you/we could develop the answers. This would help me and others understand the project.

What need is jack trying to fulfil? i.e. what is one use-case where jack would be deployed?
Without any IO, how does jack do anything?
Why no module system?
Is the language object-oriented?
Will there be functional-programming constructs? i.e. first-class functions, higher-order functions?
What language constructs did you add that are missing from Javascript?

creationix commented 11 years ago

What need is jack trying to fulfill? i.e. what is one use-case where jack would be deployed?

My first target is as a first programming language and a language that can run where people are. I'll have two backends, one for js runtimes and one for native runtimes (reusing luajit's vm).

Without any IO, how does jack do anything?

That's up to the people binding the language. It's meant to be embedded in projects.

Why no module system?

I may add one, but I'm not there yet.

Is the language object-oriented?

That depends on what you mean. There are certainly no "classes" and no dynamic scope (like "this" in js). The entire runtime is structured around functions, closures, and data structures.

Will there be functional-programming constructs? i.e. first-class functions, higher-order functions?

Yes, functions are values that can be passed around and have lexical closure.

What language constructs did you add that are missing from Javascript?

I'm removing more than I'm adding. Though one important addition will be green threads, which make it so you don't need callbacks for I/O operations. I'm also considering some real share-nothing threads to use more CPU cores.

groundwater commented 11 years ago

How do I, as a developer, bind IO to the language? (Just a general overview)

Well, for example, if you're using jack in a browser context, you bind to the browser's native I/O functions like access to the DOM and other APIs. These bindings will probably be written in JavaScript according to some interface I have yet to design. It won't be that different from writing C++ bindings to V8 or C bindings to Lua, except you'll be writing your bindings in JavaScript (for the browser), or Lua/C for the native platform.

It's meant to be embedded in projects.

Does this mean in an embedded system, like a car/stereo, or a scripting environment built into another program like TextMate?

I mean like how V8 can be embedded in node.js or the browser, or how I embed luajit in luvit.io. It could also be embedded like lua is done with many games, or embedded in webpages to allow for a sandbox for visitors to play with.

That depends on what you mean. There are certainly no "classes" and no dynamic scope (like "this" in js). The entire runtime is structured around functions, closures, and data structures.

Does this mean no obj.method syntax, or something more like Python, where methods are defined as name(self, arg1, arg2) and the this object is passed in directly?

There is obj.prop syntax, but the primary way to get at state will be via closures. It's possible to share functions among several states if you pass in the state as the first variable, but there will be no special language feature for this.
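To make the two styles in that answer concrete, here is a rough JavaScript sketch (not Jack code). The names makeCounter and increment are made up for illustration; the first style keeps state in a closure, the second shares one function among several states by passing the state as the first argument.

```javascript
// Style 1: state lives in a closure; the "object" is just a map of functions.
function makeCounter(start) {
  let count = start; // private state, reachable only through the closures below
  return {
    increment: function () { count += 1; return count; },
    current: function () { return count; }
  };
}

// Style 2: one function shared among many states. The state is passed
// explicitly as the first argument instead of through an implicit `this`.
function increment(state) {
  state.count += 1;
  return state.count;
}

const a = makeCounter(10);
a.increment(); // a.current() is now 11

const s1 = { count: 0 };
const s2 = { count: 5 };
increment(s1); // s1.count is now 1
increment(s2); // s2.count is now 6
```

Neither style needs a special language feature, which seems to be the point of the answer above.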

Yes, functions are values that can be passed around and have lexical closure.

Will native objects like List be built in a functional way? i.e. with cons.

I'm not sure yet if I want lists to be mutable or not. For now, I'll follow JS and let them be mutable and re-sizable.

I'm removing more than I'm adding. Though one important addition will be green threads, which make it so you don't need callbacks for I/O operations. I'm also considering some real share-nothing threads to use more CPU cores.

Are there any similarities to Erlang processes, or Go co-routines?

Sure, some. But nothing intentional.

creationix commented 11 years ago

For an example of how to write OOP code in Jack, see the dragon example, ported from the ruby dragon in "Learn to Program". https://github.com/creationix/jack/blob/master/samples/dragon.jk

groundwater commented 11 years ago

Ahh! I see, everything is a big closure. I have often used this method in JS. So objects are just maps without any this scope.

Next question, why choose { arg1 | ... } syntax over the more common f( arg1 ) syntax?

creationix commented 11 years ago

Syntax is just syntax. I chose {arg1, arg2| ... } for function literals because it's very terse.

creationix commented 11 years ago

I might be willing to change function syntax to be (arg1, arg2){...} if it works in the grammar.

(){
  -- self calling function symmetry
}()

creationix commented 11 years ago

Though there is less need for self-calling functions since every block has its own local scope:

for name in names {
  -- variables created here don't pollute parent scope
}

creationix commented 11 years ago

The other reason I like this syntax is because I want to add curry syntax:

{callback|chunk| ... }

Is like the following JS:

function (callback) { return function (chunk) { ... } }

groundwater commented 11 years ago

I am a big fan of currying.

I'm just thinking of ways to explain things succinctly. Scala has the concept of a block { ... } and every block has a return type. Since there are no types here, we can just say "every block is also a value; the value of whatever the last statement is". E.g.

var y = 1
var x = {
    y + 1
}
// x == 2

In Scala, a block looking like { x => x+1} is a function. A similar explanation might work here where block values can be functions (since functions are first-class):

var y = { x| 
    x + 1
}

Once the block concept is established, you can explain all sorts of other crazy stuff.

for name in names { n| 
  // do something
}

The for-loop only requires a function at the end, which is defined as a block whose value is a function.

I'm mostly thinking out loud here. Feel free to disagree, or set me straight.

creationix commented 11 years ago

Well, there is a semantic difference between what JavaScript and Ruby call blocks and functions. The main difference is around control-flow keywords like return, continue, and break. You can't return from an if statement, but very often you want to return from a function conditionally (as the body of an if statement). So if bodies can't be functions unless we have some sort of named return, like the named breaks and named continues JavaScript has. If there is a way to make named returns work, then I'm all for that. It combines the ideas of block and function nicely. Less to learn is good for this language.

groundwater commented 11 years ago

For myself, I have only ever benefited from abandoning return statements, but it's one of those debates that has no consensus.

I did the Martin Odersky course in functional programming last fall. We were encouraged not to use return, so I can very much attest to the fact that a lack of return does not limit functionality. It is just a matter of style, or preference.

Python loops are built on iterators and generators. The way an iterator ends a loop is by raising an exception (StopIteration). I can't remember the "functional" way of breaking a loop early.

groundwater commented 11 years ago

What are your thoughts on exceptions?

I did some more reading on how Haskell does loops, which it doesn't. They use something like Iteratees, which are complicated to say the least, at least for the average user.

creationix commented 11 years ago

I'd love to avoid early return, break, continue, and exceptions if possible because they complicate control flow. But I'm having a hard time finding alternatives that are accessible to the typical programmer.

groundwater commented 11 years ago

Here are some thoughts on eliminating return, continue, break:

If all statements return a value, including if/then statements, it's much easier to eliminate return.

func1 = {condition|
  if ( condition ) {
    1
  }else{
    2
  }
}
// func1(true) === 1
// func1(false) === 2

For-loops are a tricky subject. Scala just returns something called Unit which is basically undefined. However they also support for-yield which changes the return type to a List of all yielded values. (Yield is not like javascript yield)

The break statement is tricky, because it represents an out-of-band control-flow. You either need exceptions, or an iteratee to handle break. An iteratee approach might be something like:

for( i in x ) { i, continue, done |
    if( i < 0 ){
      continue;
    }else{
      done;
    }
}

Since if has a value, it is either the value of continue or done. Meaning the function returns either the continue object or the done object. Each of these is an iteratee component that tells the loop to either continue or halt. The user can also return a custom object redefining the loop's behaviour. Simple (maybe), and flexible.
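A rough JavaScript sketch of that idea, for anyone who wants to poke at it. The loop helper and the CONTINUE and DONE sentinels are hypothetical names, not anything Jack defines; the body never uses break itself, it only returns one of the two signals.

```javascript
// Hypothetical sentinel objects the loop understands.
const CONTINUE = { signal: "continue" };
const DONE = { signal: "done" };

// Hypothetical helper: calls body(item, CONTINUE, DONE) for each item and
// stops as soon as the body returns DONE. Returns how many items were visited.
function loop(items, body) {
  let visited = 0;
  for (const item of items) {
    visited += 1;
    if (body(item, CONTINUE, DONE) === DONE) {
      break; // the only out-of-band jump lives inside the loop machinery
    }
  }
  return visited;
}

const seen = [];
const visited = loop([-2, -1, 3, 4], function (i, cont, done) {
  seen.push(i);
  // The conditional expression plays the role of an `if` that has a value.
  return i < 0 ? cont : done;
});
// seen is [-2, -1, 3]; visited is 3 because the loop halted at the first non-negative item
```

The control signal stays in-band: it is just the return value of the body, so the body can remain an ordinary function.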

creationix commented 11 years ago

I already have most statements being expressions where possible (the exceptions being obviously return and error since they break control flow.)

Also my loops used to be more powerful. Before there were two variants known as for..in..if and map..in..if.

for name in names if name != "Bob" {
  -- Bob will be skipped
}
--> null

map name in names if name !== "Bob" {
  name
}
--> new array of names without "Bob"

I would love yieldable for loops so that I didn't need the map...if combo. Yieldable loops are even more powerful because some iterations can yield multiple values.

filtered = for name in names {
  if name !== "Bob" {
    yield name
  }
}

I could possibly use the same keyword for both. As an optimization, I could make yield a no-op if the expression result is ignored so that normal loops don't keep allocating empty arrays.

But if I consolidate blocks and functions, then any arbitrary function would need to be able to yield and I essentially now have generators. (not that that's a bad thing, but it does require extra consideration)
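For comparison, this is roughly what yieldable loops look like when they are expressed with JavaScript generators. It is only a sketch of the idea, not a proposal for Jack syntax; withoutBob and withShouting are made-up names.

```javascript
// A loop body that yields zero or one value per iteration (a filter).
function* withoutBob(names) {
  for (const name of names) {
    if (name !== "Bob") {
      yield name;
    }
  }
}

// A loop body where some iterations yield more than one value.
function* withShouting(names) {
  for (const name of names) {
    yield name;
    if (name.length <= 3) {
      yield name.toUpperCase();
    }
  }
}

const filtered = Array.from(withoutBob(["Alice", "Bob", "Carol"]));
// filtered is ["Alice", "Carol"]
const expanded = Array.from(withShouting(["Tim", "Alice"]));
// expanded is ["Tim", "TIM", "Alice"]
```

In JavaScript the array is only built when something like Array.from consumes the generator, which is one way to avoid allocating a result nobody asked for.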

willconant commented 11 years ago

In my own hobby experiments, I have frequently been tempted to consolidate blocks and functions. There seems to be an extremely compelling isomorphism there, but it always turns out to be a mirage.

willconant commented 11 years ago

Thinking about exceptions:

I understand the temptation to eliminate them, but what will you end up doing about runtime errors? It is not uncommon for code to have broken paths or edge cases that should not, nevertheless, crash the entire program.

Think about a web server where 1 in 100,000 requests executes a code path that divides by zero or expects a different data type than it receives. Crashing the whole web server is seldom ideal. Usually you still want to catch the exception at the level of the original request, log it, and send back a 500.

In my opinion, the nasty thing about exceptions in most languages is the awful try/catch syntax which pushes handling of expected errors farther and farther away from the source of those errors and introduces unnecessary braces. I've always liked the idea of a special assignment operator that stops error propagation:

# normal assignment, error will propagate
x = y / 0

# catch-y assignment, error will be assigned to err
# (also, ascii needs more symbols)
x, err @= y / 0
if err { ... }

willconant commented 11 years ago

Of course, that would require a more elaborate unwinding mechanism than you may want. You could always limit such special assignments to only allow function calls on the right-hand side.
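The restricted form (only function calls on the right-hand side) can be approximated in plain JavaScript with a small helper. This is a sketch under that assumption; trap and divide are hypothetical names, and divide throws explicitly because JavaScript's own division by zero yields Infinity rather than an error.

```javascript
// Hypothetical helper standing in for `x, err @= <function call>`.
// It runs the call and returns [value, null] or [null, error].
function trap(fn) {
  try {
    return [fn(), null];
  } catch (err) {
    return [null, err];
  }
}

// Example function that raises a runtime error.
function divide(a, b) {
  if (b === 0) throw new Error("division by zero");
  return a / b;
}

const [x, err] = trap(() => divide(1, 0));
// x is null, err.message is "division by zero"
const [y, err2] = trap(() => divide(6, 3));
// y is 2, err2 is null
```

Calls that are not wrapped in trap still propagate their errors upward, which matches the "normal assignment" case above.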

creationix commented 11 years ago

@willconant Lua uses multiple return values and a built-in assert function for some error handling:

print "enter a number:"
-- io.read returns nil when no number can be read; err may also be nil
n, err = io.read("*number")
if not n then error("invalid input " .. tostring(err)) end

But then in practice, you'll often throw the error anyway if you can't handle it. There seem to be two kinds of errors in practice. The first is those that are expected sometimes and that you know how to handle, like using fs.stat in node.js to check if a file exists. It will return an error in the callback when the file doesn't exist. The second is a completely unexpected error or illegal use of arguments where there is no intelligent way to handle it. Then a scoped handler needs to abort the part of the program that caused the error (like sending a 500 to an http client). But there should also be a way to clean up any resources that were allocated in that request's lifetime.

I'm adding green threads to the language in hopes that they can help encapsulate all these resources and contain them should the thread need to abort.

groundwater commented 11 years ago

Going back to for loops for a moment, I would give serious thought to just straight out copying the for-yield implementation from Scala. I gave it some more detailed reading, and while I can't claim to know all the details, I can say a few cool things.

it supports a generic if-style filtering
it supports complex loop logic via Iteratees
for the beginner user, the syntax is simple, and familiar
it even handles async promises

The basic syntax is as follows:

for ( iterator ) { body }

basic

for( i <- list ) { console.log(i) }

filter

for( i <- list; if i<1 ){ console.log(i) }

yield

for( i <- list ) { yield (i+1) }

comprehension

for( i <- list; j <- calculate(i) ) { yield j }

promises

var accountPromise = for {
  name <- getUserPromise(uid)
  account <- getUserAccount(name)
} { yield account }

The cool part is that the for-loop machinery never changes. It's just a clever application of special containers called Monads and Applicative Functors. Crazy names, but for beginner users they will never need to access them directly. It's a robust machinery upon which cool things can be built. The machinery is also 100% functional, involving no mutable states.

The for loop is just syntactic shorthand for a series of method calls.

for( i <- list ) { yield (i+1) }
// becomes
list.map { i | i + 1 }

Any list object with a .map method can be used here.
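As far as I know, Scala rewrites for-comprehensions into calls to map, flatMap, and withFilter. The same rewriting can be sketched on JavaScript arrays, with filter standing in for withFilter. The calculate function here is a made-up stand-in that returns a list for each element.

```javascript
const list = [1, 2, 3];

// Hypothetical stand-in for calculate(i): produces a list of values per element.
function calculate(i) {
  return [i, i * 10];
}

// for( i <- list ) { yield (i+1) }
const plusOne = list.map((i) => i + 1);
// plusOne is [2, 3, 4]

// for( i <- list; if i<3 ) { yield i }
const small = list.filter((i) => i < 3).map((i) => i);
// small is [1, 2]

// for( i <- list; j <- calculate(i) ) { yield j }
const flat = list.flatMap((i) => calculate(i).map((j) => j));
// flat is [1, 10, 2, 20, 3, 30]
```

Every generator except the last becomes a flatMap and the last becomes a map, so any container that supplies those methods can plug into the same loop syntax.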

willconant commented 11 years ago

I'm definitely not suggesting that errors should just be returned as the second value of functions. In my opinion, errors should propagate up by default, and any new language should set that precedent in its standard library.

(Disclaimer: I realize that none of this may match your goals for the Jack language. I just figured I'd share some of my recent musings about exceptions in case you find them useful.)

I think that ugly try/catch syntax and even uglier error objects lead to a pathological situation where module writers try to figure out which exceptional conditions should be thrown and which exceptional conditions should be returned. Let me give you a concrete example from my own experience with CouchDB.

Couch uses optimistic locking to resolve update conflicts; consequently, it is fairly reasonable from the perspective of client code to expect and handle document update failures. Unfortunately, in most languages throwing an error when there is an update conflict is really clunky:

try {
    couch.putDoc(myDoc);
}
catch (err) {
    // how do I tell if the error was a 409 and not something irrecoverable?
    // let's just pretend the error has a type
    if (err.type == 'conflict') {
        // handle conflict
    }
    else {
        throw err;
    }
}

That's an awful idiom for a fairly common case. In my own couch module, I always return the new revision of the couch document on success, and I return null on an update conflict, and I throw an error in every other case:

rev = couch.putDoc(myDoc);
if (!rev) {
    // handle conflict
}

Now I don't have to worry about the other exceptions because they'll just propagate up, but honestly, this is a really warty solution. What if there are two "recoverable" exceptional conditions? Even in this case, things are fairly lame because at least half of the client code that calls putDoc can't recover from an update conflict. In fact, in real life my putDoc function accepts a flag that tells it whether or not to throw on conflict:

// throws on conflict
rev = couch.putDoc(myDoc);

// returns null on conflict
rev = couch.putDoc(myDoc, true);

If a language had a cleaner way to trap errors and a well-conceived idiom for error meta-data, modules could ALWAYS throw errors:

// any exceptional case propagates up including update conflicts
rev = couch.putDoc(myDoc);

// errors are trapped by client code (putDoc doesn't need to guess caller intention)
rev, err @= couch.putDoc(myDoc);
if (err.type == 'conflict') {
    // handle conflict
}
else {
    throw err;
}

That could be cooler:

// recover is like a switch statement where the default is to re-throw the error
rev, err @= couch.putDoc(myDoc);
recover(err) {
case 'conflict':
    // handle conflict
}

I'm being an astronaut now, but in this sort of system, you'd probably want re-thrown errors to lose their type-y meta-data by default, so an un-handled conflict propagates as a generic error.
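The recover idea can be prototyped as an ordinary function in JavaScript. This is only a sketch: recover, typedError, and the type field are hypothetical, and this version re-throws the original error unchanged rather than stripping its meta-data.

```javascript
// Hypothetical error shape: a plain Error carrying a `type` string.
function typedError(type, message) {
  const err = new Error(message);
  err.type = type;
  return err;
}

// Hypothetical recover(): runs the handler registered for err.type and
// re-throws when no handler matches (the "default" case of the switch).
function recover(err, handlers) {
  if (!err) return undefined; // nothing was trapped, nothing to do
  const known = Object.prototype.hasOwnProperty.call(handlers, err.type);
  if (!known) throw err;
  return handlers[err.type](err);
}

const handled = recover(typedError("conflict", "409"), {
  conflict: () => "retry with latest revision"
});
// handled is "retry with latest revision"

let rethrown = null;
try {
  recover(typedError("timeout", "504"), { conflict: () => "unused" });
} catch (e) {
  rethrown = e;
}
// rethrown.type is "timeout"
```

Because unmatched errors keep propagating, the caller only has to list the cases it actually knows how to handle.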

creationix commented 11 years ago

@jacobgroundwater Interesting, but I don't want to stray too far from javascript and lua semantics. This will most likely be a procedural language with mutable state everywhere.

So the main problems I'm trying to solve are:

1 - How can I early exit from a function based on some condition? This needs to work inside if block bodies and loop bodies.

2 - I would like to merge blocks and functions while still keeping #1 possible.

3 - How will error propagation work? This is a very runtime-driven (dynamic) language, so there will be many runtime errors.

creationix commented 11 years ago

In the end, all code becomes assembly which has just (un)conditional gotos for control flow. The question is how far from that abstraction do we want to be?

groundwater commented 11 years ago

How can I early exit from a function based on some condition? This needs to work inside if block bodies and loop bodies.

Returning is easy, breaking is hard.

// function example
{ i |
  if i > 0 {
    // do something
  } // otherwise return without doing anything
}

// loop example
list.forEach { i |
    if CONDITION {
      // do something
   } // otherwise return
}

To break from a loop, you need to communicate out-of-band information. Either you need another means to signal the loop, like continue/throw, or each iteration needs to return an object indicating what to do next.

I have already weighed in with my opinion: I think Haskell/Scala did it right, and that you can use their underlying machinery without messing up your desired semantics.

As mentioned before, another solution is to use an exception. Before anyone starts objecting, Python does this and it works fine for them. I would argue there really is no difference between break and throw. The for-loop catches the throw. At most it costs you a stack-trace, but I bet Python has a lightweight exception that omits the trace.
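Here is a small JavaScript sketch of break-as-throw. The each helper and the BREAK sentinel are made up for illustration. Throwing a plain object instead of an Error means no stack trace is captured, which is roughly the lightweight exception described above.

```javascript
// Hypothetical sentinel used only to signal "stop the loop".
// It is a plain object, so throwing it does not capture a stack trace.
const BREAK = Object.freeze({ name: "break" });

// Hypothetical each(): the loop machinery catches BREAK and swallows it.
// Any other thrown value keeps propagating to the caller.
function each(items, body) {
  try {
    items.forEach(body);
  } catch (e) {
    if (e !== BREAK) throw e;
  }
}

const seen = [];
each([1, 2, 3, 4], function (i) {
  if (i === 3) throw BREAK; // acts like `break`
  seen.push(i);
});
// seen is [1, 2]
```

The loop body stays an ordinary function; the only special knowledge lives in each, which decides what the signal means.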

I suppose I could summarize by asking if you're looking for in-band or out-of-band control?

creationix commented 11 years ago

I guess the big difference between a block and a function is that you can pass a function around as a first-class value, so it can be run outside its original lexical scope. What would named returns look like?

function closure(a) {
  return function (b) {
    return  a + b
  }
}
closure(4)(3) // -> 7

But what if the inner function tried to use a named return to jump out and return from closure?

function closure(a) {
  return function (b) {
    // made up syntax that does a "named" return much like a named break or named continue 
    return a + b to closure;
  }
}

This is obviously bad because the original closure function already returned the closure itself. It can't return again unless the code is re-wound to the point where closure originally returned. But there are cases outside of closures where this behavior is desired and that's what blocks do naturally.

function fib(i) {
  if (i <= 2) {
    return 1
  }
  return fib(i - 1) + fib(i - 2);
}

We don't think twice about how this is doing crazy things and acting like a goto and jumping past the block in the if body. What if if was a function and not a keyword?

function fib(i) {
  if(i <= 2, function () {
    return 1 to fib
  })
  return fib(i-1) + fib(i - 2)
}
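One way to experiment with this in today's JavaScript is to emulate the made-up `return 1 to fib` with a throw that fib itself catches. Everything here is hypothetical scaffolding: ReturnTo is an invented marker class and when is an invented stand-in for an if that is an ordinary function.

```javascript
// Hypothetical marker carrying the value of a "named return".
class ReturnTo {
  constructor(target, value) {
    this.target = target;
    this.value = value;
  }
}

// Hypothetical `if` as an ordinary function: runs the body when the condition holds.
function when(condition, body) {
  if (condition) body();
}

function fib(i) {
  try {
    when(i <= 2, function () {
      throw new ReturnTo(fib, 1); // stands in for `return 1 to fib`
    });
    return fib(i - 1) + fib(i - 2);
  } catch (e) {
    // The innermost active call of fib receives the named return.
    if (e instanceof ReturnTo && e.target === fib) return e.value;
    throw e;
  }
}
// fib(10) is 55
```

This only works while the call to fib is still on the stack. If the inner function escaped and ran after fib had returned, nothing would be left to catch the throw, which is the same problem as in the closure example above.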

I'm looking into how Ruby does this, which is inspired by Lisp according to Matz (the creator of Ruby).

As I understand it, Ruby "blocks" and "procs" can "continue" or "break" to unwind the stack various amounts. If you "return" from a block/proc, it will return from the lexical method/lambda, just like returning from a JavaScript block.

The difference from JS, however, is that Ruby blocks/procs are closures and can be passed around as first-class values. (You can never get a reference to a JS block, only to JS functions.)

creationix / jack

Generate a Big Picture through General Project Questions #3