keean / zenscript

A trait based language that compiles to JavaScript
MIT License

Compiler Implementation #18

Open keean opened 7 years ago

keean commented 7 years ago

Discussion about the ongoing implementation of the compiler.

keean commented 7 years ago

The parser now has an implementation of the indentation aware parser described in this paper: https://pdfs.semanticscholar.org/cd8e/5faaa60dfa946dd8a79a5917fe52b4bd0346.pdf

Here's the implementation of the indentation parser:

    function IndentationParser(init) {
        this.indent = init
    }
    IndentationParser.prototype.get = function() {
        return this.indent
    }
    IndentationParser.prototype.set = function(i) {
        this.indent = i
    }
    IndentationParser.prototype.relative = function(relation) {
        var self = this
        return Parsimmon.custom((success, failure) => {
            return (stream, i) => {
                var j = 0
                while (stream.charAt(i + j) == ' ') {
                    j = j + 1
                }
                if (relation.op(j, self.indent)) {
                    self.indent = j
                    return success(i + j, j)
                } else {
                    return failure(i, 'indentation error: ' + j + relation.err + self.indent)
                }
            }
        })
    }
    IndentationParser.prototype.absolute = function(target) {
        var self = this
        return Parsimmon.custom((success, failure) => {
            return (stream, i) => {
                var j = 0
                while (stream.charAt(i + j) == ' ') {
                    j = j + 1
                }
                if (j == target) {
                    self.indent = j
                    return success(i + j, target)
                } else {
                    return failure(i, 'indentation error: ' + j + ' does not equal ' + target)
                }
            }
        })
    }
    IndentationParser.prototype.eq  = {op: (x, y) => {return x == y}, err: ' does not equal '}
    IndentationParser.prototype.ge  = {op: (x, y) => {return x >= y}, err: ' is not equal or greater than '}
    IndentationParser.prototype.gt  = {op: (x, y) => {return x > y}, err: ' is not greater than '}
    IndentationParser.prototype.any = {op: (x, y) => {return true}, err: ' cannot fail '}

This is what a parser using these new parser combinators looks like:

    block = Parsimmon.succeed({}).chain(() => {
        var indent = Indent.get()
        return Parsimmon.seqMap(
            Indent.relative(Indent.gt).then(statement),
            (cr.then(Indent.relative(Indent.eq)).then(statement)).many(),
            (first, blk) => {
                blk.unshift(first)
                Indent.set(indent)
                return {'blk' : blk}
            }
        )
    })

This parses a block of statements: the first line of the block must be more indented than the previous line, and the remaining lines must be indented the same amount as the first line.
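
To make the rule concrete, here is a minimal sketch of the same check in plain JavaScript, without Parsimmon. The names indentOf and readBlock are hypothetical and not part of the zenscript source; the sketch only illustrates "first line greater than the parent, following lines equal to the first".

```javascript
// Count the leading spaces of a line (same loop as in IndentationParser).
function indentOf(line) {
    var j = 0
    while (line.charAt(j) == ' ') {
        j = j + 1
    }
    return j
}

// Hypothetical helper: read a block starting at lines[start].
// The first line must be indented more than parentIndent, and following
// lines belong to the block while their indentation equals the first line's.
function readBlock(lines, start, parentIndent) {
    var first = indentOf(lines[start])
    if (!(first > parentIndent)) {
        throw new Error('indentation error: ' + first + ' is not greater than ' + parentIndent)
    }
    var blk = []
    var i = start
    while (i < lines.length && indentOf(lines[i]) == first) {
        blk.push(lines[i].slice(first))
        i = i + 1
    }
    return {blk: blk, next: i}
}

var src = ['f(x) =>', '    let y = x', '    y', 'g(1)']
var result = readBlock(src, 1, indentOf(src[0]))
// result.blk holds the two indented lines, result.next points at 'g(1)'
```

Unlike the combinator version, this sketch does not nest blocks or restore the saved indent; it only shows the comparison that Indent.gt and Indent.eq perform.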

shelby3 commented 7 years ago

@keean I will catch up with you later on the parser combinator implementation. I haven't employed them ever, so I will need to dedicate some time to that. My first priority is to write the grammar into an EBNF file and check that it is conflict-free, LL(k), and hopefully also context-free. I read that parser combinators can't check those attributes.

Also I will want to understand whether using a monadic parser combinator library forces our AST into a monadic structure, and whether that is the ideal way for us to implement it. Anyway, you are rolling on implementation, so I don't want to discourage you at all. I will try to rally around one way and help code. I will need to study. My focus so far has been on nailing down the syntax and early design decisions. Btw, congrats on getting rolling so quickly on the implementation!

Btw, I hate semicolons. Any particular reason you feel you need to litter the code with them? There are only a very few ASI gotchas in JavaScript (and these I think can be checked with jslint) with not including semicolons and these are easy to memorize, such as not having the rest of the line blank after a return as this will return undefined.

Also I prefer the style of this latest code compared to what I saw before, because I don't like trying to cram too many operations on one LOC. It makes it difficult to read the code IMO.

Also, I think I would prefer to employ arrow functions as follows (we'll be porting to self-hosted later, so we'll have arrow functions as standard for any ES version) and to compromise at 3 spaces indentation (even though I prefer 2 spaces lately):

block = Parsimmon.succeed({}).chain(() => {
   var indent = Indent.get()
   return Parsimmon.seqMap(
      Indent.relative(Indent.gt).then(statement),
      (cr.then(Indent.relative(Indent.eq)).then(statement)).many(),
      (first, blk) => {
         blk.unshift(first)
         Indent.set(indent)
         return {'blk' : blk}
      }
   )
})

Also I would horizontally align as follows because I love pretty code, which is easier to read:

IndentationParser.prototype.eq  = {op: eq(x, y) => {return x == y}, err: ' does not equal '              }
IndentationParser.prototype.ge  = {op: ge(x, y) => {return x >= y}, err: ' is not equal or greater than '}
IndentationParser.prototype.gt  = {op: gt(x, y) => {return x >  y}, err: ' is not greater than '         }
IndentationParser.prototype.any = {op: gt(x, y) => {return true  }, err: ' cannot fail '                 }

I may prefer:

IndentationParser.prototype.eq  = { op: eq(x, y) => {return x == y},
                                   err: ' does not equal '              }
IndentationParser.prototype.ge  = { op: ge(x, y) => {return x >= y},
                                   err: ' is not equal or greater than '}
IndentationParser.prototype.gt  = { op: gt(x, y) => {return x >  y},
                                   err: ' is not greater than '         }
IndentationParser.prototype.any = { op: gt(x, y) => {return true  },
                                   err: ' cannot fail '                 }

Above you are implicitly making the argument again that we should have the ability to name inline functions (without let) in our programming language. Note this would be an alternative solution to the ugly syntax for the case where we need to specify the return type, but afaics we can't unify around (x, y) => x == y without the prefixed name unless we don't use parentheses for anonymous product (tuple) types and remain LL(k). Any idea how ES6 is parsing their arrow functions? LR grammar? Alternatively you would be writing that in our language:

let eq(x, y) => x == y
let ge(x, y) => x >= y
let gt(x, y) => x >  y
let gt(x, y) => true
IndentationParser.prototype.eq  = {op: eq, err: ' does not equal '              }
IndentationParser.prototype.ge  = {op: ge, err: ' is not equal or greater than '}
IndentationParser.prototype.gt  = {op: gt, err: ' is not greater than '         }
IndentationParser.prototype.any = {op: gt, err: ' cannot fail '                 }

Which would have helped you catch the error on the duplication of the gt name copy+paste typo. The only reason you are adding the redundant naming above is for browser debugging stack traces, correct?

Or (unless we change the syntax):

IndentationParser.prototype.eq  = { op: x y => x == y,
                                   err: ' does not equal '              }
IndentationParser.prototype.ge  = { op: x y => x >= y,
                                   err: ' is not equal or greater than '}
IndentationParser.prototype.gt  = { op: x y => x >  y,
                                   err: ' is not greater than '         }
IndentationParser.prototype.any = { op: x y => true,
                                   err: ' cannot fail '                 }

keean commented 7 years ago

The main reason to use function is backwards compatibility, not all browsers support => yet.

With regards to our syntax, function definition should be an expression, so you should be able to include it inline in the object declaration. I think we would end up with something like this:

data Relation = Relation { op : (A, A) : Bool, err : String }

let eq = Relation { op: eq(x, y) => x == y, err: ' does not equal ' }

shelby3 commented 7 years ago

@keean wrote:

The main reason to use function is backwards compatibility, not all browsers support => yet.

I know. That is why I wrote:

Also, I think I would prefer to employ arrow functions as follows (we'll be porting to self-hosted later so we'll have arrow functions as standard to any ES version

I had already explained we will get backwards compatibility for free, and by not putting function we are more compatible with the way it will be written in our language when we port over.

Who can't run our compiler in a modern browser in the meantime? This is only alpha.

Please re-read my prior comment, as I added much to the end of it.

keean commented 7 years ago

Regarding semi-colons, Douglas Crockford in "JavaScript: The Good Parts" recommends always using semi-colons explicitly because JavaScript's semi-colon insertion can result in the code not doing what you intended.

keean commented 7 years ago

I think you are right about '=>' for functions, as it is running in Node which supports them, however, I don't think porting will be that straightforward, as we won't directly support prototypes etc.

shelby3 commented 7 years ago

@keean wrote:

because JavaScript's semi-colon insertion can result in the code not doing what you intended.

Did you not read what I wrote?

There are only a very few ASI gotchas in JavaScript (and these I think can be checked with jslint) with not including semicolons and these are easy to memorize, such as not having the rest of the line blank after a return as this will return undefined.

http://benalman.com/news/2013/01/advice-javascript-semicolon-haters/

keean commented 7 years ago

Regarding semi-colons:

... the specification is clear about this. JavaScript’s syntactic grammar specifies that semi-colons are required. Omitting semi-colons results in invalid JavaScript code. That code won’t throw (thanks to ASI), but it’s still invalid.

shelby3 commented 7 years ago

Semicolons won't help you here:

return
   some long shit;

You have to know the rules, whether you use semicolons or not. That is why I am happy we are going to use a Python style indenting.
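
A small sketch of the case above, for readers who want to run it. The function names broken and fixed are made up for illustration. ASI terminates the return statement at the newline, so the trailing semicolon on the next line makes no difference.

```javascript
// ASI inserts a semicolon directly after `return`, so the value on the
// next line is never returned, with or without a trailing semicolon.
function broken() {
    return
        42;
}

// Opening the expression on the same line as `return` removes the hazard.
function fixed() {
    return (
        42
    )
}

console.log(broken()) // undefined
console.log(fixed())  // 42
```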

Semicolons are training wheels that don't protect against every failure.

keean commented 7 years ago

Also jshint wants you to put them in, and I am using jshint as part of the build process.

jshint catches the above error :-)

shelby3 commented 7 years ago

JSHint can be configured to allow ASI. And I think it will still warn you about ambiguous implicit cases, if I am not mistaken (it should).

keean commented 7 years ago

without semi-colons JSHint cannot recognise the above error because you might mean:

return;
some long stuff

or

return some long stuff;

shelby3 commented 7 years ago

Bottom line is, if you have something at the start of the line which could possibly be a line continuation, then check to make sure you have made it unambiguous.

That is the simple golden rule and it applies whether using semicolons or not. That is not complicated. One simple rule.
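
As an illustration of that rule, here is a sketch of the other well-known case: a line that begins with '[' is read as a continuation of the previous line. The function names are hypothetical.

```javascript
// The line starting with `[` continues the previous line, so this is
// parsed as `var copy = list[0]`.
function joined() {
    var list = [10, 20, 30]
    var copy = list
    [0]
    return copy
}

// A leading semicolon (or a trailing one on the previous line) makes the
// intent unambiguous: `copy` is the array, and `[0]` is a separate statement.
function separated() {
    var list = [10, 20, 30]
    var copy = list
    ;[0]
    return copy
}

console.log(joined())     // 10
console.log(separated())  // [ 10, 20, 30 ]
```

The same check applies to lines that begin with '(' and to a few operators; it is the start of the next line that matters, not whether the previous line looks finished.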

keean commented 7 years ago

JavaScript was never designed to be used without semi-colons... let's design our new language not to require them, but I don't see any point in fighting JavaScript... We will emit the semi-colons into JS :-)

shelby3 commented 7 years ago

@keean wrote:

without semi-colons JSHint cannot recognise the above error because you might mean:

It should be warning that the case is ambiguous. I can't be faulted for the JSHint programmers being derelict (if they are, did not confirm).

shelby3 commented 7 years ago

@keean wrote:

JavaScript was never designed to be used without semi-colons...

The intent isn't relevant. What is, is what is relevant. We need to know the rules whether we use semicolons or not. We are not stupid programmers who need to make ourselves feel we are more secure by not knowing the rules. I lost my training wheels 30 years ago.

Otherwise we need to find a linter that warns of all ambiguous cases with or without semicolons.

Bottom line is, if you have something at the start of the line which could possibly be a line continuation, then check to make sure you have made it unambiguous.

That is the simple golden rule and it applies whether using semicolons or not. That is not complicated. One simple rule.

If JSHint isn't doing that checking, then it is derelict. Need to find a better linter.

Wonder if Douglas Crockford ever considered that. Some influential people decide that semicolons everywhere is the prescribed way; then why the heck did JS offer ASI anyway?

Perhaps he could have realized that the only sure way, is to have a linter which properly warns of every ambiguous case, whether using semicolons or not. Instead perhaps these talking heads influenced the JSHint people to not add proper checking for the ASI case? Sigh.

keean commented 7 years ago

So here's what the guy that created JS thinks: https://brendaneich.com/2012/04/the-infernal-semicolon/

shelby3 commented 7 years ago

It doesn't matter. It is just logic.

There you go. Crockford doesn't agree to support ASI in his tool and thus promulgates that ASI is an error:

Some argue that JSMin has a bug. Doug Crockford does not want to change JSMin, and that’s his choice.

That's right:

And (my point here), neither is newline.

Know the rules. Newline is not a statement nor expression terminator in JavaScript. Simple as that. Resolve all ambiguous cases.

It is analogous superfluous redundancy: one wouldn't write ;;;;;;; at the end of every statement or expression to make sure they got it right. They also don't need to write ; to make sure they got it right, if they are using a linter which can warn them whether the preceding expression on the prior line could be joined to the next line and thus that a semicolon or other syntax needs to be inserted to resolve the ambiguity.

keean commented 7 years ago

The moral of this story: ASI is (formally speaking) a syntactic error correction procedure. If you start to code as if it were a universal significant-newline rule, you will get into trouble. A classic example from ECMA-262:

keean commented 7 years ago

So I don't write code with syntactic errors... I write Python without, I write C++ with... it doesn't bother me, I go with what the language standard says...

shelby3 commented 7 years ago

The moral of this story: ASI is (formally speaking) a syntactic error correction procedure. If you start to code as if it were a universal significant-newline rule, you will get into trouble. A classic example from ECMA-262:

Then why did he put it in JavaScript? Linters should do their job correctly.

There is absolutely no reason you ever need a ; after a block { }. The } terminates the statement or expression.

keean commented 7 years ago

I wish I had made newlines more significant in JS back in those ten days in May, 1995. Then instead of ASI, we would be cursing the need to use infix operators at the ends of continued lines, or perhaps \ or brute-force parentheses, to force continuation onto a successive line. But that ship sailed almost 17 years ago.

shelby3 commented 7 years ago

I wish I had made newlines more significant in JS back in those ten days in May, 1995. Then instead of ASI, we would be cursing the need to use infix operators at the ends of continued lines, or perhaps \ or brute-force parentheses, to force continuation onto a successive line. But that ship sailed almost 17 years ago.

There you go. JS can't require semicolons. So why do you? Probably because we can't use a proper (non-derelict) linter, probably because JSHint probably doesn't warn of all ambiguities with 'ASI' enabled (but I didn't confirm that).

We are moving to block indenting to avoid this entire mess.

keean commented 7 years ago

Okay, so conclusion, I will use '=>' for anonymous functions, but leave the ';' in for now...

Our language won't require semi-colons, just like Python does not...

shelby3 commented 7 years ago

I go with what the language standard says...

The language standard says ASI is a supported feature. You have to know the rules whether you use semicolons or not. I will not repeat this again.

Let's try to find a linter which isn't brain-dead.

keean commented 7 years ago

My two cents: be careful not to use ASI as if it gave JS significant newlines. And please don’t abuse && and || where the mighty if statement serves better.

The standard says it is a syntax error to omit the semi-colon.

shelby3 commented 7 years ago

The standard says it is a syntax error to omit the semi-colon.

Then why does it compile? Brendan told you that JS cannot require semicolons everywhere because it breaks other things.

keean commented 7 years ago

You are right. If you really can't work with the semi-colons, I will get rid of them for this project.

keean commented 7 years ago

Can I delete the semi-colon discussion, as it's cluttering the implementation thread... I am going to remove them.

I discovered 'this' does not work inside '=>' defined functions... that is a bit weird.

shelby3 commented 7 years ago

I don't understand why deleting my disagreement helps. You have your freedom to do it your way and I have my freedom to speak my logical and engineering disagreement.

ASI is a feature of JavaScript that can't be removed for the reason Brendan (its creator) explained. Ostensibly Douglas Crockford and other (corrupt?) people in the standards process get together and decide that they are too lazy to write tools that fully support the language (including apparently JSHint's derelict failure to offer even a flag to report ambiguous cases of ASI?), so they decide to decree that ASI is a syntax error when in fact it is required by the language and does in fact not generate a syntax error. A programmer could accidentally insert an ASI case, and the language will not generate a syntax error. It is entirely derelict and a bastardization of the language. Brendan even agrees with me if you read carefully what he is saying. He has no choice but to go along with the community of derelictness, because it is a political game. Look what happened to Brendan recently because of his personal politics. Corrupt world we live in and I will not be a supporting member of the corruption by following illogical decrees and bastardization of what is. Note I can't accuse any person or organization of corruption, because I don't know that to be the case. Can just be human nature and design-by-committee outcomes. ASI is part of JavaScript, regardless of some meaningless words inserted after the fact by some standards committee. And tools that don't support the language fully are derelict.

I am a rebel. And I will remain one. But of course, I will do what is reasonable after registering my disagreement.

Look how successful the drive to not support ASI has been. The world's most popular language apparently doesn't even have an open source linter that is compliant with the language? I am an engineer, not a politician.

I register my disagreement with kowtowing to derelict refusal to adhere to the features of the language. My point to them is remove ASI from the language if they don't want tools to fully support it, instead of using deception and politics to influence a lack of support from tools.

Proofs by appeal to authority are less convincing to me than proofs of engineering facts.

keean commented 7 years ago

Doesn't JSHint work? Seems to work for me with 'ASI' enabled.

shelby3 commented 7 years ago

@keean wrote:

Doesn't JSHint work? Seems to work for me with 'ASI' enabled.

I don't know. Is it warning of all ambiguous cases with 'ASI' enabled? I haven't checked. I assumed not, because you were so resistant.

I just feel we are expert programmers and if our linter informs us of all ambiguous cases, then I see no logical reasons to be forced to insert semicolons. Perhaps one could make an argument that without a linter, then semicolons are safer and in a large organization maybe they make that decision. We are not a large organization. By the time we get to having many people contributing, hopefully we will have ported to self-hosted to get entirely away from this ASI issue.

I sympathize with Brendan, he has to politically navigate a language design with some flaws.

Now if JSHint is not working correctly with 'ASI' enabled in terms of warning of all gotchas, one can make a case that we should just acquiesce. Seems many others have. But what bothers me is the antagonists (protagonists for deprecating ASI) don't argue it that way ("hey we made derelict tools, so you have no choice" would be more honest). They use an incorrect logic which fools the readers.

Sorry I didn't have the energy to explain that earlier. I needed to go cook and eat. So now I have the energy to explain my side. Maybe I will acquiesce if there is no good linter, but I will still register my frustration with bastardization by influential people. (Note I am not knowledgeable about how it played out over the years, I am just guessing.)

Of course I can work with semicolons if we must, I am just unhappy with decrees which are not correct from an engineering perspective.

shelby3 commented 7 years ago

@keean wrote:

So I don't write code with syntactic errors

ASI is not a syntax error in JavaScript, regardless of some decree. If it were a syntax error, it would refuse to compile and run. I know the distinction between 'error' and 'warning', regardless of some "deception" (for lack of a better word) in an official document.

SimonMeskens commented 7 years ago

I discovered 'this' does not work inside '=>' defined functions... that is a bit weird.

'=>' is not the same as 'function'. '=>' doesn't create a lexical scope, and 'this' works differently because of it. I could elaborate if you want, but googling fat arrow syntax for JavaScript should explain it.

shelby3 commented 7 years ago

I hope that is more fair. What I am saying is that if JSHint is not derelict, then we could (if we choose to) get rid of superfluous semicolons. But if we have no good linter that can check ASI ambiguities for us, then I guess we have to acquiesce and use superfluous semicolons per tools support. I am just a teeny bit miffed about appeals to authority, as the rationalization.

shelby3 commented 7 years ago

@SimonMeskens wrote:

I discovered 'this' does not work inside '=>' defined functions... that is a bit weird.

'=>' is not the same as 'function'. '=>' doesn't create a lexical scope, and 'this' works differently because of it. I could elaborate if you want, but googling fat arrow syntax for JavaScript should explain it.

Question is do we need also the distinction in our language?

I presume not having a lexical scope means all references constructed are accessible in the outer lexical scope. When do we need this? It seems to be the antithesis of what a function is supposed to be.

Thus maybe @keean should be using function and not fat arrow, if our language will always create a lexical scope for all functions.

Good point. Btw, I knew about the lack of this, but had not paid attention to the lack of a lexical scope.

Edit June 4, 2017: arrow functions have closures over lexical scope. The only difference is that 'this' is scoped lexically, which basically means they should not be used for methods that will be called with the '.' syntax.
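
A short sketch of the behaviour described in the edit. The names makeObject and makeCounter are made up for illustration. An arrow function takes 'this' from the enclosing scope, while a 'function' takes it from the call-site; both close over local variables in the usual way.

```javascript
function makeObject() {
    // Inside makeObject, `this` is whatever the caller supplied via call().
    return {
        name: 'inner',
        // `function` gets `this` from the call-site: the object before the dot.
        viaFunction: function () { return this.name },
        // `=>` has no `this` of its own: it sees makeObject's `this`.
        viaArrow: () => this.name
    }
}

var obj = makeObject.call({name: 'outer'})
console.log(obj.viaFunction()) // 'inner'
console.log(obj.viaArrow())    // 'outer'

// Arrow functions still close over local variables like any other function.
function makeCounter() {
    var count = 0
    return () => {
        count = count + 1
        return count
    }
}

var next = makeCounter()
next()
console.log(next()) // 2
```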

SimonMeskens commented 7 years ago

If we keep it, we shouldn't keep the concept of 'this' as it is in JavaScript today, that's for sure. JavaScript combines multiple concepts and gives them a confusing name ('this' in JavaScript is NOT self, like in about every other language featuring the 'this' keyword, but it sorta acts like self in common cases, which is highly confusing).

With type classes, you don't need 'this' at all I think. I vote we pull it out completely and prefer a system of type classes, with maybe an extension method (through partial application) operator. For example:

function x(a: A, b: B) {
  return a + b
}

All functions can be partially applied through some syntax (or maybe all functions get curried by default?). Extension method syntax would then simply be syntactic sugar for partial application.

Something like this:

x(a)(b)

is equivalent to:

a::x(b)

is equivalent to:

a::b::x()

Now, the first syntax is just regular old partial application through currying, the second syntax is just shorthand for partial application, but looks conveniently like classic OO and the third example uses that syntax in a more exotic fashion, that is mainly a curiosity, but might be useful since it allows you to write in some sort of postfix notation.

In reality, the third example would probably look more like this:

let x1 = a::x // Meaning, apply a to x as the first parameter

... some more code

let x2 = b::x1 // Apply b to x1 as the second parameter of x

... some more code

x2() // execute the pre-bound function

This system completely forgoes any need for function context or self and thus conveniently scraps the confusing 'this' from the language.
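
The proposal can be approximated in today's JavaScript. In this sketch the helper applyNext is hypothetical and merely stands in for the proposed '::' operator, which does not exist in JavaScript; it binds one value as the next positional argument.

```javascript
function x(a, b) {
    return a + b
}

// Hypothetical stand-in for `value::fn`: returns a function with `value`
// already supplied as the next positional argument of `fn`.
function applyNext(value, fn) {
    return function () {
        var rest = Array.prototype.slice.call(arguments)
        return fn.apply(null, [value].concat(rest))
    }
}

var a = 2
var b = 3

console.log(applyNext(a, x)(b))   // a::x(b), prints 5

var x1 = applyNext(a, x)          // let x1 = a::x
var x2 = applyNext(b, x1)         // let x2 = b::x1
console.log(x2())                 // execute the pre-bound function, prints 5
```

No 'this' appears anywhere: the receiver is an ordinary first parameter.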

shelby3 commented 7 years ago

@SimonMeskens wrote:

...we shouldn't keep the concept of 'this' as it is in JavaScript today, that's for sure.

We are not.

With type classes, you don't need 'this' at all I think

If we discard 'this', we can't call methods of the typeclass with familiar dot syntax. I proposed 'static' for methods which are supposed to be independent of any instance that selected the implementation.

prefer a system of type classes, with maybe an extension method (through partial application) operator

I don't comprehend how you think extension methods apply to typeclasses. In my mind, Extension methods are a different concept and not related.

shelby3 commented 7 years ago

@keean I don't think we need functions whose bodies are not lexically scoped. If you agree, ~I suggest we never emit ES6 fat arrow functions.~ See edit. Thus you should be using 'function' as you were, to support compatible semantics.

keean commented 7 years ago

@shelby3 I agree, we emit function.

I think we can use '.' notation for scoping and modules; we don't need it for type class functions and constructors.

When you have multiple dispatch on typeclasses, having one object in front of the function does not seem necessary.

I think I prefer:

let x3 = add(x2, x1)

To

let x3 = x2.add(x1)

Because we would select the overloaded type-class function to run based on the type of both arguments, not just the first.

And we would still have modules so you might have:

let x3 = math.add(x2, x1)

because add is defined in the math module. However, we should have flexible importing that lets you rename and bring a module's functions into scope without a prefix if wanted.
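
A rough runtime sketch of selecting the implementation from both argument types. The names addInstances and implementAdd are hypothetical, and the lookup uses 'typeof' at run time purely for illustration; in the language itself the selection would presumably be resolved by the type checker.

```javascript
// Hypothetical instance table keyed by the types of BOTH arguments.
var addInstances = {}

function implementAdd(typeA, typeB, fn) {
    addInstances[typeA + ',' + typeB] = fn
}

// Free function: no argument is privileged as the receiver.
function add(x, y) {
    var key = typeof x + ',' + typeof y
    var impl = addInstances[key]
    if (impl === undefined) {
        throw new TypeError('no Add instance for (' + key + ')')
    }
    return impl(x, y)
}

implementAdd('number', 'number', (x, y) => x + y)
implementAdd('string', 'string', (x, y) => x + y)

// A module is then just a namespace in front of the function.
var math = {add: add}

console.log(add(1, 2))           // 3
console.log(math.add('a', 'b'))  // ab
```

Calling add(1, 'b') throws here, since no instance is registered for that pair and nothing is converted implicitly.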

SimonMeskens commented 7 years ago

@shelby3 I showed you how partial application IS extension methods and how a prefix partial application syntax corresponds exactly to dot notation and how that allows us to completely remove this altogether, which part isn't clear, so I can elaborate? The only reason I picked the arbitrary :: as the partial application syntax, is because I don't think it's useful to overload dot notation even more, but it's completely equivalent.

shelby3 commented 7 years ago

@keean I don't think the most frequent case will be implementing typeclasses by pairs of data types. I think single data type implementations will be much more common, and Eq and Add will not work with pairs of types, but rather the same type for both arguments. We want to avoid implicit conversions, as that appears to be a design error in many languages such as C and JavaScript. Please correct me if you know or think otherwise.

As I had argued, it doesn't seem to make any sense to a programmer coming from OOP that they've specified an interface bound on their function argument and yet are not supposed to call the methods using dot syntax. I think it breaks the way mainstream (Java, C++, JavaScript, PHP, etc.) people naturally think about it. I explained that 'static' makes it explicit when 'this' doesn't apply. I much prefer explicitness in this case, as it is not any more verbose.

I prefer to keep our typeclass concept as familiar as possible to the way people already think, especially given there are no downsides to doing so.

shelby3 commented 7 years ago

@SimonMeskens wrote:

I showed you how partial application IS extension methods and how a prefix partial application syntax corresponds exactly to dot notation and how that allows us to completely remove this altogether, which part isn't clear, so I can elaborate?

Oh yeah Kotlin and I realized too 5 years ago that extension methods are equivalent to partial application over a function passing the this, so of course I agree.

But I don't see how that is applicable to typeclasses?

This system completely forgoes any need for function context or self and thus conveniently scraps the confusing 'this' from the language.

Afaik, there is no confusion in Java about 'this'. Afaics, JavaScript's problems with 'this' (being a prototype language, not strictly OOP) are not related to my proposal for using 'this' for typeclasses.

SimonMeskens commented 7 years ago

If we discard 'this', we can't call methods of the typeclass with familiar dot syntax.

I was replying to that comment. Basically, this is how 'this' (function context) works in JavaScript:

function x(this: A, b: B) {
  return this + b
}

a.x(b);

TypeScript lets you make 'this' explicit, but 'this' is just an extra parameter to the function containing the call-site variable.

What I'm saying is that this is equivalent to:

function x(a: A, b: B) {
  return a + b
}

a::x(b);

When we assume :: is partial application, right? So what if we make it so dot notation is partial application, like in C#? We replace :: with . and the code looks like this:

function x(a: A, b: B) {
  return a + b
}

a.x(b);

This is exactly the familiar dot notation. The only difference is that I say: if the function has a call-site, partially apply that call-site as the first parameter. What you say is: if the function has a call-site, store it in an implicit 'this' variable. I'm just saying that the second example is, in my opinion, more clear, because it avoids using the word 'this' for call-site. If you disagree with me, I highly recommend you change the name to 'context' instead, so there's no confusion with 'this' in Java or C#, where it means closure of self, a completely different concept that, confusingly, works similarly for common cases. This trips up lots of programmers today in JavaScript.
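
The equivalence can be checked in plain JavaScript. In this sketch the names addWithThis and addWithParam are made up, and small objects are used instead of the abstract types A and B. Function.prototype.call passes the call-site explicitly, which shows that 'this' behaves like a hidden first argument.

```javascript
// Receiver arrives implicitly as `this`.
function addWithThis(b) {
    return this.value + b.value
}

// Receiver arrives as an ordinary first parameter.
function addWithParam(a, b) {
    return a.value + b.value
}

var a = {value: 2, add: addWithThis}
var b = {value: 3}

console.log(a.add(b))                // dot notation supplies `this`, prints 5
console.log(addWithThis.call(a, b))  // the same call with `this` passed explicitly, prints 5
console.log(addWithParam(a, b))      // no `this` involved at all, prints 5
```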

SimonMeskens commented 7 years ago

Also, here's an example of the same code in C#:

static x(this A a, B b) {
  return a + b;
}

a.x(b);

Basically, they add the 'this' keyword in front of the first parameter to make it explicit to the programmer that the first parameter might be the call-site, and they only allow you to call functions marked like this with dot notation.

shelby3 commented 7 years ago

@shelby3 wrote:

@keean I will catch up with you later on the parser combinator implementation. I haven't employed them ever, so I will need to dedicate some time to that. My first priority is to write the grammar into an EBNF file and check that it is conflict-free, LL(k), and hopefully also context-free. I read that parser combinators can't check those attributes.

I am working on the LL(k) EBNF grammar now, and I think you will have difficulty getting this correct with parser combinators, or at least not without the guidance of a checked EBNF grammar. I already found conflicts (ambiguities in the grammar) that would not have occurred to me without the SLK check. For example, we can't put a function definition declaration inside an if or else expression. I will publish asap (hope today), so you can see what I am referring to and so you may offer your reaction and/or corrections.

shelby3 commented 7 years ago

refresh prior comment

keean commented 7 years ago

@shelby3 That's why I used the formal indentation parsing PEG from the paper I linked to. Ad-hoc approaches are likely to be error prone. With these combinators you can introduce a parser with an absolute indent, or a relative indent. A relative indent can be equal, greater-equal, or greater than the previous line's indent. This easily lets us define a correct parser for a statement block.

keean commented 7 years ago

@shelby3 So if we decide to keep dot notation, that stops us having type-class methods that do not take the 'base type' as an argument. For example, let's say we want a type-class to declare the + operator. I would do something like:

typeclass Add<A>:
    (+): (x : A, y : A) : A

implement Add<Int>:
    (+) : (x, y) => primitive_add_int(x, y)
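
One possible way such a declaration could look in emitted JavaScript is dictionary passing, where each instance becomes a plain object of functions. This is only an assumption for illustration: the thread does not settle how typeclasses are compiled, and the names AddInt and sum are hypothetical.

```javascript
// Hypothetical runtime form of `implement Add<Int>`: a dictionary of methods.
var AddInt = {
    // `(x + y) | 0` stands in for primitive_add_int (32-bit integer add).
    '+': function (x, y) { return (x + y) | 0 }
}

// A function generic in A with an Add<A> bound receives the dictionary as an
// extra argument, so no value needs to act as a receiver.
function sum(addDict, zero, xs) {
    var acc = zero
    for (var i = 0; i < xs.length; i = i + 1) {
        acc = addDict['+'](acc, xs[i])
    }
    return acc
}

console.log(sum(AddInt, 0, [1, 2, 3])) // 6
```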

keean commented 7 years ago

I now have a compiler front-end that links the parser and generator. Here's an example of it in action:

Input file "t1.zs"

let id = id(x) => x
id(42)

Run command:

node src/compiler.js t1.zs

Output file "t1.js"

id=function id(x){return x;};id(42);

It is still using the earlier function syntax though.

It also dumps the AST into a file "t1.ast" for debugging:

{
  "status": true,
  "value": {
    "blk": [
      {
        "ass": "id",
        "exp": {
          "fn": "id",
          "args": [
            "x"
          ],
          "body": {
            "rtn": {
              "var": "x"
            }
          }
        }
      },
      {
        "app": "id",
        "args": [
          {
            "lit": 42
          }
        ]
      },
      {
        "eof": ""
      }
    ]
  }
}
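
For readers following along, here is a minimal sketch of how a generator could walk this AST and produce the output shown above. It is not the project's actual generator: the function names genExp and genStmt are hypothetical, and only the node kinds that appear in this dump are handled.

```javascript
// Generate JavaScript for an expression node.
function genExp(node) {
    if ('fn' in node) {
        return 'function ' + node.fn + '(' + node.args.join(',') + '){' +
            genStmt(node.body) + '}'
    }
    if ('app' in node) {
        return node.app + '(' + node.args.map(genExp).join(',') + ')'
    }
    if ('var' in node) {
        return node['var']
    }
    if ('lit' in node) {
        return JSON.stringify(node.lit)
    }
    throw new Error('unknown expression node: ' + JSON.stringify(node))
}

// Generate JavaScript for a statement node; each statement ends with ';'.
function genStmt(node) {
    if ('blk' in node) {
        return node.blk.map(genStmt).join('')
    }
    if ('ass' in node) {
        return node.ass + '=' + genExp(node.exp) + ';'
    }
    if ('rtn' in node) {
        return 'return ' + genExp(node.rtn) + ';'
    }
    if ('eof' in node) {
        return ''
    }
    return genExp(node) + ';'
}

// The parse result from "t1.ast", abbreviated to the fields used here.
var parsed = {
    status: true,
    value: {blk: [
        {ass: 'id', exp: {fn: 'id', args: ['x'], body: {rtn: {'var': 'x'}}}},
        {app: 'id', args: [{lit: 42}]},
        {eof: ''}
    ]}
}

console.log(genStmt(parsed.value)) // id=function id(x){return x;};id(42);
```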