curimit / SugarCpp

SugarCpp is a language which can compile to C++11.

Alternative function definition syntax (Haskellish, C++11ish) #37

Open ozra opened 9 years ago

ozra commented 9 years ago

I find the function declaration style with the return value last, after an arrow, very intuitive; it describes the flow clearly. C++11 adopted this style too, although the required auto keyword stinks. It also follows the identifier : type pattern more closely. If breaking the current syntax is unwanted, both could be supported, with the natural possibility of ambiguity problems, if not now then maybe in the future.

some-func(a :int, b :int) -> int = a + b

other-func(a :int, b :int) -> int
    if a is b
        return a
    else
        do-evil-non-functional-crash()

Lambdas are of course the same, but without a name. They should also be able to take the "capturing/closure syntax" as in C++11 (see the separate issue).

lamb := [=, &foo]() -> int
    groink = false
    foo = 47
    return 13
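
For comparison, here is a minimal sketch of how the two snippets above might look in hand-written C++11, where the trailing return type needs the leading auto. The underscore names and the declarations of groink and foo are assumptions made for illustration, since the SugarCpp snippets do not show them; this is not the output of the SugarCpp compiler.

```cpp
// C++11 trailing return type: the leading 'auto' placeholder is mandatory.
constexpr auto some_func(int a, int b) -> int { return a + b; }

// Lambda with an explicit capture list and a trailing return type.
// 'groink' and 'foo' are hypothetical locals declared here for the sketch.
inline auto lambda_demo() -> int {
    bool groink = true;
    int foo = 0;
    auto lamb = [=, &foo]() mutable -> int {
        groink = false;  // changes the lambda's own copy (hence 'mutable')
        foo = 47;        // changes the outer 'foo' through the reference
        return 13;
    };
    const int result = lamb();  // call first, so 'foo' is updated before it is read
    return result + foo;
}
```

The by-copy default capture ([=]) combined with &foo means only foo is shared with the enclosing scope; assigning to groink inside the lambda would not compile without mutable.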

dobkeratops commented 9 years ago

Similar to Rust... this is nice, and it makes the return value stand out more than in Scala. Rust uses a keyword 'fn' in front, which might seem extraneous, but it makes their source extremely easy to search with grep (fn and the name are always on the same line, whilst the function name and -> might not be), and it works well for local functions too.

In Rust, omitting the return value means no return value. In my own experimental language, I'm just inferring it if you don't write it.

ozra commented 9 years ago

Yeah, I really don't like the completely extraneous "fn". I've thought further on the syntax also;

While

identifier(some-arg : some-type...) -> return-type
    body

has a natural flow, another ordering could probably be more beneficial for clarity in the big picture:

identifier(some-arg : some-type...) return-type ->
    body

Why? Well, this turns the arrow into a sort of "block begins here", which simplifies one line lambdas/closures/functions/methods.

identifier(some-arg : some-type...) return-type -> body-on-same-row

vs

identifier(some-arg : some-type...) -> return-type => body-on-same-row

(LS notation, "=>" means "begin block", when expressions are put on the same line), or:

identifier(some-arg : some-type...) -> return-type = body-on-same-row

Drawing from current SugarCpp-notation, or:

identifier(some-arg : some-type...) -> return-type { body-on-same-row }

With voluntary block braces notation, which I really think is a good thing (TM)

I like the idea of making function writing as clean and simple as possible, promoting small functions in the code. The compilers inline very well anyway, so it's a win win - readability, understandability and performance. But then again, as above, maybe two braces won't kill anyone ;)

I like the idea that you don't make it non-returning in your lang. However, I think the return type is so important that it should always be typed out; the intent is clearer. If there is no return value, void should be used explicitly. That also promotes functional coding and value passing (which is also fucking fast, since the stack is almost always in cache and thereby often faster than even member variables, so it's not only a safer, clearer programming style, it's often faster too), still, of course, without forcing it. Arguments, on the other hand, are perfect candidates for automatic inferring, imo.

dobkeratops commented 9 years ago

I'll see where I go with that... For the minute I'm trying to appeal to anyone between the C++ and Rust communities, so sticking with their syntax is good, but there are a lot of nice ideas here and of course the C++ community is bigger. Go and Swift both have a preceding 'func' keyword. I'd probably err on the side of starting with a little redundancy and trimming it later. The preceding keyword seems to make the parser very simple, but I realise the "->" token must be able to mark it too. You can also set up really nice syntax highlighting easily in many editors, since a regex can easily find declarations.

I really like the single-expression functions and "where" sugar from Haskell, e.g. fn foo(x,y)=expr where {...vars..} ... I'd definitely like to absorb that.

Given the 'fn' keyword being there, I'd wondered about leveraging it a bit more, e.g. maybe distinguishing fn and proc for pure vs side effects? Safe vs unsafe? Rust writes unsafe fn ... extern fn. I could just skip the 'fn' when there's another qualifier, and assume it's a function.

in my 'class declarations' I already parse this

struct Foo {
    fn foo() { ... }       // member function
    virtual bar() { ... }  // virtual member function (no need to write the 'fn' keyword as well)
}                          // override would be another obvious candidate for a preceding qualifier

I also use fn foo(f:fn(int)->void) vs fn foo(f:(int)->void) to distinguish raw function pointers from closures (_func,_env) ... the former is still needed for interfacing with 'C' libraries..

> However I think return type is so important that it should always be typed out - the intent is clearer.

Mostly agree, and this is where Rust's 2-way inference is so good, since you leverage types written in the signature. (Perhaps in C++ you could get some of the same benefit by assuming that an auto x; with no inferring expression just takes the return type. I think D has typeof(return).)

The type inference will quickly warn you if a function has different exit points with return expressions that don't match.

I wanted a way of inferring returns, since they can get complex with iterators and maths (dimension checking? lazy expression templates? ...). In Rust 'templates' are always bounded, so they need to start writing complex type expressions which can become more complex than the actual code you're writing. This is a big reason I'm making this hybrid, wanting a little of both.

Something I also run into in Rust is refactoring: if you make full use of the inference in a function body and then pull a piece of code out, it's hard to figure out what the types actually were.

curimit commented 9 years ago

Since C++14 provides "function return type deduction" (plain auto, plus the related decltype(auto)), we can now omit the return type, so function declarations could be nicer.

// return type deduction
foo() = 1

// specify the return type
foo() : int = 1

// specify the return type in lambda
lamb := [=, &foo]() : int
    groink = false
    foo = 47
    return 13
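
As a rough illustration of the C++14 feature referred to above, the sketch below shows what the first two declarations could map to, and how plain auto differs from decltype(auto). The names foo_explicit and Box are assumptions for the example; this is not necessarily what SugarCpp emits.

```cpp
#include <type_traits>
#include <utility>

// C++14: the return type is deduced from the return statement (here: int).
constexpr auto foo() { return 1; }

// Explicit return type, which also works before C++14.
constexpr int foo_explicit() { return 1; }

// Mismatched return statements make deduction fail at compile time:
//   auto bad(bool b) { if (b) return 1; return 2.0; }  // error: int vs double

// Hypothetical type showing the difference between the two placeholders.
struct Box {
    int value;
    auto copy() { return (value); }           // plain auto deduces int
    decltype(auto) ref() { return (value); }  // decltype(auto) deduces int&
};

static_assert(std::is_same<decltype(foo()), int>::value, "auto deduces int");
static_assert(std::is_same<decltype(std::declval<Box&>().copy()), int>::value, "auto drops the reference");
static_assert(std::is_same<decltype(std::declval<Box&>().ref()), int&>::value, "decltype(auto) keeps the reference");
```

The commented-out bad function is the case mentioned earlier in the thread: exit points whose return expressions do not agree are rejected by the compiler.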

ozra commented 9 years ago

> Given the 'fn' keyword being there I'd wondered about leveraging it a bit more e.g. maybe distinguishing 'fn' and 'proc' for pure vs side effects? safe vs unsafe? Rust writes unsafe fn .... extern fn - I could just skip the 'fn' when there's another qualifier, and assume it's a function.

I'm glad you mention this, I've been thinking about the same thing. At first I thought of something like "pure (int, int) -> int {}", but then it occurred to me that one should really promote functional coding. Therefore it would be best if funcs were purely functional (all args const in that case, no globals allowed, etc.; no side effects!) and procedural functions required additional notation. Then I started spacing out on whether several levels of safety could be contracted (allow member mutation but no other external data, including param mutation; allow everything; allow only returning a value; etc.). Just throwing balls in the air here...

> Since C++14 provide "Function return type deduction" which is the syntax decltype(auto), we can omit the type of return value now, function declaration could be nicer.

I'm a bit wary of targeting standards post C++11, since support ain't that widespread yet in compilers (?).

Is it still possible to explicitly state it in SC++?

curimit commented 9 years ago

> I'm a bit wary of targeting standards post C++11, since support ain't that widespread yet in compilers (?).

Yes, see the second example. Since my boss would never allow me to use SugarCpp, it is possible to generate code which fits C++, C++11 or C++14 by using only a subset of the language features.

// specify the return type
foo() : int = 1

which compiles to

int foo() {
  return 1;
}

In this project I will not try to do type inference. Although we could use clang as our C++ parsing library, that would make the transpiler less lightweight and less user friendly (for example, having to set library paths?).

The preceding keyword, as in func foo() or let x = 1, won't make the parser that much easier in my opinion. I think the reason so many statically typed programming languages introduce that "useless" symbol before a function or variable declaration is that your editor will highlight the func or let keyword, which makes it very easy to locate where the variable is defined. Maybe I will also introduce this keyword (in pattern matching, to distinguish (var c, var d) = (x, y) from (x, y) = (y, x)).

Since LiveScript is a scripting language, it is reasonable there to use a variable without declaring it.

dobkeratops commented 9 years ago

Yes. After using Rust for a while I'm keen on the 'func'/'fn' or 'let' keywords: the benefit is that code is so easy to search. In Rust, grep "(fn|struct|enum|type|trait)" finds definitions... I've set up emacs to just grep like that when you hit 'F12'. For a new language without an IDE it's very useful, and once you have an IDE, the IDE can insert the "useless keyword" for you easily. A trailing "->" is a bit harder to search for because it might not be on the same line. Rust also allows local functions, which are really nice IMO; again, seeing 'fn' there makes the context jump out clearly.

The other reason is to make it much clearer that you're defining something new. Of course that's also possible with the more concise :=, which is also a good solution.
I think this difference is just down to the use cases of static vs dynamic typed languages: the former tends to be for more solid 'systems' whilst the latter is more 'adhoc' (of course it is a sliding scale, many shades of grey).

ozra commented 9 years ago

To me, the -> is the most natural and clear indicator of a function there is. Something "turns into" something else: (a, b) -> c. That's why I'd like to see all callable definitions have that signature. And from the "search in source" perspective, I can't see how searching for:

foo(a: int, b: int) -> int
    return a + b

would be any harder than

fn foo(a: int, b: int) : int
    return a + b

It will not be confused with pattern matching, tuple returning, or whatever feature.

Of course, I agree that the auto return type, body-on-same-line version would look a bit funny, again:

foo(a) -> = a + 1

But mainly, I think () -> meaning "some callable definition" is clear, concise and consistent.

For the search scenario: that it might not be on the same line is a moot point. With significant whitespace it has to be, and if it were not, you could just as easily put fn on its own line too. No difference, right? Besides, adding [\w\n]* to the regular expression is a piece of cake, depending on the multi-line limitations of the particular regex engine.

> Rust also allows local functions which are really nice IMO, again seeing 'fn' there makes the context jump out clearly

Local functions would rock; it's something I really miss in C/C++. It's a good feature (TM). Should they be enabled in the where clause in SugarCpp? However, here too, the magic arrow makes it very clear to me. And the closures would of course be defined within functions, not within the where block, thereby clearly separating closures from "pure" (well, non-closured) local functions.

And as to the "solid" vs "adhoc" argument. That was the most outrageous piece of bullshit I've ever heard, haha. Sorry. No foundation in theory or practice at all.

dobkeratops commented 9 years ago

> And out of the "search in source" argument perspective, I can't see how seeking for:

It's because the -> might be on another line; multi-line function declarations are very common.

I realise Haskell has a way of splitting the parameter types and parameter names across 2 lines; that's an interesting solution to this problem.

I've gone with Rust syntax initially because (i) it's so easy to parse, and (ii) building on Rust syntax gives me a shot at interoperating with another community. Of all the potential 'C++ replacements' I think Rust has a good shot at becoming popular. It's the first language other than C++ I've actually wanted to use. Right now I'm trying to avoid spending time on just changing syntax, so I have copied the most suitable existing one and made simple additions. Rust syntax has an easy way of disambiguating template params: after it's seen the 'fn' it knows for sure the next <...> is type-params. If you just wrote functionname<T,Y>(..)->... it has to parse ahead more before it can disambiguate the '<'. This IS possible, but it's more complex to write. I've had to cut corners to get anywhere and pick the simplest option: if (src.eat_tok()==FN) return parse_fn(src) /* switches into function-declaration parsing context */ ... done.
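
The difference between the two approaches can be sketched roughly as follows. Everything here (the Tok and Decl enums, classify_by_keyword, tokens_until_arrow) is hypothetical and much simpler than a real parser; it only contrasts deciding on one token with scanning ahead for the arrow.

```cpp
#include <cstddef>

// Hypothetical token and declaration kinds, for illustration only.
enum class Tok { Fn, Struct, Ident, Less, Greater, Comma, LParen, RParen, Arrow, End };
enum class Decl { Function, Struct, Unknown };

// Keyword-first syntax: the first token alone selects the parsing context.
constexpr Decl classify_by_keyword(Tok first) {
    switch (first) {
        case Tok::Fn:     return Decl::Function;
        case Tok::Struct: return Decl::Struct;
        default:          return Decl::Unknown;
    }
}

// Arrow-marked syntax: scan ahead for '->' and report how many tokens had to
// be inspected before the parser could commit (0 means no arrow was found).
template <std::size_t N>
constexpr std::size_t tokens_until_arrow(const Tok (&toks)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (toks[i] == Tok::Arrow) return i + 1;
        if (toks[i] == Tok::End) return 0;
    }
    return 0;
}

// Token stream for: functionname<T,Y>(..)->...
constexpr Tok arrow_decl[] = {Tok::Ident, Tok::Less, Tok::Ident, Tok::Comma, Tok::Ident,
                              Tok::Greater, Tok::LParen, Tok::RParen, Tok::Arrow, Tok::End};
```

In this toy stream the arrow is the ninth token, so the '<' stays ambiguous for that long, whereas the keyword form commits after a single token.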

I can however see that Jonathan Blow's language is going in the ()-> direction; as that project gains momentum perhaps I'll switch to its syntax. Rust is still the bigger community.

But he doesn't have templates yet.

Ideally I would have 2 (or more...) syntax front-ends in the same compiler. Right now I just have to prioritise my own time. I have to fix all the bugs and corner cases and clean up what I've done if it's ever going to be useful to anyone (or even useful to me, heh).

It should be easy for any potential collaborator to write an alternate parser because it's one discrete stage, but if the rest of my code is a buggy mess they'll never want to do that. It should also be possible to retrofit 'significant whitespace' in the lexer alone; the parser wouldn't even need to know about it.

In theory everything I want could be a feature addition to Rust; it's just that at present that community hates overloading, which is a great shame. But as it grows, maybe they will cave in to the preferences of more users? Their objections are down to things which could be controlled in other ways.

So 'fn' isn't permanent for me; it's just a practical choice at the minute which keeps some interesting options open. I've got pattern matching now, so I'm a bit closer to maybe even getting Rust actually parsed. Another avenue here would be to get collaborators on making a full Rust implementation that does actually have my preferred additions. I don't think I can do that single-handedly, but I can prove the concept.

-> is certainly a consistent way of saying 'it's a function', and I do like it. Rust, however, has a consistent form for declaring anything: a leading keyword. The same goes for struct, trait, type, static, fn, enum, mod. This makes the whole parser easy to write. Then there's syntax highlighting: many editors just have a regex, and it's easy there to make all definitions bold.

> And as to the "solid" vs "adhoc" argument. That was the most outrageous piece of bullshit I've ever heard, haha. Sorry. No foundation in theory or practice at all.

Maybe the context of this comment or my wording isn't clear.

I observe that dynamic typing is popular in some circles; there must be reasons people choose it in some situations. Python is a very popular language.

Whereas for larger programs people prefer static typing: it's slightly more effort up front but catches so many errors, and makes refactoring much easier. C++, Java, ...

So, what 'in theory' or 'in practice' is the difference?

how would you express that?

I chose the words 'solid' vs 'adhoc' to describe the contrast. solid as in a large system with many pieces that need to fit together well, and many problems when pieces change so the extra compile time check of types becomes more helpful. 'adhoc' as in thrown together quickly, less time spent naming things..

I'm after a bit of both. I really like the fact that C++14 has added the ability to infer the return type. For code returning lazy expression templates, the return type signatures can be complex. I prefer choice to dogma. It's usually better to know the return type. Overall I need 'solid', but I'd like the 'adhoc' style when experimenting and writing tests. I like being able to get things working before I spend time planning and naming. Also, mathematical formulas just look natural without types: fn lerp(a,b,f)=(b-a)*f+a. The Rust guys insist that needs all sorts of trait bounds to make sense, heh. I know I'm not going to linearly interpolate a string.
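
In C++14 terms, the untyped lerp above could be approximated with an unconstrained template and a deduced return type. This is only a sketch; the name generic_lerp is an assumption, chosen to avoid clashing with std::lerp from C++20.

```cpp
// C++14: parameter types are template parameters and the return type is
// deduced, so the definition stays close to the formula (b-a)*f+a.
template <typename A, typename B, typename F>
constexpr auto generic_lerp(A a, B b, F f) {
    return (b - a) * f + a;
}

// No bounds are declared, so misuse is only caught at the call site:
//   generic_lerp(std::string("a"), std::string("b"), 0.5);  // error: no operator-
```

For example, generic_lerp(0.0, 10.0, 0.5) evaluates to 5.0, and mixing int endpoints with a double factor, as in generic_lerp(2, 6, 0.25), yields the double 3.0.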

Inferred return types look like they might be interesting for constructors with 'where sugar', because the type is still clear from the first line, e.g.:

fn foo() =  StructName { 
    ...fields with assignment expressions
}
where {
    ... some more locals that give context to the struct-field assignments... 
}

Best of both worlds: the return type is clear from the first line, and I haven't had to repeat it, and there are fewer nesting levels.
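
A loose C++14 analogue, as a sketch only: the return type is deduced from the returned struct expression, so the struct name is written once. Rect, make_rect and the values are hypothetical, and unlike the proposed 'where' block, the helper locals have to come before the return.

```cpp
// Hypothetical struct standing in for StructName.
struct Rect {
    int width;
    int height;
};

// C++14: the return type is deduced as Rect from the return statement.
// The plain locals play the role of the "where" block.
constexpr auto make_rect(int scale) {
    const int base_width = 4;   // illustrative values
    const int base_height = 3;
    return Rect{base_width * scale, base_height * scale};
}
```

Here the type name appears in the return statement rather than on the first line, which is one place where the proposed syntax reads more clearly than the C++ version.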

ozra commented 9 years ago

> I realise Haskell has a way of splitting the parameter types and parameter names across 2 lines; that's an interesting solution to this problem.

Yeah, basically, you define the type signature, then you define patterns and their corresponding expressions.

> I've gone with Rust syntax initially because (i) it's so easy to parse, and (ii) building on Rust syntax gives me a shot at interoperating with another community. Of all the potential 'C++ replacements' I think Rust has a good shot at becoming popular. It's the first language other than C++ I've actually wanted to use.

Deffo. Most people don't want to shake their foundations, so similarity is important, and so of course is a following. There are lots of things I like about Rust. I guess the reason I'm stuck in "compile to and with C++" mode is that the projects I will be working on for the foreseeable future work out best with the plethora of code and libs already available (I need inlining of most things, because the code needs to milk every cycle out of the machine park), and once again, because where there's C++ there are other developers. A bit the same as the Rust road, but, I know, a bit stone-age legacy unfortunately.

> I can however see Jonathan Blow's language is going in the ()-> direction, as that project gains momentum perhaps I'll switch to its syntax. Rust is still the bigger community.

I've seen you mention this guy a few times, which made me curious, so I finally goog'd it. Unfortunately I couldn't find anything except YouTube videos, which is not the way I approach information. The site (lerp.org) seemed dead. Do you have any links to the source / download / docs / whatever?

> After it's seen the 'fn' it knows for sure the next <...> is type-params. if you just wrote

Yeah, I know what you mean, it's easy to have lots of ideas while not actually coding ;) I've made my corner cuttings in my days.

> so 'fn' isn't permanent for me, it's just a practical choice at the minute which keeps some interesting

I can see the path you're taking more clearly now. And it is nice to see that you also see these lexical parts as more of the surface/superficial parts of the language. There's nothing stopping you from, as you say, making several lexers, à la Haskell's 'layout'. I think it is good to allow as much variation to developers as possible while keeping interoperability at 100%. It's my opinion that such superficials as how to indent, or braces vs significant whitespace, should be a team / developer decision for a project and not something artificially locked in when it doesn't have to be. I definitely see that as the future of coding, as long as the fundamentals of the language are the same and it's just the frills that differ. Not everyone wears the same kind of clothes, but they do serve the same purpose.

> maybe the context of this comment or my wording isn't clear;

Wow, yeah, sorry, I totally misunderstood what was being talked about. Agree 100% on dynamic vs static, adhoc vs solid. I've coded LS (JS) professionally for several years, before now being back in C++ projects. And while I enjoyed it to some extent (you can do some fun stuff in LS, and it was super terse for asynchronous programming with its backcalls [there weren't more than a few lines of code without an async op, so basically the stack was upside down, always a callback chain to debug; headache, ay ay]), I always missed types. I mean, dynamic is fine for proofs of concept and web widgets. But boy did I miss types when coding the core server parts over io.js (node.js). It was a very advanced system and should never have been developed in JS, imo. But you know the business world: buzz drives the train. As a parenthesis, I view JS much as the 'byte code' of the internet. I coded in straight JS for only weeks. Now, a LiveScript-TypeScript crossover would be the shit for that net coding. Well, EOOT (End Of Off Topic) ;-)

Well, so in a new language I would of course really like the ability to have almost everything inferred, for whipping things together, but then it's always good to tighten things up. I think Rust's way of having an unsafe keyword (if I remember correctly) for, e.g., shared memory handling is a really good concept which should be expanded upon. Like strict in JS, etc. That way one can put differently high demands on functions / blocks / units / whatever. For instance, what we talked about earlier: pure functions vs side-effecting functions / procedures.

> Also mathematical formulas just look natural without types. fn lerp(a,b,f)=(b-a)*f+a.

Yeah, deffo! I have lots of code now where some parts are hardcore RAM-fiddling to keep CPU caches happy with huge amounts of sequential data, and then there are the actual analysis parts, which are coded by more people and are almost purely mathematical in nature. It really sucks to bog those down with type crust, while in the bit-fiddling parts, which are somewhat "black-boxed" to the formula devs, it is a must. And that's why I'm so in love with the 'almost math notation' vibe of lerp(a,b,f) -> (b-a)*f+a :-) I'm keeping the lid on that now, ok, haha.

Btw, great efforts - keep it up!

ozra commented 9 years ago

@curimit - sorry if there is a bit much of "general language discussion" here that is off topic from SugarCpp. It just kicks off the inspiration you know :)

dobkeratops commented 9 years ago

> [re: Jonathan Blow] Do you have any links to the source / download / docs / whatever?

He hasn't published any code yet; he's keeping his project personally curated before he opens it up. The information is in his videos, and I think that contributes to his chance of building a community. He articulates his reasoning well. His project was the inspiration for me to get started. I do recommend watching his videos. He's very keen on 'compile time function evaluation' and refactorability. He's expressed concern about the random nature of just opening something up on GitHub, which is a shame, but I think he'll get a following when he does release it.