qdbplang / qdbp

Period and readability #2

Open spoerri opened 1 year ago

spoerri commented 1 year ago

With the period syntax, it's a little hard to see what's getting executed...

Is it correct that, when I see a ".", I should scan back in the code to find the closest preceding capitalized token or "!" or operator, skipping over any intervening nesting blocks of [] or () or {}, and that's what the "." is actually causing to run?

x > y.                         ; . runs code of > of x
x > (lots and lots of code).   ; . runs code of > of x
x > f!.                        ; . runs code of ! of f

Is there a way to get something like the clarity of mainstream programming languages on this?

There being a second syntax to execute code (parentheses) is also a little confusing, and it doesn't feel as if one is just sugar for the other. Would you consider going entirely with the parentheses syntax, and jettisoning the period syntax? It's very lispy, and imho qdbp could do worse than to continue the lisp tradition. Presumably qdbp's support for variables would save it from parentheses hell. Perhaps the parentheses syntax could be made slightly more complicated/powerful to support things like (a + b + c + d)

dghosef commented 1 year ago

There being a second syntax to execute code (parentheses) is also a little confusing, and it doesn't feel as if one is just sugar for the other.

Having parentheses around method invocation automatically inserts a period so that (object Method.) can be written as (object Method). There's nothing really that special about parentheses except for the fact that it allows us to omit the period without introducing ambiguity.

Would you consider going entirely with the parentheses syntax, and jettisoning the period syntax?

I have considered it. The way I see it, there are three options:

  1. Parens indicate method call (either Lisp style or C++ style)
  2. Period or other punctuation indicates method call
  3. Juxtaposition indicates method call (Haskell/OCaml style)

I am not a fan of option number 3 because then we also need to introduce more precedence and associativity rules.

Options 1 and 2 have roughly opposite pros and cons. Parens clutter up code, especially since almost everything (if/loops/switch/etc.) is done through a method call, but they are excellent at making clear exactly how the grouping works. Periods are the opposite - they add almost no clutter but they require a little bit of mental gymnastics to figure out what is happening. From my experience, mixing the two creates the most readable code - you can be concise when grouping is obvious and verbose when necessary. This does give qdbp developers the ability to write really bad code with either too many parens or too many periods. Maybe some sort of linter is the solution?

Is it correct that, when I see a ".", I should scan back in the code to find the closest preceding capitalized token or "!" or operator, skipping over any intervening nesting blocks of [] or () or {}, and that's what the "." is actually causing to run?

That is more or less correct with the small caveat that you should also skip over intervening capitalized token/symbol and period pairs. For example,

object1 Method arg1: object2 Method..

gets grouped like:

(object1 Method arg1: (object2 Method.).)

The example you provided was correct.
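
The scan-back rule described above can be sketched in Python as a stack match: each period pairs with the nearest preceding invocable token in the same nesting block that has not already been claimed by an earlier period. This only illustrates the reading rule and is not qdbp's actual parser. The names `is_invocable` and `match_periods`, the operator character set, and the whitespace-separated token list are all assumptions made for the example, and malformed input (a stray period or an unbalanced bracket) is not handled.

```python
# Hypothetical sketch of the "scan back from the period" reading rule.
# Tokens are assumed to be pre-split; this is not the real qdbp lexer.
OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}
OPERATOR_CHARS = set("+-*/<>=&")  # assumed operator alphabet


def is_invocable(token):
    """Capitalized method names, "!", and operators can be run by a period."""
    if token == "!":
        return True
    if token[0].isupper():
        return True
    return all(ch in OPERATOR_CHARS for ch in token)


def match_periods(tokens):
    """Map the index of each "." to the index of the token it runs."""
    frames = [[]]  # one stack of unmatched invocable tokens per nesting level
    matches = {}
    for i, tok in enumerate(tokens):
        if tok in OPENERS:
            frames.append([])  # a nested block gets its own stack
        elif tok in CLOSERS:
            frames.pop()  # leave the block; its contents are now skipped
        elif tok == ".":
            matches[i] = frames[-1].pop()  # nearest unmatched token wins
        elif is_invocable(tok):
            frames[-1].append(i)
    return matches


tokens = "object1 Method arg1: object2 Method . .".split()
print(match_periods(tokens))  # {5: 4, 6: 1}
```

The first period pairs with the inner `Method` (index 4) and the second with the outer one (index 1), which is the grouping `(object1 Method arg1: (object2 Method.).)` shown above.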

Presumably qdbp's support for variables would save it from parentheses hell.

I'm not sure it would. Lisps have variables and they still have parentheses hell.

spoerri commented 1 year ago

Some C-style-syntax languages also don't like complicated precedence, and force use of parentheses. E.g. a + b * c won't compile (though of course a * b + c also won't). Could you do something similar by default, and also somehow allow the method definition to customize precedence for the sake of extensibility? For example, if somebody is implementing their own control flow design using prototype methods, they'll know how its precedence should work, and not want to rely on users to place periods properly.

Lisps have variables

Yes, I guess I was hoping that more idiomatic variables would make a difference, but it's just hope. :)

dghosef commented 1 year ago

Sorry for the late reply.

I think my problem with that is the complexity it introduces. From the implementation side, the language will no longer be context free. And I think when people start using libraries that have a bunch of custom precedences, they will have a bunch to learn. What if two libraries define different precedences for the same operator?

The idea of having operators but forcing users to put parens around ambiguous expressions is probably how I'd do it if I were to add them. But I feel like it would just lead to the parentheses problem.
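
A check of that kind could be sketched in Python as follows. It assumes a flat expression already split into alternating operands and operators, and it treats a chain of one repeated operator as unambiguous, which is an assumption of this example rather than something qdbp defines. The function name `needs_parens` is hypothetical.

```python
def needs_parens(tokens):
    """Return True if a flat expression mixes different operators.

    `tokens` alternates operand, operator, operand, ... with no
    parentheses, e.g. ["a", "+", "b", "*", "c"]. Assumption: repeating
    a single operator (a + b + c) is allowed without parentheses.
    """
    operators = tokens[1::2]  # every second token is an operator
    return len(set(operators)) > 1


print(needs_parens("a + b * c".split()))  # True
print(needs_parens("a + b + c".split()))  # False
```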

spoerri commented 1 year ago

Not at all. OK, brainstorming: how about by default, Methods are always invoked, without parens and without period.

Then parens could just be used for grouping when right associativity isn't enough.

If I've got it right, the code in the Tutorial/Overview would be fine without any periods except for three cases:

four := (prototype Method3) Method2 arg1: 4

safe_divide := {a b|
  (b = 0)
    True? [#Error{}]
    False? [#Ok a / b]
}

intlist := (empty_list Append 3) Append 2

You still have to scan back, but at least there's only one syntax to control evaluation order instead of two. And of course, it doesn't do anything to help people who are surprised by the evaluation order of a * b + c.

dghosef commented 1 year ago

That actually does look a lot cleaner.

I am playing with the parser right now, and I can't seem to find a way to make a grammar like that unambiguous. But let me play with it a bit more and get back to you.

You are right though, the arithmetic case can now become confusing. To be honest, I'm not sure there is a good solution to this that checks every box (if there were, it probably would have been found by now).

spoerri commented 1 year ago

Maybe worth a separate GitHub issue or two, but I can think of some things that would help if you're committed to supporting operator overloading, and willing to stomach a little extra complexity...

  1. As mentioned above, simple lack of precedence would be surprising for arithmetic. Suggestion: don't allow unparenthesized operators, if not via the grammar, then as a separate check. Could make an exception for multiple instances of the same operator, e.g. a + b + c shouldn't have to be parenthesized...
  2. If you do make an exception for multiple instances of the same operator, you'll notice that division and subtraction really want to be left associative. E.g. it would be very surprising for 10 - 6 - 3 to be equal to 7. It would kind of suck to special case them... Suggestion: make all operators left associative
  3. Even with that exception, there'd still be a lot of parentheses required... Suggestion: Support just three different precedences, for categories of operators that would not otherwise reasonably type-check. From low to high:

    • logical operators - lower precedence than the others - their outputs and inputs are booleans
    • comparison operators - next precedence - their outputs are booleans, not their inputs
    • everything else - highest precedence - neither their outputs nor their inputs are booleans

    E.g. a && b + c < d only makes sense as a && ((b + c) < d), at least without coercions or "truthiness".

(All the above is for methody operators. () [] {} : @ # ? would all be unaffected.)

Though the above logic obviously doesn't match the high standard of simplicity that you have set for qdbp so far, imho it'd be a small price to pay for syntax that so closely matches people's intuition for normal notation. Most people wouldn't even need to read the documentation. 😄
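
The three-tier, left-associative scheme suggested in this comment can be sketched in Python with a small precedence-climbing routine. This is an illustration under assumptions, not a proposed qdbp implementation: the operator sets `LOGICAL` and `COMPARISON`, the function names, and the flat whitespace-separated input are all made up for the example. It does not handle parentheses in the input and does not implement the check from suggestion 1, so different operators within the same tier are simply grouped left to right.

```python
# Assumed operator categories, lowest precedence first.
LOGICAL = {"&&", "||"}
COMPARISON = {"<", ">", "<=", ">=", "=", "!="}


def tier(op):
    """0 = logical, 1 = comparison, 2 = everything else."""
    if op in LOGICAL:
        return 0
    if op in COMPARISON:
        return 1
    return 2


def parenthesize(tokens):
    """Fully parenthesize a flat operand/operator/operand/... token list."""
    pos = 0

    def parse(min_tier):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        # Consume operators at or above the current tier, left to right.
        while pos < len(tokens) and tier(tokens[pos]) >= min_tier:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher tier on the right-hand side
            # is what makes every operator left associative.
            right = parse(tier(op) + 1)
            left = f"({left} {op} {right})"
        return left

    return parse(0)


print(parenthesize("a && b + c < d".split()))  # (a && ((b + c) < d))
print(parenthesize("10 - 6 - 3".split()))      # ((10 - 6) - 3)
```

The first result matches the grouping a && ((b + c) < d) given above, and the second shows 10 - 6 - 3 grouping as (10 - 6) - 3, which evaluates to 1 rather than 7.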