KSP-KOS / KOS

Fully programmable autopilot mod for KSP. Originally By Nivekk
Other
697 stars 230 forks source link

Unclear Grammar for If Else #2517

Closed jonnyboyC closed 5 years ago

jonnyboyC commented 5 years ago

I was hoping for some clarification on what exactly is and is not allowed for the if else control flow. I'm attempting to match these rules as closely as possible my language server project and would like to avoid confusing any user by having it claim something is or is not allowed when it really isn't or is.

The documentation appears to indicate that these variants are allowed

  // normal
  if example {
      print("if" ). 
  } else {
      print("else").
  } 

  // trailing else period
  if example {
      print("if" ). 
  } else {
      print("else").
  }. 

  // trailing if period with no else branch
  if example {
      print("if").
  }.

Some testing shows these are also allowed.

  // instruction terminated by period
  if example print("if").
  else print("else").

  // mixed block and single instructions
  if example print("if").
  else {
      print("else").
  }

  if example {
       print("if").
  } else print("else").

Should there be more documentation for these other varieties or are not supported? When documentation pr are made what sort of testing needs to be done? just build the sphinx documentations and make sure it looks right. I was thinking I could help out with these small documentation things as I find them.

RCrockford commented 5 years ago

As I understand it you get a single statement after if or else. A single statement is either some line of code followed by a period, or a brace expression block. You should be able to mix and match them as you like (I think you have all the options there). Following a brace block with a period just inserts an unrelated empty statement after the block.

jonnyboyC commented 5 years ago

Another small oddity from testings is the following is allowed and will print else

  if false ..
  else print("else")..

I may suggest we update some of these control flow examples in the documentation to include non block instructions to indicate it's possible

Dunbaratu commented 5 years ago

Well that last one might be some weird bug. I'd like to see what the parser thinks that parse tree looks like, as that pair of periods definitely should be separating the else from the if and make an error.

But for the others, it's just a simple generic rule that has nothing to do with "if", and everything to do with this:

The actual definition of if is that it expects a single item as the body, like so (this is from the file src/kOS.Safe/Compilation/KS/kRisc.tpg):

if_stmt        -> IF expr instruction EOI? (ELSE instruction EOI?)?;

It's just that instruction has, as one of its allowed versions, the rule instruction_block, like so:

instruction_block -> CURLYOPEN instruction* CURLYCLOSE;
instruction -> empty_stmt |
               set_stmt | 
               if_stmt |
               until_stmt |

(snip) ... etc ... for every statement in the language - print, stage, run, etc .... (then at the bottom it does this:)

               instruction_block |
               identifier_led_stmt | // any statement that starts with an identifier.
               directive;

An instruction block (group of instructions inside a pair of braces) is itself also an instruction and can be placed anywhere a single instruction is expected.

So this isn't some weird unique thing about if and else. it's also how loops like until are defined:

until_stmt     -> UNTIL expr instruction EOI?;

It's defined as if the body is just one instruction, but "one instruction" can end up meaning "instruction block", which is a set of braces with stuff inside.

Or the ON statement - same thing:

on_stmt        -> ON varidentifier instruction EOI?;

So any clarification to that part of the documentation doesn't really belong in just the if/else part. Using brace sections as "a statement" is a common feature of all the grammar rules that have a bit that says "put a statement here".

The worry about newbie mistakes with brace-less bodies:

Anything describing this should discourage newbies from using the single statement style, though, because it's easy to not realize that the following example isn't really an if with a body of two statements. Rather, it's an if with one statement, followed by an unconditional statement:

// This example kerboscript code:

if x = 10
    print "Hello".
    print "World".

// Is actually doing this:
if x = 10
    print "Hello".
print "World". // This is actually outside the IF entirely,
               // executed unconditionally after the IF is done.
RCrockford commented 5 years ago

I would hazard a guess that the double period would be parsed as IF empty_stmt EOI. The grammar appears to allow it because if_stmt has the optional EOI after the controlled statement, yet all statements also include the EOI.

Dunbaratu commented 5 years ago

Sorry about re-editing the above so many times, probably while you were trying to read it. The github markdown wasn't obeying the three-backticks properly in what I typed until I finally figured out the secret unwritten rule that was messing up the rendering on what I typed. (Long story what that rule is, not worth going into.) It took a lot of re-tries to figure out why it was rendering what I wrote wrong.

jonnyboyC commented 5 years ago

@RCrockford that was my interpretation looking at the grammar file. It's what the parser I wrote evaluates the statements as. I mostly hoping for a confirmation since my last example was somewhat surprising that it was allowed.

@Dunbaratu I can understand not wanting to highlight the non-block statement version. for the exact reason you laid out. Would it make sense to include something in the language sections, or would you err on just avoiding having the documentation mention it? On somewhat of a tangent is the preferred term statement or instruction for non-expressions, just trying to keep the language as in consistent as possible for my project.

Dunbaratu commented 5 years ago

@jonnyboyC I would prefer the terminology for the grammar rules to be "statement" rather than "instruction", because "instruction" creates an ambiguity of meaning. (The kRISC opcodes that the compiler translates the script into are also called "instructions", as in config:IPU. A single statemement will typically get compiled into several instructions.)

The kRISC.tpg grammar file calls them "instruction" and it may be a good idea to do a renaming refactor that goes through and changes that to "statement". Doing so would also mean having to extend that renaming into the Compiler.cs code as well, as it uses identifier names that TinyPG built from the grammar rules.

jonnyboyC commented 5 years ago

Thanks for the clarifications, I'll go ahead and close this issue.