Parsing cast expressions assumes expression to be atomic leaf expression

mhasel commented 7 months ago

Describe the bug

Currently, expressions such as INT#(a + b) or INT#UINT#16#FFFF cannot be parsed, because our parser always calls parse_atomic_leaf_expression right after matching a TypeCastPrefix token. A possible fix would be to parse cast expressions as unary expressions.

To Reproduce

FUNCTION main : DINT
VAR
    x : INT := INT#UINT#16#FFFF;
    y : SINT := SINT#(x + 20);
END_VAR
END_FUNCTION
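
To illustrate the proposed fix, here is a minimal sketch of a toy recursive-descent parser in which a type-cast prefix is handled at the unary level rather than being followed by an atomic leaf. This is not the rusty parser: the `Token`, `Expr`, and `Parser` types and every method name below are hypothetical, and the lexer is assumed to have already turned `16#FFFF` into a plain number token.

```rust
// Hypothetical token set; `TypeCastPrefix("INT")` stands for the source text `INT#`.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    TypeCastPrefix(String),
    Number(i64),
    Ident(String),
    Plus,
    LParen,
    RParen,
}

// Hypothetical AST for the sketch.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Number(i64),
    Ident(String),
    Add(Box<Expr>, Box<Expr>),
    Cast { target: String, operand: Box<Expr> },
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    fn advance(&mut self) -> Option<Token> {
        let tok = self.tokens.get(self.pos).cloned();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    // expression := unary ('+' unary)*
    fn parse_expression(&mut self) -> Result<Expr, String> {
        let mut left = self.parse_unary()?;
        while self.peek() == Some(&Token::Plus) {
            self.advance();
            let right = self.parse_unary()?;
            left = Expr::Add(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    // unary := TypeCastPrefix unary | primary
    // Recursing into `parse_unary` lets prefixes chain (INT#UINT#...) and
    // lets the operand be a parenthesized expression.
    fn parse_unary(&mut self) -> Result<Expr, String> {
        if let Some(Token::TypeCastPrefix(target)) = self.peek().cloned() {
            self.advance();
            let operand = self.parse_unary()?;
            return Ok(Expr::Cast { target, operand: Box::new(operand) });
        }
        self.parse_primary()
    }

    // primary := Number | Ident | '(' expression ')'
    fn parse_primary(&mut self) -> Result<Expr, String> {
        match self.advance() {
            Some(Token::Number(n)) => Ok(Expr::Number(n)),
            Some(Token::Ident(name)) => Ok(Expr::Ident(name)),
            Some(Token::LParen) => {
                let inner = self.parse_expression()?;
                match self.advance() {
                    Some(Token::RParen) => Ok(inner),
                    other => Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => Err(format!("unexpected token {:?}", other)),
        }
    }
}
```

With this grammar, the token sequence for `INT#UINT#16#FFFF` parses into two nested `Cast` nodes, and `SINT#(x + 20)` parses into a `Cast` wrapping an `Add`. Whether the real parser should bind the prefix this tightly is a design decision the sketch does not settle.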

mhasel commented 7 months ago

I've just double-checked the standard and noticed that the INT#... syntax isn't really meant as a type cast but rather as a type prefix for literals (a typed literal). I guess I mixed this up because of how our lexer token is named and how this is represented in the AST. Supporting this therefore isn't required when strictly following the standard. I still see it as a nice-to-have extension to the language, however.

riederm commented 7 months ago

Shall we rename the token/AST representation?

mhasel commented 7 months ago

That might be sensible. I've also tried the following example:

{external}
FUNCTION printf : DINT
VAR_INPUT {ref}
  format : STRING;
END_VAR
VAR_INPUT
  args : ...;
END_VAR
END_FUNCTION

FUNCTION main : DINT
VAR 
    a : DINT;
    b : DINT := 65535;
END_VAR
    a := INT#b;

    IF a = -1 THEN
        printf('explicit downcast'); 
    END_IF;

    IF a = 65535 THEN
        printf('INT# on a reference does nothing');
    END_IF;
END_FUNCTION

The relevant lines in IR:

  %load_ = load i32, i32* %b, align 4
  store i32 %load_, i32* %a, align 4

No truncation happens here; it's a straight copy.
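
For comparison, the difference between the two outcomes tested in the example can be modelled in a few lines of Rust. This is only an illustration of the arithmetic, assuming DINT is a 32-bit signed integer and INT a 16-bit signed integer; the function names are made up for the sketch.

```rust
/// Explicit narrowing from a 32-bit DINT to a 16-bit INT: keeps the low 16 bits.
fn narrow_to_int(value: i32) -> i16 {
    value as i16
}

/// What the generated IR amounts to: the 32-bit value is copied unchanged.
fn straight_copy(value: i32) -> i32 {
    value
}
```

Narrowing 65535 keeps the bit pattern 0xFFFF, which is -1 as a signed 16-bit value and stays -1 when widened back to 32 bits. That is the result the 'explicit downcast' branch checks for, whereas the straight copy leaves 65535 in place.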

Here, the INT# prefix does nothing, so we should probably validate against this. Another option would be to lean into the alternative cast-syntax use and allow it, but then we'd actually have to make it behave the same as cast statements.
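
The first option, validating against a type prefix on anything that is not a literal, could look roughly like the sketch below. It is a standalone illustration under stated assumptions, not the project's validation code: the `Expr` enum, the `validate_type_prefix` function, and the diagnostic wording are all hypothetical, and the sketch deliberately flags every non-literal operand, including a nested prefix.

```rust
// Hypothetical, simplified AST for the sketch.
#[derive(Debug)]
enum Expr {
    Literal(i64),
    Reference(String),
    TypePrefixed { type_name: String, operand: Box<Expr> },
}

/// Returns one diagnostic message if a type prefix is applied to anything
/// other than a literal, and an empty list otherwise.
fn validate_type_prefix(expr: &Expr) -> Vec<String> {
    let mut diagnostics = Vec::new();
    if let Expr::TypePrefixed { type_name, operand } = expr {
        if !matches!(operand.as_ref(), Expr::Literal(_)) {
            diagnostics.push(format!(
                "`{}#` must be followed by a literal, found {:?}",
                type_name, operand
            ));
        }
    }
    diagnostics
}
```

Under these assumptions, `INT#b` from the example would produce a diagnostic, while `INT#65535` would pass.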

PLC-lang / rusty

Parsing cast expressions assumes expression to be atomic leaf expression #1151