PLC-lang / rusty

Structured Text Parser and LLVM Frontend
GNU Lesser General Public License v3.0
181 stars 47 forks source link

Missing size-mismatch validation for strings passed by VAR_INPUT #1210

Open volsa opened 3 weeks ago

volsa commented 3 weeks ago

Is your feature request related to a problem? Please describe. We're currently missing a validation for e.g. a STRING[10] being passed into a VAR_INPUT block with a variable of type STRING[5], essentially truncating the bigger string. While this is safe (at least in this branch https://github.com/PLC-lang/rusty/pull/1196) a warning should probably be emitted. For example:

FUNCTION main
    VAR
        x_str : STRING[10];
        x_arr : ARRAY[1..10] OF DINT;
    END_VAR

    foo(x_str); // This will not return any errors
    bar(x_arr); // This will return an `Invalid assignment: cannot assign 'ARRAY[1..10] OF DINT' to 'ARRAY[1..5] OF DINT'` error
END_FUNCTION

FUNCTION foo
    VAR_INPUT
        y_str : STRING[5];
    END_VAR
END_FUNCTION

FUNCTION bar
    VAR_INPUT
        y_arr : ARRAY[1..5] OF DINT;
    END_VAR
END_FUNCTION
mhasel commented 3 weeks ago

While I agree with this to some extend, this behaviour is partially due to the default string size.

If we take your example from above but change it like so:

FUNCTION main
    VAR
        x_str : STRING := 'hi';
    END_VAR

    foo(x_str); // This will not return any errors
END_FUNCTION

FUNCTION foo
    VAR_INPUT
        y_str : STRING[5];
    END_VAR
END_FUNCTION

Now the string won't truncate (not quite right - it will truncate, but only 0s), but considering its size is 80, it would still emit a warning. We could check only user-type strings, but that's inconsistent aswell. This might be a candidate for runtime-sanitization. #1062

Edit: Another option would be to use fat-pointers for strings, similar to how we implemented VLA's.

volsa commented 3 weeks ago

I guess for user-defined Strings the validation would make sense but you're right about the inconsistency. Whats the best approach then, delegate the task to the runtime-sanitizer altogether?

Another option would be to use fat-pointers for strings, similar to how we implemented VLA's.

Sounds like a big-ish refactor, not sure we want to do that if there are no other benefits other than the validation?

mhasel commented 3 weeks ago

Sounds like a big-ish refactor, not sure we want to do that if there are no other benefits other than the validation?

Yes, it would likely involve a large refactor, but we might be able to get rid of user-type strings altogether. We could also re-evaluate our current implementation of VLAs and change them to be by-value aswell - then strings could probably reuse a lot of that logic. Another approach I didn't mention previously would be to treat strings as generics and inline the appropriate signatures automatically.

Edit: This would likely only be feasible for functions. I can't think of of a way to do this for stateful POUs