klauer / blark

Beckhoff TwinCAT ST (IEC 61131-3) code parsing in Python using Lark (Earley)
https://klauer.github.io/blark/
GNU General Public License v2.0
42 stars 5 forks source link

`POINTER TO STRING(c_Size - 1);` Not Parsing Correctly #70

Closed engineerjoe440 closed 1 year ago

engineerjoe440 commented 1 year ago

Came across this "in the wild" today:

pt_str : POINTER TO STRING(g_c_lenTimestampStr + 3);

Seems that blark doesn't really care for the parenthesis:

lark.exceptions.UnexpectedCharacters: No terminal matches '(' in the current parser context, at line 19 col 31

At a high-level, I think this should be fairly reasonable to iron out. I'll see if I can jump in and author a fix.

Working branch is here: https://github.com/engineerjoe440/blark/tree/bugfix/pointer-to-string-with-variable-not-parsing

engineerjoe440 commented 1 year ago

Aha! Seems to be less about the POINTER TO, and everything to do with the use of a constant (and some basic arithmetic) inside the parenthesis. image

klauer commented 1 year ago

Yeah, this might be a bit of a painful one to fix - we'll see... This comes from a lazy implementation here that didn't take into account the potential for arithmetic, as you found:

https://github.com/klauer/blark/blob/04c98d554be58e26429b20e9de877c60f5cdca0e/blark/iec.lark#L523-L524

engineerjoe440 commented 1 year ago

eh! Painful or not, it'll be a good improvement, I think.

(and I don't think you should call it a "lazy" implementation... it was the right thing to do given initially, and now it can grow! 😄 Nothing wrong with that! )

engineerjoe440 commented 1 year ago

As I'm working through this, I've come up with a general concept, but would love any additional input or insight you might have.

I think that the STRING_SPEC_LENGTH needs to be changed from a Terminal rule to a Grammar rule; something like the following: image

Now, I'm wondering if there's a way to make a single transformer that would alter itself to use parenthesis or brackets, depending on the "string_length_parenthesized" or "string_length_bracketized" formation. It seems like this is something I've seen done in blark, elsewhere, but now I'm trying to find if there's an example that does this.

Any thoughts, @klauer? Or might I just be overthinking this?

klauer commented 1 year ago

You're right in that the terminal needs to become a rule. There are some other implications there because STRING_SPEC_LENGTH is used elsewhere, too. STRING_TYPE needs to become a rule, for one.

I understand that your screenshot above is just a first pass covering only the case of (e.g.) const + 1 - but I think there needs to be a subset of expression that fits in for the spec length value. Presumably const * 2 + 1 and such would be grammatically acceptable (but would things like modulo/division/logical operators... ? Certainly not the assignment operator, at least we can say that for certain). This makes things even more complicated!

As to your specific question, you would definitely have to separate them into separate rules in order to capture either parentheses or brackets. A single rule wouldn't work because to support both as in /([/ length /)]/ it would mean unbalanced parentheses could be valid, e.g., STRING(10] or STRING[10).

(A bit of maybe unnecessary additional background - my original thought process was that blark could just be opinionated and say that STRING[10] and STRING(10) are equivalent so it'll collapse them into a single representation and then just choose to output one or the other. My opinion on that has since changed, and I think blark should take note of and retain that which the user originally specified.)

Anyway, this is definitely a tough one! Maybe the toughest so far of the bugs you've found.

klauer commented 1 year ago

I'm thinking of jumping back to this issue soon. Were you still poking around with it, @engineerjoe440? Don't want to step on your toes here

engineerjoe440 commented 1 year ago

Not stepping on toes at all! 👣 I've had to take a "recess" for a bit (just keep getting caught up with other things 😆). I don't really think I've made much progress on things. Here's the working branch I was experimenting on: https://github.com/engineerjoe440/blark/tree/bugfix/string-with-arithmetic-size

Honestly, it may not be all that useful, but may be worth a quick glance.

I'll be out of the office most of next week, but may have some availability. Please feel free to reach out, but I may be slow to respond.