florianschanda / miss_hit

MATLAB Independent, Small & Safe, High Integrity Tools - code formatter and more
GNU General Public License v3.0

Allow tokens which start with an underscore "_" #224

Closed quazgar closed 3 years ago

quazgar commented 3 years ago

What kind of feature is this?

MISS_HIT component affected

Describe the solution you'd like

Currently the lexer fails when, for example, a function name starts with _. I find this overly restrictive, since Python, for example, uses leading underscores to denote internal or private entities. I was planning to give my wrapped mex functions names such as _foo.mex, but at the moment this makes the lexer fail with a fatal error.

Probably this can be fixed around line 517 in m_lexer.py?

florianschanda commented 3 years ago

https://www.mathworks.com/help/matlab/ref/matlab.lang.makevalidname.html

I quote:

A valid MATLAB identifier is a character vector of alphanumerics (A–Z, a–z, 0–9) and underscores, such that the first character is a letter and the length of the character vector is less than or equal to namelengthmax.

Or https://www.mathworks.com/help/matlab/ref/isvarname.html

A valid variable name begins with a letter and contains not more than namelengthmax characters. Valid variable names can include letters, digits, and underscores. MATLAB keywords are not valid variable names. To determine if the input is a MATLAB keyword, use the iskeyword function.

Also, in MATLAB this doesn't work:

>> _x = 5
 _x = 5
 ↑
Error: The input character is not valid in MATLAB statements or expressions.

Does this perhaps work in Octave? In MATLAB this is most definitely illegal, so MISS_HIT is doing the right thing.

florianschanda commented 3 years ago

Indeed, this is an Octave ticket: https://octave.org/doc/v4.2.0/Variables.html#Variables

The name of a variable must be a sequence of letters, digits and underscores, but it may not begin with a digit.

Interesting difference :)
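
To make the difference concrete, here is a minimal Python sketch of the two naming rules quoted above. It is not the code in m_lexer.py. The pattern names and the is_valid_identifier helper are hypothetical and exist only for this illustration, and the sketch deliberately ignores keywords and the namelengthmax limit.

```python
import re

# Hypothetical patterns for illustration only; not taken from m_lexer.py.
# MATLAB rule (as quoted): the first character must be a letter.
MATLAB_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Octave rule (as quoted): letters, digits and underscores, but no leading digit.
OCTAVE_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name, octave=False):
    """Check a name against the quoted rule for the chosen language.

    Keywords and the namelengthmax limit are not checked here.
    """
    pattern = OCTAVE_IDENTIFIER if octave else MATLAB_IDENTIFIER
    return pattern.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("foo", "_foo", "foo_1", "1foo"):
        print(candidate,
              is_valid_identifier(candidate),
              is_valid_identifier(candidate, octave=True))
```

Under these assumptions only the leading-underscore case differs: _foo is rejected by the MATLAB rule and accepted by the Octave rule.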

florianschanda commented 3 years ago

@quazgar this will be fixed in the next release. But please note that you need to enable octave mode in your miss_hit.cfg file:

octave: true

In MATLAB mode this construct will still be rejected with a lex error (as this is the correct behaviour).

quazgar commented 3 years ago

Thank you for the quick implementation, and also for pointing out that this is an Octave feature! I will need to change my code anyway, since I am planning to make it compatible with both implementations ;-)

florianschanda commented 3 years ago

You may like to hear, then, that I am planning a tool, mh_compat, that assesses code for MATLAB/Octave compatibility. I am not quite ready to start implementing it yet, but it's definitely planned.

quazgar commented 3 years ago

Great to hear, I will probably find it sooner or later.