foo123 / ace-grammar

Transform a JSON grammar into a syntax-highlight parser for ACE Editor
https://foo123.github.io/examples/ace-grammar/

How would I use this to verify variables match up with data types? #6

Closed ngreco closed 8 years ago

ngreco commented 8 years ago

My use case is as follows. Given a JSON schema of (static, predefined) variable names and the types of data they can contain (floats, strings, ints, bools), I need Ace to show the user that assigning an int to a string-type variable is wrong.

I am currently parsing the JSON and adding identifiers to the lexer as a regex that lists the allowed variable names, like so:

"Lex" : 
    "identifier_float" : "R::/(x1|xn|y1|yn|z1|zn|grid_motion_rate|pert_rms|pert_peak|pert_band|TKE)/"

And in the syntax part I state that the correct syntax for float-type variables is:

"Syntax":
    "literal_property_float"   : "(identifier_float) '=' (float | number)"

(where float and number are defined in Lex)

Ace seems to understand that I require '=' and then a float/number, but it doesn't recognize the identifier_float name requirement.

How do I use ace-grammar to have Ace understand that a given variable name has a given data-type requirement?

If this isn't the correct place to ask a question, please point me in the right direction.

foo123 commented 8 years ago

Hi, this is the correct place to ask the question.

However, since I don't quite understand the problem, could you provide the JSON grammar (with all the extra rules you add) plus some example code? You can use the live example page to demonstrate it. Please also attach both the grammar and the example code to this issue, so I can test it myself and understand what is meant.

foo123 commented 8 years ago

On the other hand, if I understand correctly, what you need is a hard error (as is done in the XML example on the README page) when an assignment is made that does not match the type. In that case something like the following should do (adjust as needed for your other cases):

"Style": {
    "error"                  : "invalid"
},

"Lex": {
    "other"                  : "R::/[\\S]+/",
    "type_not_match:error"   : "Variable type does not match"
},

"Syntax": {
    "literal_property_float" : "identifier_float '=' (float | number | other.error type_not_match)"
}

This now generates a hard error (the type_not_match error) and also highlights that part of the code as invalid (the error style). The "other" token can be any token you like, as long as it matches code that is not matched by the preceding valid cases (i.e. float, number).
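
For grammars generated from a schema, the same rule can be produced per type. The following is only a sketch under assumptions: the helper names (typedAssignmentRule, buildTypeCheckFragment) are hypothetical, the regex prefix is passed in because it has to match whatever RegExpID the grammar declares, and only the float/number tokens come from this thread.

```javascript
// Build the Syntax rule shown above for one type.
// `type` and `valueTokens` are supplied by the caller; nothing here calls ace-grammar itself.
function typedAssignmentRule(type, valueTokens) {
  const accepted = valueTokens.join(" | ");
  // Anything that is not an accepted value falls through to "other",
  // is styled as "error", and triggers the type_not_match action.
  return `identifier_${type} '=' (${accepted} | other.error type_not_match)`;
}

// Assemble the Style/Lex/Syntax fragment for several types at once.
// `regexpId` is the regex prefix declared in the grammar (an input, not guessed here).
function buildTypeCheckFragment(regexpId, valueTokensByType) {
  const syntax = {};
  for (const [type, tokens] of Object.entries(valueTokensByType)) {
    syntax[`literal_property_${type}`] = typedAssignmentRule(type, tokens);
  }
  return {
    Style: { error: "invalid" },
    Lex: {
      other: regexpId + "/[\\S]+/",
      "type_not_match:error": "Variable type does not match"
    },
    Syntax: syntax
  };
}

const fragment = buildTypeCheckFragment("RE::", { float: ["float", "number"] });
console.log(fragment.Syntax.literal_property_float);
```

The fragment would still need to be merged into the full grammar, which must define the identifier_* and value tokens it refers to.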

ngreco commented 8 years ago

My grammar is a restrictive subset of Lua tables, I would say. I generate my grammar from JSON schema files that are in a known format (attached: schemaExample.txt).

An example json schema:

{
    "grid": {
        "px": {
            "type": "integer",
            "defaultValue": 4,
            "description": "number of x processors",
            "options": [],
            "score": 1000
        },
        "py": {
            "type": "integer",
            "defaultValue": 4,
            "description": "number of y processors",
            "options": [],
            "score": 1000
        },
        "pz": {
            "type": "integer",
            "defaultValue": 4,
            "description": "number of z processors",
            "options": [],
            "score": 1000
        },
        "storeSphere": {
            "type": "logical",
            "defaultValue": false,
            "description": "Compute and store spherical coordinates?",
            "options": [],
            "score": 1000
        },
        "meshfile": {
            "type": "string",
            "stringLength": 128,
            "defaultValue": "Mesh.grid",
            "description": "Name of the mesh to be used",
            "options": [],
            "score": 1000
        },
        "meshkind": {
            "type": "string",
            "stringLength": 128,
            "defaultValue": "read",
            "description": "Add options here: cyl, curv, read, etc.",
            "options": ["cyl","curv","read"],
            "score": 1000
        },
        "moving_grid": {
            "type": "logical",
            "defaultValue": false,
            "description": "use moving mesh?",
            "options": [],
            "score": 1000
        }
    }
}

Given this, I know that, for example, moving_grid is a logical type so true and false are ok, but anything else assigned to it would be an error.
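
A first step toward type-specific rules is grouping the variable names by their declared type. This is a minimal sketch that assumes the schema always has the two-level shape shown above (section, then variable, then a "type" field); groupNamesByType is a hypothetical helper, not part of ace-grammar.

```javascript
// Collect variable names per declared type from a schema shaped like
// { section: { variableName: { type: "...", ... } } }.
function groupNamesByType(schema) {
  const byType = {};
  for (const section of Object.values(schema)) {
    for (const [name, spec] of Object.entries(section)) {
      if (!byType[spec.type]) byType[spec.type] = [];
      byType[spec.type].push(name);
    }
  }
  return byType;
}

// Trimmed-down version of the example schema (only the "type" fields are needed).
const schema = {
  grid: {
    px: { type: "integer" },
    storeSphere: { type: "logical" },
    meshfile: { type: "string" },
    moving_grid: { type: "logical" }
  }
};

const namesByType = groupNamesByType(schema);
// namesByType.logical -> ["storeSphere", "moving_grid"]
console.log(namesByType);
```

Each resulting list can then back one identifier token per type, so that a rule for logical variables only ever applies to names such as moving_grid.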

foo123 commented 8 years ago

Does the second answer I gave address this request or not? If I understand correctly, it does: you just need a way for anything other than the types relevant to a declaration to be marked as invalid.

ngreco commented 8 years ago

It helps with the error, but the variables that I generate (listed in identifier_float) aren't being recognized. As in, Ace currently is just looking for = (float | number), and if I add one of the variables in identifier_float (or anything else, for that matter) before the = I receive an error.

I'm not sure why my program is giving me errors when I try to put something before the = sign.

foo123 commented 8 years ago

Can you provide your complete ace-grammar and some sample code so I can test it? Attach them in a comment here.

ngreco commented 8 years ago

ace_grammar.js.txt demo.js.txt index.html.txt schema.js.txt

This is what I'm working with. Sorry for the delay, and thanks for your help.

foo123 commented 8 years ago

Yes, I see. There are a few errors I can spot just by inspecting the code in schema.js where you build the grammar.

  1. The RegExpID is RE:: (schema.js line 168), while in the code you use R:: (schema.js lines 129, 135, 140, 145).
  2. You use identifier_float both as a lexical token and as an action token: lex["identifier_float:error"] = "I'm confused."; (schema.js line 131). Each token ID should be unique. Here the action token completely overrides the original identifier_float lexical token of the same name, so just use another token ID.
  3. It is better to add syntax-like rules in the Syntax part instead of the Lex part (e.g. schema.js line 128). In fact, you don't need a syntax-like token here: just add all the keywords/variable names as an array, and the grammar will take care of transforming them into a lexical token. For example: lex.identifier_float = ["px","py","x1","y1"];
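
The three points above could be applied and checked roughly as follows. This is a sketch under assumptions rather than the actual schema.js: buildLex, findIdCollisions and findWrongRegexPrefixes are hypothetical helpers, and the "_mismatch" action token IDs and their messages are made up for illustration.

```javascript
// Build the Lex part from names grouped by type.
function buildLex(namesByType) {
  const lex = {};
  for (const [type, names] of Object.entries(namesByType)) {
    // Point 3: a plain array of names instead of a hand-written regex.
    lex[`identifier_${type}`] = names.slice();
    // Point 2: the action token gets its own ID, distinct from the lexical token.
    lex[`${type}_mismatch:error`] = `Expected a ${type} value`;
  }
  return lex;
}

// Point 2 check: report IDs defined both as a plain token and as an action token ("id:action").
function findIdCollisions(lex) {
  return Object.keys(lex)
    .filter((key) => key.includes(":"))
    .map((key) => key.split(":")[0])
    .filter((id) => Object.prototype.hasOwnProperty.call(lex, id));
}

// Point 1 check: report string tokens whose "XX::" prefix differs from the declared RegExpID.
function findWrongRegexPrefixes(lex, regexpId) {
  return Object.entries(lex)
    .filter(([, def]) => typeof def === "string" && /^[A-Za-z]+::/.test(def) && !def.startsWith(regexpId))
    .map(([id]) => id);
}

const lex = buildLex({ float: ["x1", "y1"], integer: ["px", "py"] });
console.log(findIdCollisions(lex));               // no collisions expected here
console.log(findWrongRegexPrefixes(lex, "RE::")); // no regex tokens in this lex at all
```

Running the two checks on the generated Lex object before handing the grammar over would catch a prefix mismatch or a reused token ID early.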

This is what I see, and this is why it doesn't work as expected. Try to fix these and let me know if you have further issues (read the grammar reference manual if you need more help with the grammar definition).

If you fix the problem, close this issue as well. Cheers

ngreco commented 8 years ago

Ah, my oversights. Thank you so much! I got rid of those erroneous regex lines and replaced them with the arrays for the lexer (good idea!). For some reason I was under the impression that a lexical token could be used as an action token as well, thanks for clearing that up. I'll take a look at the reference manual for more precise error messages and the like. I think this question has been answered now. Thanks again!