Closed averbraeck closed 11 months ago
Variable names in the scenario should have a restricted string type like the above, but start and end with { and }. Otherwise, a variable name could be FUNC(), which would cause a major problem in the evaluator.
We could possibly allow Id's to start with a number, or just be a number, so you can make Id's 1, 2, 3, etc. We have to see how strict we want to make the restriction. There is nothing wrong with a number as an Id, whereas an Id like {{}} would cause real issues.
There is no technical reason to be very strict with Id's (although we may be more strict than technically required). Let's work out the example of dynamic link nodes. First we have two nodes with confusing but technically functioning Id's:
Node[1].Id=[km/h]
Node[2].Id=PI()
We define a link that will dynamically start from either node:
Link.StartNode={my_var}
For this an input parameter needs to be specified in two scenarios:
Scenario[1].InputParameterString.Id={my_var}
Scenario[1].InputParameterString.Value=[km/h]
Scenario[2].InputParameterString.Id={my_var}
Scenario[2].InputParameterString.Value=PI()
The reason this works is that not all fields will be parsed as a type that will then evaluate an expression.
Field | XML type | Parser type | Remark |
---|---|---|---|
Node.Id | ots:IdType | String | String, so no expression is evaluated in the parser. |
Link.StartNode | ots:string | StringType | With expression. xsd:keyref checks whether this is a Node.Id or InputParameterString.Id. |
Scenario.InputParameterString.Id | ots:InputParameterIdType | String | InputParameterIdType forces { } but will be parsed as String; no expression is evaluated. |
Scenario.InputParameterString.Value | xsd:string | String | This value results from an expression. When used to refer to an Id it should obey IdType, but input parameters may be used for other purposes. |
We can see that [km/h] and PI() will never be evaluated as an expression. They only result from an expression through being the value in an input variable of the expression. Note that all the above will still work if the node Id's were {[km/h]} and {PI()} because these values are never evaluated as an expression.
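The distinction in the table can be illustrated with a minimal Python sketch. It is not the OTS parser: the function `resolve` and the dictionaries are hypothetical, and an "expression" is reduced here to a plain parameter lookup. It only shows that a value such as PI() passes through untouched as long as it is never itself parsed as an expression.

```python
def resolve(value, parameters):
    """Resolve a field that is parsed with expression support (e.g. Link.StartNode).

    Hypothetical helper: a real evaluator would evaluate the text between the
    braces; this sketch only performs a lookup by parameter Id.
    """
    if value.startswith("{") and value.endswith("}"):
        # Recognized as an expression: look up the input parameter.
        return parameters[value]
    # Plain string: used literally, never evaluated.
    return value


# Node.Id is parsed as a plain String, so these are stored as-is.
node_ids = {"[km/h]", "PI()"}

# Scenario[n].InputParameterString: Id -> Value
scenario_1 = {"{my_var}": "[km/h]"}
scenario_2 = {"{my_var}": "PI()"}

start_1 = resolve("{my_var}", scenario_1)
start_2 = resolve("{my_var}", scenario_2)
```

In both scenarios the result is one of the node Id's, and the result itself is not evaluated again.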
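Still, the only real technical restriction is that Id's should not be recognizable as an expression, i.e. not start with { and not end with }. This follows from a normal (non-dynamic) Id reference: a referring field such as Link.StartNode is parsed with expression support, so an Id wrapped in curly braces would be evaluated instead of matched. A simple example shows this does not work: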
Node.Id={PI()}
Link.StartNode={PI()}   <-- oops, recognized as an expression while directly referring to a node
Therefore, no curly braces:
Node.Id=[km/h]
Link.StartNode=[km/h]
Still, to avoid confusion, I would be in favor of not allowing any sort of brackets. Numbers for Id's seem very logical and may be very helpful when parsing from an external network format. In understanding or communicating between formats/programs, having the same node Id's is helpful. The pattern would then become:
<xsd:pattern value="[A-Za-z0-9_\-\.%!@#\^*&amp;:;\\/>&lt;?]+"></xsd:pattern>
Note that < and & cannot appear literally in the pattern; they have to be written as &lt; and &amp;, as the raw characters will not work in XML.
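To get a feel for what this character class accepts, it can be tried out with Python's `re` module. This is only an approximation: XSD patterns are implicitly anchored, which `fullmatch` mimics, and the function name `is_valid_id` is made up for this sketch.

```python
import re

# Same character class as the proposed IdType pattern, with the XML
# entities already decoded to the literal characters & and <.
ID_PATTERN = re.compile(r"[A-Za-z0-9_\-\.%!@#\^*&:;\\/><?]+")


def is_valid_id(candidate):
    """Return True if the whole string consists of allowed Id characters."""
    return ID_PATTERN.fullmatch(candidate) is not None


# Ids as they might come from an external network format.
plain_numbers_ok = all(is_valid_id(s) for s in ["1", "2", "3"])
# Any sort of bracket is rejected, as is the empty string.
brackets_rejected = not any(
    is_valid_id(s) for s in ["[km/h]", "PI()", "{PI()}", "{{}}", ""]
)
```

Note that km/h without the square brackets would still be a valid Id under this pattern.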
The characters to escape are, by the way, according to https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#cite_ref-semicolon_2-0, and https://www.ibm.com/docs/en/was-liberty/base?topic=SSEQTP_liberty/com.ibm.websphere.wlp.doc/ae/rwlp_xml_escape.htm:
Original character | Escaped character |
---|---|
" | &quot; |
' | &apos; |
< | &lt; |
> | &gt; |
& | &amp; |
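When Id's or patterns are written to XML programmatically, the escaping in the table does not have to be done by hand. As a sketch in Python, the standard library function `xml.sax.saxutils.escape` handles &, < and > by default, and the two quote characters can be passed as extra entities. The wrapper name `escape_xml` is made up for this example.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > by default; quotes are added explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_xml(text):
    """Escape the five characters from the table above."""
    return escape(text, QUOTE_ENTITIES)


escaped = escape_xml('a<b & "c"')
```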
I agree that the Id fields do not pose a problem, and we can allow almost anything, where starting and ending with curly braces is the only clash in the parsing.
What about the name of a scenario parameter that will be used as part of an expression that will be evaluated? Suppose I call such a scenario parameter FUN() (possibly defined with its curly braces as {FUN()}, depending on whether you want to separate key/keyref parameters and numerical parameters), and I use an expression for a numerical entry: {2 * FUN()}. The evaluator will try to call function FUN() and not use the value of the parameter with that name. It becomes even more clear when you use PI() as the scenario parameter name with a value of -1 in a scenario. How should the evaluator know whether to use your PI() value, or use the built-in PI() function?
Good point. We have to look at how the expression evaluator recognizes variables, or more to the point, how it determines where a variable name starts and ends. Based on org.djutils.eval.Eval.evalLhs(), the first character needs to be such that Character.isLetter(char) holds. Based on org.djutils.eval.Eval.handleFunctionOrVariableOrNamedConstant(), the variable name continues while Character.isLetterOrDigit(c) || '.' == c || '_' == c. This would bring the pattern down to:
<xsd:simpleType name="InputParameterIdType">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\{[A-Za-z][A-Za-z0-9_\.]*\}"></xsd:pattern>
</xsd:restriction>
</xsd:simpleType>
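The scanning rule behind this pattern can be mimicked in a few lines of Python. This is a sketch of the rule as described above, not a port of org.djutils.eval.Eval: the function name is made up, and `str.isalpha()` and `str.isalnum()` only approximate Java's `Character.isLetter` and `Character.isLetterOrDigit` for non-ASCII input.

```python
def scan_variable_name(expression, start=0):
    """Return the variable name that begins at `start`, or "" if none does.

    The first character must be a letter; the name then continues while the
    character is a letter, a digit, '.' or '_'.
    """
    if start >= len(expression) or not expression[start].isalpha():
        return ""
    end = start + 1
    while end < len(expression) and (
        expression[end].isalnum() or expression[end] in "._"
    ):
        end += 1
    return expression[start:end]


name = scan_variable_name("my_var * 2")
# For "PI()" the scan stops at "(", which is why the evaluator would treat
# it as a function call rather than as a variable.
before_parenthesis = scan_variable_name("PI()")
```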
Perhaps the expression evaluator could allow more, such as @ and #, both as first character and anywhere within the name. It could also allow _ as a first character. The first character can never be ., as a number is then recognized. If the expression evaluator allows more, we could also allow more. Note however that the following characters are also not allowed given their usage in org.djutils.eval.Eval.evalRhs(): ), ^, *, /, +, -, &, |, <, >, =, !, ?, : and ,. Lastly, ( is not allowed as it will be recognized as a function rather than a variable. If the expression evaluator will allow more, the full pattern would become:
<xsd:simpleType name="InputParameterIdType">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\{[A-Za-z_@#%][A-Za-z0-9_@#%\.]*\}"></xsd:pattern>
</xsd:restriction>
</xsd:simpleType>
Agree. And indeed, when a variable name would be /k or val*, the evaluator could not tell whether the '/' or '*' is part of the variable name or of the formula. A variable with id a+b, where variables with id a and b also exist, would be ambiguous, and should therefore not be allowed. For me, the above pattern works. A couple of challenging unit tests should check that no ambiguities remain. @OTSim can for sure think of a few!
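A few such cases can be sketched against the extended pattern. The check below uses Python's `re` as a stand-in for XSD validation, with `fullmatch` mimicking the implicit anchoring of XSD patterns. The lists are illustrative examples only, not the project's actual unit tests.

```python
import re

# Extended InputParameterIdType pattern from the discussion above.
PARAMETER_ID = re.compile(r"\{[A-Za-z_@#%][A-Za-z0-9_@#%\.]*\}")


def is_valid_parameter_id(candidate):
    """Return True if the whole string matches the extended pattern."""
    return PARAMETER_ID.fullmatch(candidate) is not None


accepted = ["{a}", "{_x}", "{@speed}", "{#n1}", "{lane.width}", "{x%}"]
# Operators, parentheses, a leading '.' or digit, missing or nested braces.
rejected = ["{/k}", "{val*}", "{a+b}", "{PI()}", "{.5}", "{1a}", "{}", "a", "{{a}}"]

all_accepted = all(is_valid_parameter_id(s) for s in accepted)
none_rejected_passes = not any(is_valid_parameter_id(s) for s in rejected)
```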
- Id attributes are now of ots:IdType, which prohibits the use of { and }.
- Id attributes of input parameters are of type ots:InputParameterIdType with pattern \{[A-Za-z][A-Za-z0-9_\.]*\}.
- The KeyValidator class of the editor no longer accepts all expressions.
- The xsd:selector in all xsd:key elements is amended with |.//ots:DefaultInputParameters/ots:String, which means they may also refer to a String input parameter. This was done by replacing expression (<xsd:key name="[^"]+">\s*<xsd:selector xpath="[^"]+)("\s*\/>\s*<xsd:field xpath="@Id"\s*\/>) with $1|.//ots:DefaultInputParameters/ots:String$2.

Note that dynamically referring to an input parameter will only work for keys that are on the field Id, as that same field is also referred to in an input parameter.
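For reference, the same search-and-replace can be reproduced with Python's `re.sub`, where `$1` and `$2` become `\g<1>` and `\g<2>`. The xsd:key fragment below (name `nodeKey`, selector `.//ots:Node`) is a made-up example for illustration, not copied from the OTS XSD's.

```python
import re

# Search expression as quoted above; a raw single-quoted string keeps the
# double quotes and backslashes intact.
KEY_SELECTOR = re.compile(
    r'(<xsd:key name="[^"]+">\s*<xsd:selector xpath="[^"]+)'
    r'("\s*\/>\s*<xsd:field xpath="@Id"\s*\/>)'
)
REPLACEMENT = r"\g<1>|.//ots:DefaultInputParameters/ots:String\g<2>"

# Hypothetical key definition, used only to show the effect.
sample = """<xsd:key name="nodeKey">
  <xsd:selector xpath=".//ots:Node" />
  <xsd:field xpath="@Id" />
</xsd:key>"""

amended = KEY_SELECTOR.sub(REPLACEMENT, sample)
```

Keys whose field is not @Id do not match the expression and are left unchanged, which fits the note above.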
Right now, a name for an Id field of, e.g., a Node can be {{{{ or {PI()}, which will cause major problems later when a reference to this Id is made in, e.g., the definition of a Link. The current definition for any Id in the OTS XSD's is:
The same holds for a variable name that is used as an input name for an expression variable in a scenario.
We have to see what characters to include. I would like any variable to start with a letter, and avoid:
- PI() would be very confusing;
- [km/h] would be very confusing;
- {}}} would be very confusing and would lead immediately to parsing errors if used as a keyref;

Definition of an IdType could be something like:
forcing the Id to start with a letter and have at least one character, and allowing for letters, digits and the special characters: _-.%!@#^. We could also allow *, &, :, ;, /, >, < and ?. We might not allow single and double quotes in a variable name.