On Sat, Dec 18, 2021 at 06:56:16AM -0800, Norman Gray wrote:
With the addition of the ‘dimensionless’ marker "1", the string 1m has become ambiguous, since the 1 is lexed as the dimensionless marker rather than a voufloat (which, recall, is either 0.[0-9]+([eE][+-]?[0-9]+)? or [1-9][0-9]*(\.[0-9]+)?([eE][+-]?[0-9]+)?).
Proposed change: either:
- change the definition of the voufloat to be [0-9]+\.[0-9]+([eE][+-]?[0-9]+)? (ie, requiring a decimal point), or to this plus [02-9][0-9]*([eE][+-]?[0-9]+)? (ie, floats may omit a fractional part only if the integer part is not 1); or
I don't like this. Writing 1e-10m is rather natural; I'm personally pushing out quite a few units like this already, and if we break things in a minor update (which is, of course, bending rules already), we need to have a very strong reason. Which I think we don't have here.
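To make the objection concrete, here is a small Python sketch comparing the current voufloat pattern with the proposed one, both transcribed from the text above. It assumes the dot in the first alternative of the current pattern is meant literally; the names `CURRENT_VOUFLOAT`, `PROPOSED_VOUFLOAT` and `accepts` are hypothetical, introduced only for this illustration.

```python
import re

# Current pattern, as quoted in the issue (dot in "0." assumed to be literal).
CURRENT_VOUFLOAT = re.compile(
    r"0\.[0-9]+(?:[eE][+-]?[0-9]+)?"
    r"|[1-9][0-9]*(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)

# Proposed pattern: decimal point required, or an integer part that does
# not start with 1.
PROPOSED_VOUFLOAT = re.compile(
    r"[0-9]+\.[0-9]+(?:[eE][+-]?[0-9]+)?"
    r"|[02-9][0-9]*(?:[eE][+-]?[0-9]+)?"
)


def accepts(pattern, text):
    """Return True if the whole of text matches the pattern."""
    return pattern.fullmatch(text) is not None


for candidate in ["1e-10", "1.0e-10", "2e3", "1"]:
    print(candidate,
          accepts(CURRENT_VOUFLOAT, candidate),
          accepts(PROPOSED_VOUFLOAT, candidate))
```

Under these patterns, 1e-10 is accepted by the current definition but not by the proposed one, while 1.0e-10 is accepted by both.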
- regard this as a parsing bug and fix the lexer/grammar to handle this case, and insert rationale into the document explaining why the current pattern for voufloat is what it is.
I've not thought this alternative through, but my gut feeling isn't positive either.
But I'm having trouble understanding the problem in the first place. If you say:
empty_unit ::= 1
and say
input ::= empty_unit | complete_expression | scalefactor complete_expression
I'd say all is fine: 1 parses into empty_unit (and cannot be parsed in any other way), 1e-2 and its ilk won't parse at all, and 1m parses into scalefactor 1 followed by complete_expression m. What am I missing?
Re 1e-10: true. I hadn't meant to exclude that, but I think it doesn't matter, because...
Re the grammar: the problem here is/was not so much in the grammar, as in the lexer, in that 1 is lexed as the dimensionless marker rather than a float. But that can in fact be easily fixed in the same way that we handle 10**3m and 10m:
scalefactor: LIT10 power numeric_power
| LIT10 // ie, "10"
| LIT1 // ie, "1" <-- this is new
| VOUFLOAT
;
I've just tried that, and it works fine, so I propose that as the fix.
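A rough Python sketch of why the extra alternative is enough; this is not the actual lexer or parser. It assumes lex-style matching (longest match wins, ties go to the earlier rule), reduces complete_expression to a single alphabetic unit symbol, and leaves out the LIT10 power numeric_power alternative. The names `RULES`, `tokenize` and `parse_input` are hypothetical.

```python
import re

# Rules in priority order: longest match wins, ties go to the earlier rule,
# so a bare "1" lexes as LIT1 and a bare "10" as LIT10, not as VOUFLOAT.
RULES = [
    ("LIT10", re.compile(r"10")),
    ("LIT1", re.compile(r"1")),
    ("VOUFLOAT", re.compile(
        r"0\.[0-9]+(?:[eE][+-]?[0-9]+)?"
        r"|[1-9][0-9]*(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")),
    ("UNIT", re.compile(r"[A-Za-z]+")),  # simplified stand-in for a unit
]


def tokenize(text):
    """Split text into (kind, lexeme) pairs using the rules above."""
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for kind, pattern in RULES:
            match = pattern.match(text, pos)
            # Strictly longer matches replace the current best, so an
            # equal-length match from a later rule never wins.
            if match and (best is None or match.end() > best[1]):
                best = (kind, match.end())
        if best is None:
            raise ValueError(f"cannot lex {text[pos:]!r}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


def parse_input(text, scalefactor_kinds):
    """input ::= empty_unit | complete_expression
               | scalefactor complete_expression   (simplified)"""
    kinds = [kind for kind, _ in tokenize(text)]
    if kinds == ["LIT1"]:
        return "empty_unit"
    if kinds == ["UNIT"]:
        return "complete_expression"
    if (len(kinds) == 2 and kinds[0] in scalefactor_kinds
            and kinds[1] == "UNIT"):
        return "scalefactor complete_expression"
    raise ValueError(f"cannot parse {text!r}")


OLD_SCALEFACTOR = {"LIT10", "VOUFLOAT"}
NEW_SCALEFACTOR = OLD_SCALEFACTOR | {"LIT1"}  # the proposed extra alternative

print(tokenize("1m"))
print(parse_input("1m", NEW_SCALEFACTOR))
print(parse_input("1e-10m", OLD_SCALEFACTOR))
```

In this sketch, 1m lexes as LIT1 followed by a unit, so it is rejected when scalefactor allows only LIT10 and VOUFLOAT and accepted once LIT1 is added. 1e-10m is unaffected, because the longer VOUFLOAT match wins over LIT1.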
Re the rationale for the VOUFLOAT regexp: looking at it, and thinking way back, it's designed to forbid the case, meaningless in context, of "0", and the nearly malformed case of "1."; so nothing profound, and a line in the text saying that would be straightforward.
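The two excluded cases can be checked directly. A minimal Python sketch, assuming the dot in the first alternative of the pattern is a literal dot; `is_voufloat` is a hypothetical helper name.

```python
import re

# VOUFLOAT as quoted in the thread (first dot assumed to be literal).
VOUFLOAT = re.compile(
    r"0\.[0-9]+(?:[eE][+-]?[0-9]+)?"
    r"|[1-9][0-9]*(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_voufloat(text):
    """Return True if the whole string is a VOUFLOAT."""
    return VOUFLOAT.fullmatch(text) is not None


for candidate in ["0", "1.", "0.5", "1", "1.5e3"]:
    print(repr(candidate), is_voufloat(candidate))
```

Here "0" fails because a leading zero must be followed by a fractional part, and "1." fails because a decimal point must be followed by at least one digit.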
Closed in commit 7b967d7