metamath / metamath-exe

Metamath program - source code for the Metamath executable
GNU General Public License v2.0
77 stars 25 forks source link

Add test for nested comments. #84

Closed jkingdon closed 2 years ago

jkingdon commented 2 years ago

As discussed on the mailing list.

benjub commented 2 years ago

So I guess this settles the alternative given in https://github.com/metamath/metamath-book/issues/233. I'll modify mmverify.py to raise on $( $( $) (only one line to add). As for the EBNF grammar with the proposal I described there, I don't remember if we edit metamath.tex or leave it that way and edit the errata.md (or edit both).

jkingdon commented 2 years ago

So I guess this settles the alternative given in metamath/metamath-book#233. I'll modify mmverify.py to raise on $( $( $) (only one line to add). As for the EBNF grammar with the proposal I described there, I don't remember if we edit metamath.tex or leave it that way and edit the errata.md (or edit both).

To the extent that I was thinking about this at any level deeper than asking what metamath has always (for some value of always) done, I was just thinking that $( inside a comment is usually a mistake. But if people want to change the implementation/test to match the spec in the book, well we could discuss that although I guess the current implementation seems like the practical choice to me.

tirix commented 2 years ago

BTW @jkingdon, which mailing list post were you referring to initially? I see in this post from Mario :

Regular comments, parsed with the standard metamath lexer (i.e. whitespace separated everything) have the form '$(' token* '$)' where no token contains an embedded occurrence of '$)' or '$('.

Is that what you are referring to?

jkingdon commented 2 years ago

BTW @jkingdon, which mailing list post were you referring to initially?

I was thinking of https://groups.google.com/g/metamath/c/DDvCuVRev0M/m/fCo-hzvCAwAJ which is I think the message right above the one you linked to, which is almost written in test case format in the sense that it says in that "$( $( $)" returns an error in metamath.c.

benjub commented 2 years ago

So I guess this settles the alternative given in metamath/metamath-book#233 https://github.com/metamath/metamath-book/issues/233. I'll modify mmverify.py to raise on $( $( $) (only one line to add). As for the EBNF grammar with the proposal I described there, I don't remember if we edit metamath.tex or leave it that way and edit the errata.md (or edit both).

To the extent that I was thinking about this at any level deeper than asking what metamath has always (for some value of always) done, I was just thinking that $( inside a comment is usually a mistake. But if people want to change the implementation/test to match the spec in the book, well we could discuss that although I guess the current implementation seems like the practical choice to me.

My proposal above is rather to change the EBNF grammar in the book (and mmverify.py) to match metamath.c. For mmverify.py, I can add that extra one-line check soon. For the book, I'll edit errata.md ? Also metamath.tex or not ?

Also: the EBNF grammar and mmverify.py still won't match exactly the behavior of metamath.c described by @digama0: metamath.c detects opening/closing tags within words, e.g. $( a$( $) will trigger an error. This would need a little bit more change in the EBNF grammar, and the separation lexing/parsing would be less clear, so maybe we could be content with my above proposal and not go further. We could consider metamath.c's behavior as an extra "bonus" check that tells the user he might have forgotten a space. I can implement that check in mmverify.py by simply adding ìf '$(' in tok or '$)' in tok: raise ... in the while-loop of the method readc(). Message ID: @.***>

benjub commented 2 years ago

Fixed mmverify.py to be closer to metamath.c's behavior (see https://github.com/david-a-wheeler/mmverify.py/pull/14). Now, my two questions remain:

digama0 commented 2 years ago

I think that the rule should be the one implemented in metamath.c: a comment must not contain a token that has a $( or $) substring.

benjub commented 2 years ago

I think that the rule should be the one implemented in metamath.c: a comment must not contain a token that has a $( or $) substring.

Ahh, you drive a hard bargain ! But fair enough. What about:

_COMMENT ::= '$(' _WHITECHAR _COMMENT-CONTENT _WHITECHAR '$)' _WHITECHAR
_COMMENT-CONTENT ::= (_WHITECHAR | _PRINTABLE-CHARACTER)* -
      ((_WHITECHAR | _PRINTABLE-CHARACTER)* ('$(' | '$)') (_WHITECHAR | _PRINTABLE-CHARACTER)*)

This looks a bit complex. Or you have another suggestion ? (Rk: if the file ends with a comment, there could be no trailing whitechar... These border cases are always annoying, but maybe there is a more elegant way to deal with them?)

Edit: or maybe,

_COMMENT ::= '$(' (_WHITECHAR+ PRINTABLE-SEQUENCE* _WHITECHAR+ '$)' _WHITECHAR
PRINTABLE-SEQUENCE ::= _PRINTABLE-CHARACTER+ - (_PRINTABLE-CHARACTER* ('$(' | '$)') _PRINTABLE-CHARACTER)*

(instead of the current production rule for PRINTABLE-SEQUENCE)

david-a-wheeler commented 2 years ago

Who knew there was a price for formal exactness :-) ?

david-a-wheeler commented 2 years ago

If you want a change to the spec in the book, that needs to be here: https://github.com/metamath/metamath-book

benjub commented 2 years ago

If you want a change to the spec in the book, that needs to be here: https://github.com/metamath/metamath-book

@david-a-wheeler : As I wrote above, I will do a PR, but as I asked above, I would like to know before if I should modify errata.md or metamath.tex or both ? (I'm going to move the conversation to: https://github.com/metamath/metamath-book/issues/233, or open another issue.)

david-a-wheeler commented 2 years ago

@benjub - at a *minimum, do errata.md. Bonus points if you also edit metamath.tex, but that should probably be a separate PR.

No one expected Norm to pass so soon, so book update plans are, well, unknown. Thankfully the licensing means that we can do an update without licensing problems.

It might be good to have a longer discussion how when/how to update the book.