Closed mingodad closed 1 year ago
Trying to rebuild the parser, I noticed that it doesn't build because of return new StructuredQName("p", "de/bottlecaps/railroad/core/Parser", functionName());}:
--- rr0/src/main/java/de/bottlecaps/railroad/core/Parser.java
+++ rr/parser/Parser.java
@@ -1,4 +1,4 @@
-// This file was generated on Tue Jan 31, 2023 11:09 (UTC+01) by REx v5.56 which is Copyright (c) 1979-2023 by Gunther Rademacher <grd@gmx.net>
+// This file was generated on Tue Jan 31, 2023 10:48 (UTC+01) by REx v5.56 which is Copyright (c) 1979-2023 by Gunther Rademacher <grd@gmx.net>
// REx command line: Parser.ebnf -java -tree -saxon10 -name de.bottlecaps.railroad.core.Parser
package de.bottlecaps.railroad.core;
@@ -307,7 +307,7 @@
abstract Sequence execute(XPathContext context, String input) throws XPathException;
@Override
- public StructuredQName getFunctionQName() {return new StructuredQName("p", "de/bottlecaps/railroad/core/Parser", functionName());}
+ public StructuredQName getFunctionQName() {return new StructuredQName("p", "Parser", functionName());}
@Override
public SequenceType[] getArgumentTypes() {return new SequenceType[] {SequenceType.SINGLE_STRING};}
@Override
After manually replacing "de/bottlecaps/railroad/core/Parser" with "Parser", it builds.
You are referring to the XQuery grammar. Yes, XQuery provides escaping for single and double quotes. But that is XQuery, not grammars.
The relevant W3C definition for EBNF is this:
"string" matches the sequence of characters that appear inside the double quotes.
'string' matches the sequence of characters that appear inside the single quotes.
RR's grammar syntax here matches that definition. It is also aligned with the syntax of REx. There are no plans to extend it.
If you really need to combine single and double quotes as content in a single terminal, please define a lexical rule for it, for example:
QuotedApostrophe
::= '"' "'" '"'
/* ws:explicit */
Also see #6.
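For generated grammars, the same idea can be applied mechanically: split the text into a sequence of terminals so that no single terminal contains both quote characters. The following is only a sketch; QuoteSplitter and toTerminals are hypothetical names, not part of RR or REx.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of RR: splits arbitrary text into EBNF
// terminals so that no single terminal contains both quote characters.
final class QuoteSplitter {

    // Quote one run with whichever delimiter it does not contain.
    private static String quote(String run) {
        char q = run.indexOf('\'') >= 0 ? '"' : '\'';
        return q + run + q;
    }

    static List<String> toTerminals(String text) {
        List<String> terminals = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        boolean hasSingle = false, hasDouble = false;
        for (char c : text.toCharArray()) {
            // Start a new run when this character would mix both quote kinds.
            if ((c == '\'' && hasDouble) || (c == '"' && hasSingle)) {
                terminals.add(quote(run.toString()));
                run.setLength(0);
                hasSingle = hasDouble = false;
            }
            hasSingle |= c == '\'';
            hasDouble |= c == '"';
            run.append(c);
        }
        if (run.length() > 0) terminals.add(quote(run.toString()));
        return terminals;
    }

    public static void main(String[] args) {
        // Joins the terminals into the right-hand side of a lexical rule.
        System.out.println(String.join(" ", toTerminals("/['\"]/")));
    }
}
```

The joined terminals are meant to go into a rule marked /* ws:explicit */, like QuotedApostrophe above. Each terminal is still a separate item in the grammar, so the diagram shows a sequence rather than one box.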
Thanks for the reply! I reported the issue with rebuilding the parser incorrectly: it does build, but when trying to run it I'm getting:
java -jar rr.war test-dq.ebnf > test-dq.ebnf.xhtml
Static error on line 43 column 39 of basic-interface.xq:
XPST0017 Cannot find a 1-argument function named Q{Parser}parse-Grammar()
Exception in thread "main" java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at de.bottlecaps.fatjar.Loader.main(Loader.java:70)
Caused by: net.sf.saxon.s9api.SaxonApiException: Cannot find a 1-argument function named Q{Parser}parse-Grammar()
at net.sf.saxon.s9api.XQueryCompiler.compile(XQueryCompiler.java:566)
at de.bottlecaps.railroad.RailroadGenerator.generate(RailroadGenerator.java:120)
at de.bottlecaps.railroad.Railroad.main(Railroad.java:239)
... 5 more
Caused by: net.sf.saxon.trans.XPathException: Cannot find a 1-argument function named Q{Parser}parse-Grammar()
at net.sf.saxon.query.UnboundFunctionLibrary.bindUnboundFunctionReferences(UnboundFunctionLibrary.java:178)
at net.sf.saxon.query.QueryModule.bindUnboundFunctionCalls(QueryModule.java:1178)
at net.sf.saxon.expr.instruct.Executable.fixupQueryModules(Executable.java:437)
at net.sf.saxon.query.XQueryParser.makeXQueryExpression(XQueryParser.java:177)
at net.sf.saxon.query.StaticQueryContext.compileQuery(StaticQueryContext.java:568)
at net.sf.saxon.s9api.XQueryCompiler.compile(XQueryCompiler.java:562)
... 7 more
I'm trying to generate EBNF grammars from tree-sitter grammars and convert the patterns/regexps to strings for showing them on railroad diagrams, and they can have single/double quotes inside them, like:
/"(""|[^"])*"/
/([^\s\\.\"\(\)\{\}@\'\\_]|\\[^\sa-zA-Z]|_[^\s;\.\"\(\)\{\}@])[^\s;\.\"\(\)\{\}@]*/
/(([^\s;\.\"\(\)\{\}@\'\\_]|\\[^\sa-zA-Z]|_[^\s;\.\"\(\)\{\}@])[^\s;\.\"\(\)\{\}@]*\.)*([^\s;\.\"\(\)\{\}@\'\\_]|\\[^\sa-zA-Z]|_[^\s;\.\"\(\)\{\}@])[^\s;\.\"\(\)\{\}@]*/
/[^#'"<>{}\[\]()`$|&;\\\s]/
/"([^"\\]|\\.)*"|'([^'\\]|\\.)*'/
/['"]/
/[^;\\'"]/
/[^()#"\\']/
/\\(u\{[0-9A-Fa-f]{4,6}\}|[nrt\"'\\])/
/\\(u\{[^}]*\}|[^nrt\"'\\])/
/([^?# \n\s\f()\[\]'`,\\";]|\\.)([^# \n\s\f()\[\]'`,\\";]|\\.)*/
...
I'm not sure if QuotedApostrophe ::= '"' "'" '"' /* ws:explicit */ is enough. Can you give a working rr/src/main/java/de/bottlecaps/railroad/core/Parser.ebnf with the changes to allow it?
I've tried these changes; it builds and runs, but doesn't show the expected output:
StringLiteral ::= '"' ('""' | [^"#x9#xA#xD])* '"'
| "'" ("''" | [^'#x9#xA#xD])* "'"
Thanks for the reply! I reported the issue with rebuilding the parser incorrectly: it does build, but when trying to run it I'm getting: ...
Thanks for letting me know. Now fixed with 517c1f934faa6a2d6bb93d9354f3ce1104d42ab5
I'm testing with this modified grammar, which only adds MixedStringLiteral ::= '"a quoted ''string''"' /* ws: explicit */:
/* extracted from https://www.bottlecaps.de/rr/ui on Tue Jan 31, 2023, 10:03 (UTC+01)
*/
Grammar ::= Production*
Production
::= NCName '::=' ( Choice | Link )
NCName ::= [http://www.w3.org/TR/xml-names/#NT-NCName]
Choice ::= SequenceOrDifference ( '|' SequenceOrDifference )*
SequenceOrDifference
::= (Item ( '-' Item | Item* ))?
Item ::= Primary ( '?' | '*' | '+' )*
Primary ::= NCName | StringLiteral | CharCode | CharClass | '(' Choice ')'
StringLiteral
::= '"' [^"]* '"' | "'" [^']* "'"
/* ws: explicit */
MixedStringLiteral
::= '"a quoted ''string''"'
/* ws: explicit */
CharCode ::= '#x' [0-9a-fA-F]+
/* ws: explicit */
CharClass
::= '[' '^'? ( Char | CharCode | CharRange | CharCodeRange )+ ']'
/* ws: explicit */
Char ::= [http://www.w3.org/TR/xml#NT-Char]
CharRange
::= Char '-' ( Char - ']' )
/* ws: explicit */
CharCodeRange
::= CharCode '-' CharCode
/* ws: explicit */
Link ::= '[' URL ']'
URL ::= [^#x5D:/?#]+ '://' [^#x5D#]+ ('#' NCName)?
/* ws: explicit */
Whitespace
::= S | Comment
S ::= #x9 | #xA | #xD | #x20
Comment ::= '/*' ( [^*] | '*'+ [^*/] )* '*'* '*/'
/* ws: explicit */
With the current rr, it does show the MixedStringLiteral, but with my modified parser grammar:
StringLiteral ::= '"' ('""' | [^"#x9#xA#xD])* '"'
| "'" ("''" | [^'#x9#xA#xD])* "'"
the MixedStringLiteral nonterminal simply disappears from the output without any error message.
I'm not sure if QuotedApostrophe ::= '"' "'" '"' /* ws:explicit */ is enough,
It is definitely enough, as far as the syntax is concerned. But the processing logic needs to be adapted as well, i.e. you need to at least unescape these quotes when getting the literal's net content, and also take care of the escaping, when serializing it back to its quoted representation.
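A minimal sketch of what that adaptation could look like, assuming the doubled-quote convention from the StringLiteral rule proposed above ('' inside '...' and "" inside "..."). LiteralQuoting and its methods are hypothetical names for illustration and are not RR's actual code.

```java
// Hypothetical helper, not RR's actual code: doubled-quote escaping as in
// the modified StringLiteral rule discussed in this thread.
final class LiteralQuoting {

    // Strip the delimiters and collapse each doubled delimiter to one.
    static String unescape(String literal) {
        int last = literal.length() - 1;
        if (last < 1 || literal.charAt(0) != literal.charAt(last)
                || (literal.charAt(0) != '\'' && literal.charAt(0) != '"')) {
            throw new IllegalArgumentException("not a quoted literal: " + literal);
        }
        String q = literal.substring(0, 1);
        return literal.substring(1, last).replace(q + q, q);
    }

    // Serialize content back to a quoted literal. Prefer a delimiter that
    // needs no escaping; double the delimiter only where it is unavoidable.
    static String escape(String content) {
        String q = content.contains("'") && !content.contains("\"") ? "\"" : "'";
        return q + content.replace(q, q + q) + q;
    }

    public static void main(String[] args) {
        String net = unescape("'\"a quoted ''string''\"'");
        System.out.println(net);
        System.out.println(escape(net));
    }
}
```

With this convention, escape followed by unescape returns the original content, which is the property the serialization step depends on.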
The MixedStringLiteral nonterminal simply disappears from the output without any error message.
This is an effect of the Inline literals option.
Thank you again for all your help! But it's strange that you do not recognize the need for a way to escape single/double quotes so that both of them can appear inside strings (it's normally a common feature in any programming language).
Looking at the grammar for strings, there is no escape sequence for strings that contain both single and double quotes, but the W3C grammar does have it: