Closed GoogleCodeExporter closed 9 years ago
Original comment by alr...@google.com
on 4 Aug 2011 at 8:07
[deleted comment]
[deleted comment]
In the grammar the terminal STRING is overridden, but there is validation built
into XText / EMF / Antlr that ignores the actual grammar for String. As such,
the terminal STRING cannot be modified to accomodate escaped unicode characters
- an alternate terminal must be created.
Also, we need to keep the override on the STRING terminal to ensure that it
matches BEFORE the UNICODE_STRING terminal because things such as importURI
require that the terminal be a STRING to work correctly...
My Solution is to modify Protobuf.xtext as follows:
StringRef:
string=UNICODE_STRING | string=STRING;
terminal STRING:
SL_STRING (SL_STRING)*;
terminal SL_STRING:
'"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | '"'))* '"' (WS)*;
terminal UNICODE_STRING:
UNICODE_SL_STRING (UNICODE_SL_STRING)*;
terminal UNICODE_SL_STRING:
'"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\' | UNICODE_OCTAL) | !('\\' | '"'))* '"' (WS)*;
terminal UNICODE_OCTAL:
('0'..'3')('0'..'7')('0'..'7');
Which will allow octal unicode characters to be included in a string.
(on a side note... can anyone enlighten me on the purpose of STRING vs
SL_STRING?)
The UNICODE_STRING terminal must then have syntax highlighting added so that
it's treated just like any other string...
So in the com.google.eclipse.protobuf.ui project the class
com.google.eclipse.protobuf.ui.editor.syntaxcoloring.ProtobufSemanticHighlightin
gCalculator
The following methods need updates...
private void highlightOptions(Field field, IHighlightedPositionAcceptor acceptor) {
for (FieldOption option : field.getFieldOptions()) {
String highlightId = fieldOptions.isDefaultValueOption(option) ? KEYWORD_ID : DEFAULT_ID;
highlightName(option, acceptor, highlightId);
ValueRef ref = option.getValue();
if (ref instanceof LiteralRef) {
highlightFirstFeature(option, FIELD_OPTION__VALUE, acceptor, ENUM_LITERAL_ID);
return;
}
if (ref instanceof NumberRef) {
highlightFirstFeature(option, FIELD_OPTION__VALUE, acceptor, NUMBER_ID);
return;
}
if (ref instanceof StringRef) {
highlightFirstFeature(option, FIELD_OPTION__VALUE, acceptor, STRING_ID);
}
}
}
private void highlight(Option option, IHighlightedPositionAcceptor acceptor) {
highlightName(option, acceptor, DEFAULT_ID);
ValueRef ref = option.getValue();
if (ref instanceof LiteralRef) {
highlightFirstFeature(option, OPTION__VALUE, acceptor, ENUM_LITERAL_ID);
return;
}
if (ref instanceof NumberRef) {
highlightFirstFeature(option, OPTION__VALUE, acceptor, NUMBER_ID);
return;
}
if (ref instanceof StringRef) {
highlightFirstFeature(option, OPTION__VALUE, acceptor, STRING_ID);
}
}
I've attached the modified files and a screenshot of the working octal unicode
strings...
Not sure of any other implications from not using the built-in STRING terminal
rule but I searched around for code references and couldn't find any other than
the lack of highlighting.
PS: I don't know how to run all the tests in the test projects - so I can't
confirm they still all pass with this change.
Original comment by compuwar...@gmail.com
on 4 Aug 2011 at 9:20
Attachments:
I've been looking at the unit test protos using protobuf-dt and noticed that
this likely overlaps with a "failed" test case right now in unittest.proto (and
others).
ByteStrings use escaped bytes but the 'hard coded' EMF STRING type prevents
those from working quite right as well.
unittest.proto line 467
message TestExtremeDefaultValues {
optional bytes escaped_bytes = 1 [default = "\0\001\a\b\f\n\r\t\v\\\'\"\xfe"];
results in a syntax error as the string escaping does not work as expected by
protoc
Original comment by compuwar...@gmail.com
on 25 Aug 2011 at 3:50
The prior suggested solution can be updated to account for the information in
the unit test cases as such:
And UNICODE_SL_STRING needs to be updated (maybe this should be PROTO_SL_STRING
or something better named?)
terminal UNICODE_SL_STRING:
'"' ('\\' ('a' | 'b' | 't' | 'n' | 'f' | 'r' | 'u' | 'v' | 'x' (NUMBER | 'a'..'f' | 'A'..'F') (NUMBER | 'a'..'f' | 'A'..'F') | '"' | "'" | '\\' | UNICODE_OCTAL) | !('\\' | '"'))* '"' (WS)*;
which now includes the escape characters used in the unit test 'a', 'v', and
hexadecimal character notation 'x##' where # is 0-F
Also, the previously suggested solution is flawed in that terminal call
UNICODE_OCTAL as it can match terminals outside the scope of a string.
the correct replacement is to fold that into the UNICODE_SL_STRING terminal as
such:
terminal UNICODE_SL_STRING:
'"' ('\\' ('a' | 'b' | 't' | 'n' | 'f' | 'r' | 'u' | 'v' | 'x' (NUMBER | 'a'..'f' | 'A'..'F') (NUMBER | 'a'..'f' | 'A'..'F') | '"' | "'" | '\\' | ('0'..'3')('0'..'7')('0'..'7')) | !('\\' | '"'))* '"' (WS)*;
Original comment by compuwar...@gmail.com
on 25 Aug 2011 at 4:22
r6a51f6bb3f25
Original comment by alr...@google.com
on 9 Sep 2011 at 8:02
Original issue reported on code.google.com by
compuwar...@gmail.com
on 4 Aug 2011 at 3:07