Closed masatake closed 2 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 85.42%. Comparing base (
d8ffdbd
) to head (8778fd3
).
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
~See the commit log; -j1 is needed.~
@dolik-rce, I got an interesting result.
When turning on Pegof, all cases testing the Thrift peg-based parser fail.
You can run the test locally:
bash autogen.sh
./configure --with-pegof=../build/pegof
make
make units LANGUAGES=Thrift
You can compare Units/parser-thrift.r/simple-thrift.d/expected.tags and Units/parser-thrift.r/simple-thrift.d/FILTERED.tmp.
peg file optimized by Pegof can be found in peg/thrift.pego.
Most definitely a bug in pegof. I'll try to investigate, but it's probably going to take few days at least, I won't have much time available next week.
I've been able to pinpoint the optimization that causes the problem (or maybe just first of the problems). It happens when Function
rule is inlined:
--- peg/thrift.peg 2024-06-30 14:28:20.400662868 +0200
+++ peg/thrift.peg 2024-06-30 14:28:20.400662868 +0200
@@ -148,32 +148,33 @@
tagEntryInfo *e = getEntryInCorkQueue (r);
if (e)
e->extensionFields.inheritance = eStrdup ($2); } __
- )? __ "{" __ (Function __)* (
- "}"
- / EndOfServiceError
- ) _ TypeAnnotations? EOS { POP_SCOPE (auxil); }
-
-EndOfServiceError <- .
-
-Function <-
- ("oneway" __)? <
- "void"
- / FieldType
-> __ <Identifier> { int r = makeThriftTag (auxil, $2, $2s, K_FUNCTION, true);
+ )? __ "{" __ (
+ (
+ ("oneway" __)? <
+ "void"
+ / FieldType
+ > __ <Identifier> { int r = makeThriftTag (auxil, $4, $4s, K_FUNCTION, true);
tagEntryInfo *e = getEntryInCorkQueue (r);
if (e)
{
e->extensionFields.typeRef[0] = eStrdup ("typename");
- e->extensionFields.typeRef[1] = eStrdup ($1);
+ e->extensionFields.typeRef[1] = eStrdup ($3);
}
PUSH_KIND (auxil, K_PARAMETER); } _ <"(" __ FieldList ")"> { int r = BASE_SCOPE (auxil);
tagEntryInfo *e = getEntryInCorkQueue (r);
if (e)
- e->extensionFields.signature = eStrdup ($3); } __ (
- "throws" { PUSH_KIND (auxil, K_THROWSPARAM); } __ <"(" __ FieldList ")"> { int r = BASE_SCOPE (auxil);
- attachParserFieldToCorkEntry (r, ThriftFields[F_THROWS].ftype, $4);
+ e->extensionFields.signature = eStrdup ($5); } __ (
+ "throws" { PUSH_KIND (auxil, K_THROWSPARAM); } __ <"(" __ FieldList ")"> { int r = BASE_SCOPE (auxil);
+ attachParserFieldToCorkEntry (r, ThriftFields[F_THROWS].ftype, $6);
POP_KIND (auxil, false); }
- )? _ TypeAnnotations? ListSeparator? { POP_KIND (auxil, true); }
+ )? _ TypeAnnotations? ListSeparator? { POP_KIND (auxil, true); }
+ ) __
+ )* (
+ "}"
+ / EndOfServiceError
+ ) _ TypeAnnotations? EOS { POP_SCOPE (auxil); }
+
+EndOfServiceError <- .
FieldType <-
(
I'm still not sure why, the result looks good to me, even the references seem to be correctly renumbered.
The parser extract "a.thrift" from
include "a.thrift"
. However, it didn't.
The parser extract "a.thrift" from
include "a.thrift"
However, it didn't.
The include paths must be specified, fot include statement to work correctly (unless it is in one of the default directories), similar to packcc
. See https://github.com/dolik-rce/pegof?tab=readme-ov-file#inputoutput-options, option -I/--import
, or specify PCC_IMPORT_PATH
variable. This should work.
By the way: Pegof can be also used as drop-in replacement for packcc. When you run pegof -p -O all ...
it optimizes the grammar and generates the header and source files for the optimized parser (via the embeded packcc code). It might be simpler to implement this into the ctags build system, as it requires no additional steps and preprocessed files. On the other hand, it might be more difficult to debug.
I didn't write about "include" or "import" feature of packcc or pegof. I wrote about the ability of the code optimized by pegof.
input.thrift:
include "a.thrift"
The original thrift parser can extract "a.thrift" as a reference tag. ~However, the optimized version cannot do it.~ My mistake. the optimized version can also extract it.
I was looking for smaller input with that the optimized version doesn't work well. Smaller input makes debugging easier.
$ ctags --list-features| grep pegof
$ cat /tmp/input.thrift
include "a.thrift"
$ ctags -o - --extras=+r input.thrift
a.thrift input.thrift /^include "a.thrift"$/;" T
$~/bin/ctags --list-features| grep pegof
pegof makes peg based parser(s) optimized (experimental)
$ ~/bin/ctags -o - --extras=+r input.thrift
a.thrift input.thrift /^include "a.thrift"$/;" T
I will report a smaller input reproducing the issue here.
The problem should be fixed now (in https://github.com/dolik-rce/pegof/commit/2ba466796bf372a3151b7c84e2c1bba8f5225537). All unit tests in this branch are passing with pegof 0.4.2.
@dolik-rce, thank you very much.
I added a document about how to use pegof to build ctags.
Pegof is a PEG grammar optimizer and formatter developed at https://github.com/dolik-rce/pegof.
This change is for utilizing pegof for optimizing our peg based parsers.
We assume pegof is installed at ${somewhere}/pegof.
$ ./autogen.sh $ ./configure ... --with-pegof=${somewhere}/pegof ... $ make
To verify the command line invoking pegof, pass V=1 option to make like:
$ make V=1
Makefile runs pegof to generate a .pego file from the given .peg file. Then, it runs packcc to generate .c and .h files from the .pego file. Via make command, you can generate a specified pego file directly. e.g.
$ make V=1 peg peg/varlink.pego
This may be helpful for debugging pegof.
If you successfully build ctags with pegof, you will see "pegof" in the output of ./ctags—-list-features.
TODO: