Closed tomtau closed 8 months ago
This update introduces a comprehensive enhancement across the board for a JSON parser using the `pest` library. It includes a new JSON parser definition, streamlined handling of rule documentation, extended support for grammar and rule documentation, upgraded typed syntax tree functionality, and new traits for improved parsing operations. The changes aim to make the parser more robust, maintainable, and easier to understand.
| Files | Summary |
|---|---|
| `derive/tests/json.pest`, `derive/tests/json.rs` | Introduced a parser for JSON with tests covering basic and structural parsing. |
| `generator/src/...`, `meta/src/...`, `pest/src/...` | Enhanced documentation handling, simplified rule enum generation, and improved typed syntax tree functionality. Added support for grammar and rule documentation. |
| `generator/src/typed/...`, `pest/src/typed/...` | Modified parsing logic with new traits and methods for better parsing operations and reachability analysis. |
| `pest/src/{choice.rs, formatter.rs, lib.rs, sequence.rs}` | Added documentation and attributes, introduced core parts of `pest3`, and updated sequence type generation. |
"In the realm of code, where the parsers play,
A rabbit hopped, crafting JSON's way.
With `pest` in paw, it weaved through the night,
Enhancing each rule, with documentation bright.
Through sequences, choices, and trees it did hop,
Leaving behind code, that's top of the crop!"
Sorry, I'll switch to that branch later.
@tomtau I've cherry-picked all the commits I had on hand. But the module system isn't finished yet, so please merge this later.
@tomtau and could you please explain how the module path (`mod1::mod2::mod3::...`) is mapped into the path in the file system (`dir1/dir2/dir3/...`)?
I found that `grammar.pest` only accepts a module path, and `parser.rs` simply uses the module path as the file path and ignores built-in modules.
And when the grammar parser sees `mod1::mod2`, should it be converted to `./mod1/mod2.pest`? And how could a user access files in ancestor folders, say `../common.pest`?
@TheVeryDarkness the pest3 module system isn't yet properly defined, there are two aspects to it: syntax and semantics.
I saw two proposals for it:
`use "cool.pest"; use "this.pest" as that;`
I think the current pest3 implementation was leaning towards the latter (even though the former was in the original proposal by @dragostis ) which seems fair: it's familiar to Rust users, it doesn't have issues with different system path notations on different operating systems, and can potentially be abstracted to "pest_vm".
> And how could a user access files in ancestor folders, say `../common.pest`?

If we go with that, I guess we can refer to the ancestor modules using `super`, as in Rust?
There are multiple issues related to semantics, off the top of my head (there are likely more):
- `mod {}` syntax and its precedence over the filesystem

> could you please explain how the module path (`mod1::mod2::mod3::...`) is mapped into the path in file system (`dir1/dir2/dir3/...`)? And when the grammar parser sees `mod1::mod2`, should it be converted to `./mod1/mod2.pest`?
I think for this initial prototype, we can go with simple decisions, e.g. when the grammar parser sees `mod1::mod2`, it can convert it to `./mod1/mod2.pest`, scope trivia operators to each module (if one wants to re-use the trivia operator, they can re-import it explicitly), no visibility operators...
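To make the proposed mapping concrete, here is a minimal Rust sketch of how a module path could be turned into a file path. It is not taken from the pest3 code base: the function name `module_path_to_file` is hypothetical, and treating leading `super` segments as "go up one directory" is an assumption based on the suggestion earlier in this thread, not a settled design.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of pest3): maps a module path such as
/// `mod1::mod2` to `<base_dir>/mod1/mod2.pest`.
///
/// Assumption: leading `super` segments step up one directory each, so
/// `super::common` maps to `<base_dir>/../common.pest`.
/// Returns `None` for empty segments or a `super` that is not leading.
fn module_path_to_file(base_dir: &Path, module_path: &str) -> Option<PathBuf> {
    let segments: Vec<&str> = module_path.split("::").collect();
    // The last segment names the grammar file; the rest are directories.
    let (file_stem, dirs) = segments.split_last()?;
    if file_stem.is_empty() || *file_stem == "super" {
        return None;
    }

    let mut path = base_dir.to_path_buf();
    let mut leading = true; // still inside the run of leading `super`s
    for dir in dirs {
        match *dir {
            "" => return None,
            "super" if leading => path.push(".."),
            "super" => return None,
            name => {
                leading = false;
                path.push(name);
            }
        }
    }
    path.push(format!("{file_stem}.pest"));
    Some(path)
}

fn main() {
    // Resolve relative to the directory of the importing grammar file.
    let base = Path::new(".");
    println!("{:?}", module_path_to_file(base, "mod1::mod2"));
    println!("{:?}", module_path_to_file(base, "super::common"));
}
```

Keeping module paths as the only user-facing syntax means the separator handling stays inside `PathBuf`, which is one way to avoid the operating-system path notation issues mentioned above.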
@TheVeryDarkness I rebase-merged this one, so feel free to start a new branch off the latest master!
Sorry, I'm busy these weeks :( Maybe let me implement that feature this or next weekend.
No worries!
@TheVeryDarkness I previously merged that original PR by rebasing it, because I wanted to preserve a linear history without merge commits... but that made a merge conflict between those two master branches, so I created a new branch off the current master (`update-fixes`) and cherry-picked the new commits from your master. How do you prefer to continue working on it going forward? Continue on that diverged master and I'll cherry-pick the new commits, or work directly on a branch?

Summary by CodeRabbit
New Features
- `parse_with` to `parse_with_partial` and added `check_with_partial` in `repetition.rs`.
- `try_parse_with_partial`, `try_parse_partial`, `check_with_partial`, `check_partial`, `parse_partial` to `TypedNode`.
- `FullRuleStruct` trait for rules with full capacity.
- `PairContainer` methods for handling pairs and children pairs.
- `PairTree` trait extending `PairContainer` with pair tree operation methods.
- `TypedParser` trait for typed syntax tree production with parsing methods.

Enhancements
- `box_all_rules` configuration flag for controlling rule boxing behavior in the parser.

Refactor

Documentation

Bug Fixes