microsoft / Trieste

A term rewriting system for experimental programming language development.
MIT License

Added Shrubbery parser as a sample #110

Open EliasC opened 4 months ago

EliasC commented 4 months ago

This PR adds a sample that parses Shrubbery Notation. It complements the infix example by having a slightly more involved parsing pass (whereas the three rewriting passes are quite straightforward). In particular, it gives an example of handling indentation sensitivity. There is a README that gives an overview of Shrubbery parsing and the implementation. The source code has explanatory comments, but assumes that the reader has done the infix tutorial and therefore understands the basics of parsing and pattern matching.

The implementation relies on #90.
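
For readers unfamiliar with indentation-sensitive parsing, the general idea can be sketched in a few lines of standalone C++. This is not the code from the PR and does not use the Trieste API: `layout_tokens` is a hypothetical helper, the `INDENT`/`DEDENT` marker names are made up for illustration, and the rules are far simpler than those of Shrubbery Notation. It only shows the common technique of keeping a stack of open indentation columns.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Trieste or this PR: converts leading-space
// counts into explicit "INDENT"/"DEDENT" markers, roughly the way an
// indentation-sensitive parsing pass opens and closes nested groups.
std::vector<std::string> layout_tokens(const std::vector<std::string>& lines)
{
  std::vector<std::string> out;
  std::vector<std::size_t> stack{0}; // indentation columns currently open

  for (const auto& line : lines)
  {
    std::size_t col = line.find_first_not_of(' ');
    if (col == std::string::npos)
      continue; // blank lines do not affect layout

    if (col > stack.back())
    {
      // Deeper than the innermost open level: open a new one.
      stack.push_back(col);
      out.push_back("INDENT");
    }
    while (col < stack.back())
    {
      // Shallower: close every level that is deeper than this line.
      stack.pop_back();
      out.push_back("DEDENT");
    }
    // Simplification: a dedent to a column that was never opened is not
    // reported as an error here.
    out.push_back(line.substr(col));
  }

  // Close whatever is still open at the end of the input.
  while (stack.size() > 1)
  {
    stack.pop_back();
    out.push_back("DEDENT");
  }
  return out;
}
```

A real pass would also have to handle tabs, continuation lines, and the notation's own grouping operators, which this sketch deliberately leaves out.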

EliasC commented 4 months ago

@microsoft-github-policy-service agree

matajoh commented 3 months ago

Small question: where do the test files come from? If they have been copied in from another repo, could we potentially clone that repo as part of the build, to avoid having all of these in the repo? Not a big deal if not, just curious. You can see an example of what I'm talking about in the parser directory:

if(TRIESTE_BUILD_PARSER_TESTS)
    enable_testing()
    add_subdirectory(test)

    if(NOT IS_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/JSONTestSuite)
        execute_process(COMMAND ${GIT_EXECUTABLE} clone --depth=1 https://github.com/nst/JSONTestSuite
                        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
                        OUTPUT_QUIET)
    endif()

    if(NOT IS_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/yaml-test-suite)
        execute_process(COMMAND ${GIT_EXECUTABLE} clone --depth=1 --branch data-2022-01-17 https://github.com/yaml/yaml-test-suite
                        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
                        OUTPUT_QUIET)
    endif()
endif()

It would also be neat for these tests to be added either via a test driver like json_test.cc or by adding them in CMake:

add_test(NAME json_trieste COMMAND json_trieste test -f WORKING_DIRECTORY $<TARGET_FILE_DIR:json_trieste>)

Anything to have more robust CI :)

EliasC commented 3 months ago

> Small question: where do the test files come from? If they have been copied in from another repo, could we potentially clone that repo as part of the build, to avoid having all of these in the repo?

I wrote them myself. Some of them come from the documentation of Shrubbery Notation. I have been looking for an existing test suite without success (there would also need to be a filtering process, since I don't support all of Shrubbery Notation).

matajoh commented 3 months ago

> > Small question: where do the test files come from? If they have been copied in from another repo, could we potentially clone that repo as part of the build, to avoid having all of these in the repo?
>
> I wrote them myself. Some of them come from the documentation of Shrubbery Notation. I have been looking for an existing test suite without success (there would also need to be a filtering process, since I don't support all of Shrubbery Notation).

OK, in that case I suppose we can put them here for now. One potential improvement down the line would be to put them in a YAML file and write a test driver to process them.
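
As a rough illustration of what such a test driver could look like, here is a minimal table-driven sketch in standalone C++. Everything in it is an assumption made for illustration: the `Case` struct, the `run_cases` function, and the `parses` callback are hypothetical names, the cases are given inline rather than loaded from YAML, and the callback stands in for whatever entry point the real parser exposes.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical test-case record: in a real driver these fields might be
// loaded from a YAML file instead of being written inline.
struct Case
{
  std::string name;
  std::string input;
  bool should_parse; // expected outcome for this input
};

// Runs every case through `parses` (a stand-in for the real parser entry
// point) and returns the number of cases whose outcome was unexpected.
int run_cases(
  const std::vector<Case>& cases,
  const std::function<bool(const std::string&)>& parses)
{
  int failures = 0;
  for (const auto& c : cases)
  {
    bool ok = parses(c.input);
    if (ok != c.should_parse)
    {
      std::cerr << "FAIL: " << c.name << " (expected "
                << (c.should_parse ? "success" : "failure") << ")\n";
      ++failures;
    }
  }
  return failures;
}
```

Returning the failure count lets a `main` function use it as the process exit code, which is what CTest checks when the driver is registered with `add_test`.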

EliasC commented 2 months ago

@matajoh I have addressed most of your comments now.

I also added a final rewriting pass that takes the tree to a shape that matches the grammar of Shrubbery Notation (this is presented in shrubbery.h). Later it might make sense to add a writer that outputs the parsed program as an S-expression (or indeed as Shrubbery Notation), but for now I think it works as an example to learn from.
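
To make the S-expression idea concrete, here is a minimal sketch of such a writer over a plain tree. It is an assumption-laden illustration rather than a proposal for the actual implementation: `Tree` is a hypothetical stand-in for the parsed tree (it is not Trieste's node type), and `to_sexpr` is a made-up name.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed tree; not Trieste's node type.
struct Tree
{
  std::string label;
  std::vector<Tree> children;
};

// Renders a tree as an S-expression. A node without children is written as
// its bare label; any other node becomes "(label child1 child2 ...)".
std::string to_sexpr(const Tree& t)
{
  if (t.children.empty())
    return t.label;

  std::string s = "(" + t.label;
  for (const auto& child : t.children)
    s += " " + to_sexpr(child);
  return s + ")";
}
```

A writer for the real tree would also need to decide how to print token text and source locations, which this sketch ignores.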