antlr / antlr-php-runtime

PHP Runtime for ANTLR4
BSD 3-Clause "New" or "Revised" License
84 stars 20 forks source link
antlr-runtime antlr4 parser-generator php8

ANTLR4 Runtime for PHP

Build Latest Stable Version License

First steps

1. Install ANTLR4

The getting started guide should get you started.

2. Install the PHP ANTLR runtime

Each target language for ANTLR has a runtime package for running parser generated by ANTLR4. The runtime provides a common set of tools for using your parser.

Install the runtime with Composer:

composer require antlr/antlr4-php-runtime

3. Generate your parser

You use the ANTLR4 "tool" to generate a parser. These will reference the ANTLR runtime, installed above.

Suppose you're using a UNIX system and have set up an alias for the ANTLR4 tool as described in the getting started guide. To generate your PHP parser, run the following command:

antlr4 -Dlanguage=PHP MyGrammar.g4

For a full list of antlr4 tool options, please visit the tool documentation page.

Complete example

Suppose you're using the JSON grammar from https://github.com/antlr/grammars-v4/tree/master/json.

Then, invoke antlr4 -Dlanguage=PHP JSON.g4. The result of this is a collection of .php files in the parser directory including:

JsonParser.php
JsonBaseListener.php
JsonLexer.php
JsonListener.php

Another common option to the ANTLR tool is -visitor, which generates a parse tree visitor, but we won't be doing that here. For a full list of antlr4 tool options, please visit the tool documentation page.

We'll write a small main func to call the generated parser/lexer (assuming they are separate). This one writes out the encountered ParseTreeContext's:

<?php

namespace JsonParser;

use Antlr\Antlr4\Runtime\CommonTokenStream;
use Antlr\Antlr4\Runtime\Error\Listeners\DiagnosticErrorListener;
use Antlr\Antlr4\Runtime\InputStream;
use Antlr\Antlr4\Runtime\ParserRuleContext;
use Antlr\Antlr4\Runtime\Tree\ErrorNode;
use Antlr\Antlr4\Runtime\Tree\ParseTreeListener;
use Antlr\Antlr4\Runtime\Tree\ParseTreeWalker;
use Antlr\Antlr4\Runtime\Tree\TerminalNode;

final class TreeShapeListener implements ParseTreeListener {
    public function visitTerminal(TerminalNode $node) : void {}
    public function visitErrorNode(ErrorNode $node) : void {}
    public function exitEveryRule(ParserRuleContext $ctx) : void {}

    public function enterEveryRule(ParserRuleContext $ctx) : void {
        echo $ctx->getText();
    }
}

$input = InputStream::fromPath($argv[1]);
$lexer = new JSONLexer($input);
$tokens = new CommonTokenStream($lexer);
$parser = new JSONParser($tokens);
$parser->addErrorListener(new DiagnosticErrorListener());
$parser->setBuildParseTree(true);
$tree = $parser->json();

ParseTreeWalker::default()->walk(new TreeShapeListener(), $tree);

Create a example.json file:

{"a":1}

Parse the input file:

php json.php example.json

The expected output is:

{"a":1}
{"a":1}
"a":1
1

Useful information

The Definitive ANTLR 4 Reference

Programmers run into parsing problems all the time. Whether it’s a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language—ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features.

You can buy the book The Definitive ANTLR 4 Reference at amazon or an electronic version at the publisher's site.

You will find the Book source code useful.

Additional grammars

This repository is a collection of grammars without actions where the root directory name is the all-lowercase name of the language parsed by the grammar.