Implementing the mccode-3
language and targeting the mcstas
and mcxtrace
runtimes.
The traditional McCode
lex|flex
tokenizer and yacc|bison
parser
included in-rule code to implement some language features and called
the code-generator to construct the intermediate instrument source file.
The mixture of language parsing and multiple layers of generated functionality
made understanding the program operation, and debugging introduced errors,
difficult.
Worst of all, there is no easy-to-use tooling available to help the programmer
identify syntax errors on-the-fly.
This project reimplements the McCode
languages using ANTLR4
which both tokenizes and parses the language into a recursive descent parse tree.
ANTLR
can include extra in-rule parsing code, but since it can produce output
suited for multiple languages (and the extra code must be in the targeted language)
this feature is not implemented in this project.
Other benefits of ANTLR4
include integration with Integrated Development Environments,
including the freely available Community edition of PyCharm from JetBrains.
IDE integration can identify syntax mistakes in the language grammar files,
plus help to understand and debug language parsing.
Traditionally, McCode
identifies as a single language able to read, parse, and construct
programs to perform single particle statistical ray tracing.
While McCode-3
uses a single language.l
and language.y
file pair for lexing and parsing,
it actually implemented at least two related languages: one for component definitions in .comp
files,
one for instrument definitions in .instr
files,
plus arguably more for other specialised tasks.
This project makes use of ANTLR
's language dependency feature to separate the languages
into McComp
for components and McInstr
for instruments, with common language features
defined in a McCommon
grammar.
For use with the McCode
runtimes (McStas
and McXtrace
), the input languages must be
translated to C
following the C99
standard.
This translation was previously performed in C
since the lex|flex
, yacc|bison
workflow produced programs written in C
.
The C
programming language is a very good choice where execution speed is important,
as in the McCode
runtimes, but less so if speed is not the main goal and memory safety
or cross-platform development is important.
The McCode-3
translators do not always deallocate memory used in their runtime,
and newly developed features are likely to introduce unallocated, out-of-bounds, or double-free
memory errors which are then difficult to track down.
ANTLR4
is a Java
program, but produces parse-trees in multiple languages.
This project uses the Python
target so that language-translation can proceed in a language
which is well suited to new-feature development, while removing memory handling issues and
making cross-platform development significantly easier.
Install the latest development version from GitHub with
$ python -m pip install git+https://github.com/McStasMcXtrace/mccode-antlr.git
or the latest release from PyPI with
$ python -p pip install mccode_antlr
The mccode-antlr
package provides a command-line interface to the McCode
language parsers
and translators. To avoid shadowing the McCode-3
translators, the command-line interface
are suffixed with -antlr
.
$ mcstas-antlr --help
$ mcxtrace-antlr --help