LaCASt - A LaTeX Translator for Computer Algebra Systems
LaCASt is the first context-aware translator for mathematical LaTeX expressions. LaCASt includes natural language processing to analyze textual contexts, custom semantic LaTeX parser to analyze math inputs, and CAS interfaces (currently Maple and Mathematica) to compute and verify translated expressions automatically.
Publications
If you want to reference this tool in general, please cite the most recent publication from TPAMI 2023.
If you want to refer to the automatic evaluations only, cite the second most recent publication, from TACAS 2022.
A. Greiner-Petter, M. Schubotz, C. Breitinger, P. Scharpf, A. Aizawa, B. Gipp (2023) "Do the Math: Making Mathematics in Wikipedia Computable". In TPAMI 2023: 4384-4395
```bibtex
@Article{GreinerPetter23,
author = {Andr{\'{e}} Greiner{-}Petter and
Moritz Schubotz and
Corinna Breitinger and
Philipp Scharpf and
Akiko Aizawa and
Bela Gipp},
title = {Do the Math: Making Mathematics in Wikipedia Computable},
journal = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
volume = {45},
number = {4},
pages = {4384--4395},
year = {2023},
url = {https://doi.org/10.1109/TPAMI.2022.3195261},
doi = {10.1109/TPAMI.2022.3195261},
timestamp = {Mon, 28 Aug 2023 21:37:38 +0200},
biburl = {https://dblp.org/rec/journals/pami/GreinerPetterSBSAG23.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
A. Greiner-Petter, H. S. Cohl, A. Youssef, M. Schubotz, A. Trost, R. Dey, A. Aizawa, B. Gipp (2022) "Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems". In TACAS 2022: 87-105
```bibtex
@InProceedings{Greiner-PetterC22,
author = {Andr{\'{e}} Greiner{-}Petter and
Howard S. Cohl and
Abdou Youssef and
Moritz Schubotz and
Avi Trost and
Rajen Dey and
Akiko Aizawa and
Bela Gipp},
title = {Comparative Verification of the Digital Library of Mathematical Functions
and Computer Algebra Systems},
booktitle = {Tools and Algorithms for the Construction and Analysis of Systems
- 28th International Conference, {TACAS} 2022, Held as Part of the
European Joint Conferences on Theory and Practice of Software, {ETAPS}
2022, Munich, Germany, April 2-7, 2022, Proceedings, Part {I}},
series = {Lecture Notes in Computer Science},
volume = {13243},
pages = {87--105},
publisher = {Springer},
year = {2022},
url = {https://doi.org/10.1007/978-3-030-99524-9\_5},
doi = {10.1007/978-3-030-99524-9\_5}
}
```
A. Greiner-Petter, M. Schubotz, H. S. Cohl, B. Gipp (2019) "Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems". In: Aslib Journal of Information Management. 71(3): 415-439
```bibtex
@Article{Greiner-Petter19,
author = {Andr{\'{e}} Greiner{-}Petter and
Moritz Schubotz and
Howard S. Cohl and
Bela Gipp},
title = {Semantic preserving bijective mappings for expressions involving special
functions between computer algebra systems and document preparation
systems},
journal = {Aslib Journal of Information Management},
volume = {71},
number = {3},
pages = {415--439},
year = {2019},
url = {https://doi.org/10.1108/AJIM-08-2018-0185},
doi = {10.1108/AJIM-08-2018-0185}
}
```
H. S. Cohl, A. Greiner-Petter, M. Schubotz (2018) "Automated Symbolic and Numerical Testing of DLMF Formulae Using Computer Algebra Systems". In: CICM: 39-52
```bibtex
@InProceedings{Cohl18,
author = {Howard S. Cohl and
Andr{\'{e}} Greiner{-}Petter and
Moritz Schubotz},
title = {Automated Symbolic and Numerical Testing of {DLMF} Formulae Using
Computer Algebra Systems},
booktitle = {Intelligent Computer Mathematics - 11th International Conference,
{CICM} 2018, Hagenberg, Austria, August 13-17, 2018, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {11006},
pages = {39--52},
publisher = {Springer},
year = {2018},
url = {https://doi.org/10.1007/978-3-319-96812-4\_4},
doi = {10.1007/978-3-319-96812-4\_4}
}
```
H. S. Cohl, M. Schubotz, A. Youssef, A. Greiner-Petter, J. Gerhard, B. V. Saunders, M. A. McClain, J. Bang, K. Chen (2017) "Semantic Preserving Bijective Mappings of Mathematical Formulae Between Document Preparation Systems and Computer Algebra Systems". In: CICM: 115-131
```bibtex
@InProceedings{Cohl17,
author = {Howard S. Cohl and
Moritz Schubotz and
Abdou Youssef and
Andr{\'{e}} Greiner{-}Petter and
J{\"{u}}rgen Gerhard and
Bonita V. Saunders and
Marjorie A. McClain and
Joon Bang and
Kevin Chen},
title = {Semantic Preserving Bijective Mappings of Mathematical Formulae Between
Document Preparation Systems and Computer Algebra Systems},
booktitle = {Intelligent Computer Mathematics - 10th International Conference,
{CICM} 2017, Edinburgh, UK, July 17-21, 2017, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {10383},
pages = {115--131},
publisher = {Springer},
year = {2017},
url = {https://doi.org/10.1007/978-3-319-62075-6\_9},
doi = {10.1007/978-3-319-62075-6\_9}
}
```
How to use our program
The following provides a high-level introduction on how to use the JARs and LaCASt in general. If you want to dive into the source code, we advise you to check our contribution guidelines first for more details on the structure.
The `bin` directory contains several executable jars. All of these programs require the `lacast.config.yaml` file.
Copy `config/template-lacast.config.yaml` to the main directory and rename it to `lacast.config.yaml`. Afterward,
update the entries in the copied file to the values that apply to your setup.
LaCASt tries to load the config by following these rules:
- The system variable `LACAST_CONFIG` specifies the config location, e.g., `export LACAST_CONFIG="path/to/lacast.config.yaml"`.
- The config file is in the current working directory.
- The default config is loaded from the internal resources in the jar, see the default config in `interpreter.common/src/main/resources/`.

If none of the rules above point to a valid config, LaCASt stops with an error.
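The lookup rules above can be sketched as a small shell function. This is only an illustration of the described behavior, not LaCASt's actual loading code: the function name `resolve_lacast_config` is hypothetical, and the sketch assumes that the rules are tried from top to bottom and that an invalid `LACAST_CONFIG` falls through to the next rule.

```shell
# Illustrative sketch of the config lookup; not LaCASt's real implementation.
resolve_lacast_config() {
  # Rule 1: LACAST_CONFIG is used if it points to an existing file.
  if [ -n "${LACAST_CONFIG:-}" ] && [ -f "$LACAST_CONFIG" ]; then
    echo "$LACAST_CONFIG"
  # Rule 2: otherwise, look for the file in the current working directory.
  elif [ -f "./lacast.config.yaml" ]; then
    echo "$PWD/lacast.config.yaml"
  # Rule 3: otherwise, the default config bundled in the jar is used.
  else
    echo "<default config inside the jar>"
  fi
}

# Show which config would be picked up from the current directory.
resolve_lacast_config
```

Running this in the directory from which you start a jar gives a quick hint about which config file is likely to be loaded.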
LaCASt contains several executable JARs as standalone applications. The following list explains the functionality of each JAR in more detail.
`latex-to-cas-converter.jar`: The forward translator (LaTeX -> CAS)
---
The executable jar for the translator can be found in the `bin` subdirectory. A standalone version can be found in the `bin/*.zip` file. Unzip the archive wherever you want and run the jar from the root folder of the repository:
```shell script
java -jar bin/latex-to-cas-converter.jar
```
Without additional flags, the jar runs as an interactive program. You can start the program to directly trigger
the translation process or set further flags (every flag is optional):
* `-CAS=`: Sets the computer algebra system you want to translate to, e.g., `-CAS=Maple` for Maple.
* `-Expression=""`: Sets the expression you want to translate. Double quotation marks are mandatory.
* `--clean` or `-c`: Only returns the translated expression without any other information (since v1.0.1).
* `--debug` or `-d`: Returns extra information for debugging, such as computation time and list of elements (`--clean` overrides this setting).
* `--extra` or `-x`: Shows further information about the translation of functions, e.g., branch cuts, DLMF links, and more (`--clean` overrides this setting).
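As a minimal sketch, the flags above can be combined into a non-interactive call. The helper name `translate` is hypothetical and `\cos{x}` is an arbitrary example expression; only the flags themselves come from the list above.

```shell
# Hypothetical helper: translate one LaTeX expression and print only the result.
translate() {
  # $1: target CAS (e.g., Maple), $2: LaTeX expression
  java -jar bin/latex-to-cas-converter.jar -CAS="$1" -Expression="$2" --clean
}

# Only call the translator if the jar is present (run from the repository root).
if [ -f bin/latex-to-cas-converter.jar ]; then
  translate Maple "\cos{x}"
fi
```

Replacing `--clean` with `--debug` or `--extra` in the helper should return the additional information described above.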
---
`lexicon-creator.jar`: Maintain the translation dictionary
---
This jar is used to maintain the internal translation dictionaries. Once a translation pattern is defined in the CSV files, it must be transformed into the dictionaries. The typical workflow is:
```shell script
andre@agp:~$ java -jar bin/lexicon-creator.jar
Welcome, this converter translates given CSV files to lexicon files.
You didn't specified CSV files (do not add DLMFMacro.csv).
Add a new CSV file and hit enter or enter '-end' to stop the adding process.
all
Current list: [CAS_Maple.csv, CAS_Mathematica.csv]
-end
```
---
`maple-translator.jar`: The backward translator for Maple (Maple -> Semantic LaTeX)
---
This jar requires a licensed Maple installation on the machine! To start the translator,
you have to set the environment variables needed to run Maple properly (see [Building and Running a Java OpenMaple Application](https://de.maplesoft.com/support/help/maple/view.aspx?path=OpenMaple%2fJava%2frunning)).
In my case, Maple is installed in `/opt/maple2019` and I'm on a Linux machine, which requires setting `MAPLE` and `LD_LIBRARY_PATH`.
In addition, you have to provide a larger thread stack size via `-Xss50M`, otherwise Maple crashes. Here is an example:
```shell script
andre@agp:~$ export MAPLE="/opt/maple2019"
andre@agp:~$ export LD_LIBRARY_PATH="/opt/maple2019/bin.X86_64_LINUX"
andre@agp:~$ java -Xss50M -jar bin/maple-translator.jar
```
To get the Maple paths, you can start Maple and enter the following commands:
```
kernelopts( bindir ); <- returns
kernelopts( mapledir ); <- returns
```
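Because a missing variable typically only shows up once Maple fails to start, a small pre-flight check can help. The following is a sketch under the assumption of a Linux setup like the one above; the function name `check_maple_env` is hypothetical and not part of LaCASt.

```shell
# Hypothetical pre-flight check for the Maple-related environment variables.
check_maple_env() {
  [ -n "${MAPLE:-}" ] || { echo "MAPLE is not set" >&2; return 1; }
  [ -n "${LD_LIBRARY_PATH:-}" ] || { echo "LD_LIBRARY_PATH is not set" >&2; return 1; }
  [ -d "$MAPLE" ] || { echo "MAPLE is not a directory: $MAPLE" >&2; return 1; }
}

# Usage (from the repository root):
#   check_maple_env && java -Xss50M -jar bin/maple-translator.jar
```

The check only verifies that the variables are set and that `MAPLE` is a directory; it does not validate the Maple license or the library contents.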
---
`symbolic-tester.jar`: Symbolic verification program
---
This is only for advanced users! First, set up the properties:
1) `config/symbolic_tests.properties`
Critical and required settings are:
```properties
# the path to the dataset
dlmf_dataset=/home/andreg-p/Howard/together.txt
# the lines that should be tested in the provided dataset
subset_tests=7209,7483
# the output path
output=/home/andreg-p/Howard/Results/AutoMaple/22-JA-symbolic.txt
# the output path for missing macros
missing_macro_output=/home/andreg-p/Howard/Results/AutoMaple/22-JA-missing.txt
```
2) `symbolic-tester.jar` program arguments:
* `-maple` to run the tests with Maple
* `-mathematica` to run the tests with Mathematica (you can only specify one at a time, Maple or Mathematica)
* `-Xmx8g` increases the Java heap memory; this is not required but useful (as a JVM option, it goes before `-jar`)
* `-Xss50M` increases the thread stack size if you use Maple (also a JVM option)
Additionally, you have to set environment variables if you work with Maple (see the `maple-translator.jar` instructions
above for more details about required variables).
3) Since you may want to run evaluations on subsets automatically, you can use `scripts/symbolic-evaluator.sh`. You need to update the paths in the script first. With `config/together-lines.txt` you can control which subsets the script evaluates, e.g.,
```
04-EF: 1465,1994
05-GA: 1994,2179
```
This tests the lines `1465-1994` and `1994-2179` and stores the results in the files `04-EF-symbolic.txt` and `05-GA-symbolic.txt`.
The second number is excluded (i.e., `1,2` runs only line `1` but not line `2`).
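Since the end of each range is excluded, it can help to print the lines that are actually covered. The sketch below parses entries in the `label: start,end` format shown above; the function name `subset_ranges` is hypothetical and the parsing is an assumption based only on the two example entries.

```shell
# Hypothetical helper: print "<label> <first line> <last tested line> <line count>".
subset_ranges() {
  # Split on colon, comma, and spaces; the end value is exclusive.
  awk -F'[:, ]+' 'NF >= 3 { printf "%s %d %d %d\n", $1, $2, $3 - 1, $3 - $2 }'
}

subset_ranges <<'EOF'
04-EF: 1465,1994
05-GA: 1994,2179
EOF
```

For the two example entries this prints `04-EF 1465 1993 529` and `05-GA 1994 2178 185`, which shows that line `1994` is tested only as part of `05-GA`.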
---
`numeric-tester.jar`: Numeric verification program
---
This is only for advanced users! First, set up the properties:
1) `config/numerical_tests.properties`
Critical and required settings are:
```properties
# the path to the dataset
dlmf_dataset=/home/andreg-p/Howard/together.txt
# either you define a subset of lines to test or you define the results file of symbolic evaluation, which is recommended
# subset_tests=7209,7483
symbolic_results_data=/home/andreg-p/Howard/Results/AutoMath/11-ST-symbolic.txt
# the output path
output=/home/andreg-p/Howard/Results/MathNumeric/11-ST-numeric.txt
```
2) `numeric-tester.jar` program arguments:
* `-maple` to run the tests with Maple
* `-mathematica` to run the tests with Mathematica
* `-Xmx8g` increases the Java heap memory; this is not required but useful (as a JVM option, it goes before `-jar`)
* `-Xss50M` increases the thread stack size if you use Maple (also a JVM option)
3) Since you may want to run evaluations on subsets automatically, you can use `scripts/numeric-evaluator.sh`. You need to update the paths in the script first. With `config/together-lines.txt` you can control which subsets the script evaluates, e.g.,
```
04-EF: 1465,1994
05-GA: 1994,2179
```
This will automatically load the symbolic result files `04-EF-symbolic.txt` and `05-GA-symbolic.txt` and start the evaluation.
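Because the numeric evaluation depends on the symbolic result files, it can be useful to check that they exist before starting a long run. This sketch assumes the `<label>-symbolic.txt` naming shown above; the function name `expected_symbolic_files` and the results directory argument are hypothetical.

```shell
# Hypothetical helper: report which symbolic result files exist for each subset label.
expected_symbolic_files() {
  # $1: directory that holds the symbolic results (set it to your own output folder)
  dir=$1
  while IFS=: read -r label rest; do
    [ -n "$label" ] || continue
    file="$dir/$label-symbolic.txt"
    if [ -f "$file" ]; then
      echo "ok $file"
    else
      echo "missing $file"
    fi
  done
}

# Usage: expected_symbolic_files path/to/results < config/together-lines.txt
```

Any line reported as `missing` points to a subset whose symbolic evaluation still has to be run first.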
---
Update Translation Patterns
The translation patterns are defined in `libs/ReferenceData/CSVTables`. If you wish to add translation patterns, you need to
compile the changes before the translator can use them. To update the translations, use the `lexicon-creator.jar`
(see the explanations above).
Update Pre-Processing Replacement Rules
The pre-processing replacement rules are defined in `config/replacements.yml` and `config/dlmf-replacements.yml`. Each config
contains further explanations on how to add replacement rules. The replacement rules are applied without further compilation.
Just change the files to add, modify, or remove rules.
Contributors