gipplab / LaCASt

LaCASt - A LaTeX Translator for Computer Algebra Systems
MIT License

LaCASt is the first context-aware translator for mathematical LaTeX expressions. It includes natural language processing to analyze textual contexts, a custom semantic LaTeX parser to analyze math inputs, and CAS interfaces (currently Maple and Mathematica) to compute and verify translated expressions automatically.

Publications

If you want to reference this tool in general, please cite the most recent publication, from TPAMI 2023. If you want to refer to the automatic evaluations only, cite the second most recent publication, from TACAS 2022.

A. Greiner-Petter, M. Schubotz, C. Breitinger, P. Scharpf, A. Aizawa, B. Gipp (2023) "Do the Math: Making Mathematics in Wikipedia Computable". In TPAMI 2023: 4384-4395

```bibtex
@Article{GreinerPetter23,
  author    = {Andr{\'{e}} Greiner{-}Petter and Moritz Schubotz and Corinna Breitinger and Philipp Scharpf and Akiko Aizawa and Bela Gipp},
  title     = {Do the Math: Making Mathematics in Wikipedia Computable},
  journal   = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
  volume    = {45},
  number    = {4},
  pages     = {4384--4395},
  year      = {2023},
  url       = {https://doi.org/10.1109/TPAMI.2022.3195261},
  doi       = {10.1109/TPAMI.2022.3195261},
  timestamp = {Mon, 28 Aug 2023 21:37:38 +0200},
  biburl    = {https://dblp.org/rec/journals/pami/GreinerPetterSBSAG23.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```

A. Greiner-Petter, H. S. Cohl, A. Youssef, M. Schubotz, A. Trost, R. Dey, A. Aizawa, B. Gipp (2022) "Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems". In TACAS 2022: 87-105

```bibtex
@InProceedings{Greiner-PetterC22,
  author    = {Andr{\'{e}} Greiner{-}Petter and Howard S. Cohl and Abdou Youssef and Moritz Schubotz and Avi Trost and Rajen Dey and Akiko Aizawa and Bela Gipp},
  title     = {Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems},
  booktitle = {Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, {TACAS} 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, {ETAPS} 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part {I}},
  series    = {Lecture Notes in Computer Science},
  volume    = {13243},
  pages     = {87--105},
  publisher = {Springer},
  year      = {2022},
  url       = {https://doi.org/10.1007/978-3-030-99524-9\_5},
  doi       = {10.1007/978-3-030-99524-9\_5}
}
```

A. Greiner-Petter, M. Schubotz, H. S. Cohl, B. Gipp (2019) "Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems". In: Aslib Journal of Information Management. 71(3): 415-439

```bibtex
@Article{Greiner-Petter19,
  author  = {Andr{\'{e}} Greiner{-}Petter and Moritz Schubotz and Howard S. Cohl and Bela Gipp},
  title   = {Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems},
  journal = {Aslib Journal of Information Management},
  volume  = {71},
  number  = {3},
  pages   = {415--439},
  year    = {2019},
  url     = {https://doi.org/10.1108/AJIM-08-2018-0185},
  doi     = {10.1108/AJIM-08-2018-0185}
}
```

H. S. Cohl, A. Greiner-Petter, M. Schubotz (2018) "Automated Symbolic and Numerical Testing of DLMF Formulae Using Computer Algebra Systems". In: CICM: 39-52

```bibtex
@InProceedings{Cohl18,
  author    = {Howard S. Cohl and Andr{\'{e}} Greiner{-}Petter and Moritz Schubotz},
  title     = {Automated Symbolic and Numerical Testing of {DLMF} Formulae Using Computer Algebra Systems},
  booktitle = {Intelligent Computer Mathematics - 11th International Conference, {CICM} 2018, Hagenberg, Austria, August 13-17, 2018, Proceedings},
  series    = {Lecture Notes in Computer Science},
  volume    = {11006},
  pages     = {39--52},
  publisher = {Springer},
  year      = {2018},
  url       = {https://doi.org/10.1007/978-3-319-96812-4\_4},
  doi       = {10.1007/978-3-319-96812-4\_4}
}
```

H. S. Cohl, M. Schubotz, A. Youssef, A. Greiner-Petter, J. Gerhard, B. V. Saunders, M. A. McClain, J. Bang, K. Chen (2017) "Semantic Preserving Bijective Mappings of Mathematical Formulae Between Document Preparation Systems and Computer Algebra Systems". In: CICM: 115-131

```bibtex
@InProceedings{Cohl17,
  author    = {Howard S. Cohl and Moritz Schubotz and Abdou Youssef and Andr{\'{e}} Greiner{-}Petter and J{\"{u}}rgen Gerhard and Bonita V. Saunders and Marjorie A. McClain and Joon Bang and Kevin Chen},
  title     = {Semantic Preserving Bijective Mappings of Mathematical Formulae Between Document Preparation Systems and Computer Algebra Systems},
  booktitle = {Intelligent Computer Mathematics - 10th International Conference, {CICM} 2017, Edinburgh, UK, July 17-21, 2017, Proceedings},
  series    = {Lecture Notes in Computer Science},
  volume    = {10383},
  pages     = {115--131},
  publisher = {Springer},
  year      = {2017},
  url       = {https://doi.org/10.1007/978-3-319-62075-6\_9},
  doi       = {10.1007/978-3-319-62075-6\_9}
}
```

How to use our program

The following provides a high-level introduction to using the JARs and LaCASt in general. If you want to dive into the source code, we advise you to check our contribution guidelines first for more details on the structure.

The bin directory contains several executable JARs. Each of these programs requires the lacast.config.yaml. Copy config/template-lacast.config.yaml to the main directory and rename it to lacast.config.yaml. Afterward, update the entries in the copied file to the properties that apply to you. LaCASt tries to load the config by following these rules:

  1. The environment variable LACAST_CONFIG specifies the config location, e.g., export LACAST_CONFIG="path/to/lacast.config.yaml".
  2. The config file is in the current working directory.
  3. The default config is loaded from the internal resources in the jar; see the default config in interpreter.common/src/main/resources/

If none of the rules above point to a valid config, LaCASt stops with an error.
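
The lookup order above can be sketched as a small shell function. This is only an illustration, not LaCASt's actual implementation: the function name `find_lacast_config` is hypothetical, and it assumes that an unset or invalid `LACAST_CONFIG` falls through to the next rule.

```shell
# Sketch of the config lookup order; not LaCASt's actual code.
# Assumption: an unset or invalid LACAST_CONFIG falls through to the next rule.
find_lacast_config() {
  if [ -n "${LACAST_CONFIG:-}" ] && [ -f "$LACAST_CONFIG" ]; then
    # Rule 1: explicit location from the environment
    printf '%s\n' "$LACAST_CONFIG"
  elif [ -f "lacast.config.yaml" ]; then
    # Rule 2: config file in the current working directory
    printf '%s\n' "$PWD/lacast.config.yaml"
  else
    # Rule 3: LaCASt would use the default config bundled inside the jar
    printf '%s\n' "(built-in default)"
  fi
}

find_lacast_config
```

Running this from the directory where you plan to start a JAR shows which of the three rules would likely apply there.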


LaCASt contains several executable JARs as standalone applications. The following list explains the functionality of each JAR in more detail.

latex-to-cas-converter.jar: The forward translator (LaTeX -> CAS)

---

The executable jar for the translator can be found in the `bin` subdirectory. A standalone version can be found in the `bin/*.zip` file. Unzip the archive wherever you want and run the jar from the root folder of the repository:

```shell script
java -jar bin/latex-to-cas-converter.jar
```

Without additional information, the jar runs as an interactive program. You can start the program to directly trigger the translation process or set further flags (every flag is optional):

* `-CAS=`: Sets the computer algebra system you want to translate to, e.g., `-CAS=Maple` for Maple;
* `-Expression=""`: Sets the expression you want to translate. Double quotation marks are mandatory;
* `--clean` or `-c`: Only returns the translated expression without any other information. (since v1.0.1)
* `--debug` or `-d`: Returns extra information for debugging, such as computation time and list of elements. (`--clean` overrides this setting).
* `--extra` or `-x`: Shows further information about the translation of functions, e.g., branch cuts, DLMF links, and more. (`--clean` overrides this setting)

---
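
As a sketch of a non-interactive call to `latex-to-cas-converter.jar`, the following combines the `-CAS`, `-Expression`, and `--clean` flags described above. The expression `\cos(x)^2` is an arbitrary example input, and the dry-run branch is an addition for illustration so the snippet also works when the jar is not present.

```shell
# Assumption: run from the repository root, where bin/ contains the jar.
JAR="bin/latex-to-cas-converter.jar"

# Build the argument list once so it can be printed or executed.
# Inside double quotes the shell passes the LaTeX backslash through unchanged.
set -- -CAS=Maple -Expression="\cos(x)^2" --clean

if [ -f "$JAR" ]; then
  java -jar "$JAR" "$@"
else
  # Dry run: show the command that would be executed.
  printf 'would run: java -jar %s' "$JAR"
  printf ' %s' "$@"
  printf '\n'
fi
```

With `--clean`, only the translated expression should be printed, which makes the call convenient to use from other scripts.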
lexicon-creator.jar: Maintain the translation dictionary

---

This jar is used to maintain the internal translation dictionaries. Once a translation pattern is defined in the CSV files, it must be transformed into the dictionaries. The typical workflow is:

```shell script
andre@agp:~$ java -jar bin/lexicon-creator.jar
Welcome, this converter translates given CSV files to lexicon files. You didn't specified CSV files (do not add DLMFMacro.csv).
Add a new CSV file and hit enter or enter '-end' to stop the adding process.
all
Current list: [CAS_Maple.csv, CAS_Mathematica.csv]
-end
```

---
maple-translator.jar: The backward translator for Maple (Maple -> Semantic LaTeX)

---

This jar requires an installed Maple license on the machine! To start the translator, you have to set the environment variables needed to run Maple properly (see [Building and Running a Java OpenMaple Application](https://de.maplesoft.com/support/help/maple/view.aspx?path=OpenMaple%2fJava%2frunning)). In my case, Maple is installed in `/opt/maple2019` and I'm on a Linux machine, which requires setting `MAPLE` and `LD_LIBRARY_PATH`. In addition, you have to increase the JVM thread stack size via `-Xss50M`; otherwise, Maple crashes. Here is an example:

```shell script
andre@agp:~$ export MAPLE="/opt/maple2019"
andre@agp:~$ export LD_LIBRARY_PATH="/opt/maple2019/bin.X86_64_LINUX"
andre@agp:~$ java -Xss50M -jar bin/maple-translator.jar
```

To get the Maple paths, you can start Maple and enter the following commands:

```
kernelopts( bindir ); <- returns
kernelopts( mapledir ); <- returns
```

---
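
Because a missing variable only shows up once Maple fails to load, a small preflight check before starting `maple-translator.jar` can help. The sketch below assumes a Linux machine, where both `MAPLE` and `LD_LIBRARY_PATH` are required as described above; the function name `check_maple_env` is hypothetical.

```shell
# Preflight check before starting maple-translator.jar.
# Assumption: Linux, where both MAPLE and LD_LIBRARY_PATH must be set.
check_maple_env() {
  missing=""
  [ -n "${MAPLE:-}" ] || missing="$missing MAPLE"
  [ -n "${LD_LIBRARY_PATH:-}" ] || missing="$missing LD_LIBRARY_PATH"
  if [ -n "$missing" ]; then
    echo "Please set:$missing" >&2
    return 1
  fi
}

# Start the translator only if the environment looks complete and the jar exists.
if check_maple_env && [ -f bin/maple-translator.jar ]; then
  java -Xss50M -jar bin/maple-translator.jar
fi
```

The check only verifies that the variables are non-empty; it does not validate that they point to a working Maple installation.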
symbolic-tester.jar: Symbolic verification program

---

This is only for advanced users! First, set up the properties:

1) `config/symbolic_tests.properties` Critical and required settings are:

```properties
# the path to the dataset
dlmf_dataset=/home/andreg-p/Howard/together.txt
# the lines that should be tested in the provided dataset
subset_tests=7209,7483
# the output path
output=/home/andreg-p/Howard/Results/AutoMaple/22-JA-symbolic.txt
# the output path for missing macros
missing_macro_output=/home/andreg-p/Howard/Results/AutoMaple/22-JA-missing.txt
```

2) `symbolic-tester.jar` program arguments:

* `-maple` to run the tests with Maple
* `-mathematica` to run the tests with Mathematica (you can only specify one at a time, maple or mathematica)
* `-Xmx8g` (JVM option) increases the maximum Java heap size; this is not required but useful
* `-Xss50M` (JVM option) increases the thread stack size, which is needed if you use Maple

Additionally, you have to set environment variables if you work with Maple (see the `maple-translator.jar` instructions above for more details about the required variables).

3) Since you may want to run automatic evaluations on subsets, you can use `scripts/symbolic-evaluator.sh`. You need to update the paths in the script first. With `config/together-lines.txt` you can control which subsets the script evaluates, e.g.,

```
04-EF: 1465,1994
05-GA: 1994,2179
```

The second argument is excluded (i.e., `1,2` runs only one line, `1` but not `2`). The example above tests the ranges `1465,1994` and `1994,2179` and stores the results in `04-EF-symbolic.txt` and `05-GA-symbolic.txt`.

---
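
The exclusive upper bound in `config/together-lines.txt` is easy to misread, so here is a minimal sketch that expands each entry into the lines that are actually tested and the result file named above. The helper `describe_subset` is hypothetical and is not part of `scripts/symbolic-evaluator.sh`.

```shell
# Hypothetical helper: explains one "LABEL: START,END" entry.
# END is exclusive, so the last tested line is END - 1.
describe_subset() {
  entry=$1
  label=${entry%%:*}      # text before the colon, e.g. 04-EF
  range=${entry#*:}       # text after the colon, e.g. " 1465,1994"
  range=${range# }        # drop the single leading space
  start=${range%%,*}
  end=${range##*,}
  last=$((end - 1))
  printf '%s: lines %s-%s -> %s-symbolic.txt\n' "$label" "$start" "$last" "$label"
}

while IFS= read -r entry; do
  describe_subset "$entry"
done <<'EOF'
04-EF: 1465,1994
05-GA: 1994,2179
EOF
```

For the two entries shown, this prints `04-EF: lines 1465-1993 -> 04-EF-symbolic.txt` and `05-GA: lines 1994-2178 -> 05-GA-symbolic.txt`, so line 1994 is tested only once, as part of the second subset.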
numeric-tester.jar: Numeric verification program

---

This is only for advanced users! First, set up the properties:

1) `config/numerical_tests.properties` Critical and required settings are:

```properties
# the path to the dataset
dlmf_dataset=/home/andreg-p/Howard/together.txt
# either you define a subset of lines to test or you define the results file of symbolic evaluation, which is recommended
# subset_tests=7209,7483
symbolic_results_data=/home/andreg-p/Howard/Results/AutoMath/11-ST-symbolic.txt
# the output path
output=/home/andreg-p/Howard/Results/MathNumeric/11-ST-numeric.txt
```

2) `numeric-tester.jar` program arguments:

* `-maple` to run the tests with Maple
* `-mathematica` to run the tests with Mathematica
* `-Xmx8g` (JVM option) increases the maximum Java heap size; this is not required but useful
* `-Xss50M` (JVM option) increases the thread stack size, which is needed if you use Maple

3) Since you may want to run automatic evaluations on subsets, you can use `scripts/numeric-evaluator.sh`. You need to update the paths in the script first. With `config/together-lines.txt` you can control which subsets the script evaluates, e.g.,

```
04-EF: 1465,1994
05-GA: 1994,2179
```

This will automatically load the symbolic result files `04-EF-symbolic.txt` and `05-GA-symbolic.txt` and start the evaluation.

---
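
If you prefer to prepare the numeric properties for a single subset by hand, the three required keys can be generated from the subset label. This is a sketch under assumptions: the helper `numeric_props` is hypothetical, the paths are placeholders, and it assumes the symbolic and numeric result files share one directory and follow the `<label>-symbolic.txt` and `<label>-numeric.txt` naming seen in the examples above.

```shell
# Hypothetical helper: prints numeric-tester properties for one subset label.
# Assumption: both result files live in the same directory (placeholder paths).
numeric_props() {
  label=$1
  dataset=$2
  results_dir=$3
  printf 'dlmf_dataset=%s\n' "$dataset"
  printf 'symbolic_results_data=%s/%s-symbolic.txt\n' "$results_dir" "$label"
  printf 'output=%s/%s-numeric.txt\n' "$results_dir" "$label"
}

# Print the properties for the 04-EF subset; redirect to a file to use them.
numeric_props 04-EF /path/to/together.txt /path/to/results
```

The printed lines use only the keys listed in `config/numerical_tests.properties` above; adjust the directories to your own layout before using them.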

Update Translation Patterns

The translation patterns are defined in libs/ReferenceData/CSVTables. If you add or change translation patterns, you need to compile the changes before the translator can use them. To update the translations, use the lexicon-creator.jar (see the explanations above).

Update Pre-Processing Replacement Rules

The pre-processing replacement rules are defined in config/replacements.yml and config/dlmf-replacements.yml. Each config contains further explanations of how to add replacement rules. The replacement rules are applied without further compilation: just edit the files to add, modify, or remove rules.

Contributors

| Role | Name | Contact |
|---|---|---|
| Main Developer | André Greiner-Petter | greinerpetter (at) wuppertal.de |
| Supervisor | Dr. Howard Cohl | howard.cohl (at) nist.gov |
| Advisor | Dr. Moritz Schubotz | schubotz (at) uni-wuppertal.de |
| Advisor | Prof. Abdou Youssef | abdou.youssef (at) nist.gov |
| Student Developers | Avi Trost, Rajen Dey, Claude, Jagan | |