ml4ai / ASKEM-TA1-DockerVM

Docker recipes demonstrating how to use our pipelines


A place to create Docker recipes that make our pipelines easy to use.


The current directory must contain two subdirectories, inputs and outputs. The inputs directory holds the plain text files to be processed by the TA1 reading pipelines. Edit docker-compose.yml to add your OpenAI API key.
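The directory layout described above can be prepared with a short script. This is a minimal sketch: the helper name `prepare_workdir` and the sample file name are hypothetical, and the only things taken from this README are the directory names inputs and outputs and the fact that the inputs are plain text files.

```python
import tempfile
from pathlib import Path


def prepare_workdir(root, documents):
    """Create inputs/ and outputs/ under root and write the plain text inputs.

    `documents` maps a file name to its text content (hypothetical helper,
    not part of the repository).
    """
    root = Path(root)
    inputs = root / "inputs"
    outputs = root / "outputs"
    inputs.mkdir(parents=True, exist_ok=True)
    outputs.mkdir(parents=True, exist_ok=True)
    for name, text in documents.items():
        (inputs / name).write_text(text, encoding="utf-8")
    return inputs, outputs


if __name__ == "__main__":
    # Demo in a throwaway directory; in practice, pass the directory
    # that holds docker-compose.yml.
    with tempfile.TemporaryDirectory() as tmp:
        inputs, outputs = prepare_workdir(tmp, {"paper.txt": "Sample text to read."})
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

The API key still has to be added to docker-compose.yml by hand; the sketch deliberately does not touch that file.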

Run docker compose up. After both pipelines have finished, the outputs directory will contain files with the following prefixes:

SKEMA Service Components

Text Reading

The client code for both SKEMA and MIT text reading pipelines is available in end-to-end-rest/notebooks/text_reading_pipeline.ipynb

This notebook contains examples of how to annotate:

Additionally, the variable extraction endpoints support an optional AMR file that will be linked with the variables extracted at the end of extraction. See the PDF annotation example for reference.

AMR alignment

The notebook end-to-end-rest/notebooks/text_reading/metal.ipynb contains an example of how to call the AMR linking endpoint if you have a file with variable extractions and a pre-existing AMR.



There are two endpoints available for this part of the workflow, which are demonstrated in the equations.ipynb notebook located in the end-to-end-rest/notebooks directory.

  1. get("/latex/mml")

This endpoint handles a GET request and expects a LaTeX string representing an equation as input. It then returns the corresponding presentation MathML for that equation.

  2. post("/image/mml")

This endpoint handles a POST request and expects a PNG image of an equation as input. It then processes the image and returns the corresponding presentation MathML for that equation.

Please refer to the equations.ipynb notebook for a detailed demonstration of how to use these endpoints.
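The two calls described above might be assembled as in the following sketch, which uses only the Python standard library. The base URL, the query parameter name `tex_src`, the form field name `data`, and the choice of a multipart upload for the image are all assumptions made for illustration and are not confirmed by this README; the equations.ipynb notebook remains the reference for the actual request shapes.

```python
import urllib.parse
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment


def build_latex_request(latex, base_url=BASE_URL, param="tex_src"):
    """Build a GET request to /latex/mml carrying a LaTeX string.

    The query parameter name is an assumption, not taken from the README.
    """
    query = urllib.parse.urlencode({param: latex})
    return urllib.request.Request(f"{base_url}/latex/mml?{query}", method="GET")


def build_image_request(png_bytes, base_url=BASE_URL, field="data"):
    """Build a POST request to /image/mml carrying a PNG image.

    Sending the image as multipart form data under `field` is an assumption.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="equation.png"\r\n'.encode(),
        b"Content-Type: image/png\r\n\r\n",
        png_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(
        f"{base_url}/image/mml", data=body, headers=headers, method="POST"
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the service running, `send(build_latex_request(r"\frac{a}{b}"))` would be expected to return the presentation MathML for that equation. Building the request separately from sending it makes it easy to inspect the URL and body before contacting the service.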


There are several endpoints that can be used for this aspect of the workflow. They are demonstrated in the eqn2amr.ipynb notebook in the end-to-end-rest/notebooks directory.

  1. post("/workflows/consolidated/equations-to-amr")

    This endpoint takes a vector of MathML or LaTeX strings and returns an AMR of the selected variety: Petrinet, RegNet, gAMR, MET, or Decapode.

    An example input for a RegNet is shown below:

    "mathml": [
    "model": "regnet"

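A request body of the shape outlined above could be built as follows. This is a sketch under stated assumptions: the keys "mathml" and "model" and the value "regnet" come from the example input, while the base URL, the JSON content type, the helper name, and the placeholder MathML string are hypothetical. The HTTP method defaults to POST, following the route listing, and can be overridden.

```python
import json
import urllib.request


def build_equations_to_amr_request(
    equations, model, base_url="http://localhost:8000", method="POST"
):
    """Build a request for /workflows/consolidated/equations-to-amr.

    `equations` is a list of MathML strings and `model` names the AMR
    variety (for example "regnet"). The base URL is an assumption.
    """
    payload = {"mathml": list(equations), "model": model}
    return urllib.request.Request(
        f"{base_url}/workflows/consolidated/equations-to-amr",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


if __name__ == "__main__":
    # Placeholder MathML for illustration only; real inputs come from
    # the equation extraction step.
    request = build_equations_to_amr_request(["<math><mi>x</mi></math>"], "regnet")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

The eqn2amr.ipynb notebook shows the exact inputs the service accepts, including how LaTeX strings are passed.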


The Code2FN service takes code as input (in several different forms), runs the program analysis pipeline to parse the files into CAST, translates the CAST into a Function Network (FN), and returns a Gromet Function Network Module Collection (GrometFNModuleCollection) as JSON.

The service currently accepts Python and Fortran (family) source code. The language type is determined by the filename extensions:

The service can accept the following types of code forms:

The two endpoints, as well as the expected structure of the JSON serialized code system, are demonstrated in the two notebooks end-to-end-rest/notebooks/code2fn/fn-given-filepaths.ipynb and end-to-end-rest/notebooks/code2fn/fn-given-filepaths-zip.ipynb
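For the zip-based notebook, a source tree first has to be packed into an archive. The sketch below shows one way to do that in memory with the standard library. The helper name `zip_source_tree` is hypothetical, and how the resulting bytes are submitted to the service is not shown here because the request format is defined in the notebooks, not in this README.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_source_tree(root):
    """Pack every file under root into an in-memory .zip and return its bytes.

    Paths inside the archive are stored relative to root, using forward
    slashes. No filtering by file extension is applied (hypothetical helper).
    """
    root = Path(root)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(root).as_posix())
    return buffer.getvalue()


if __name__ == "__main__":
    # Demo on a throwaway tree containing a single Python file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "model.py").write_text("x = 1\n", encoding="utf-8")
        data = zip_source_tree(tmp)
        print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```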


This is demonstrated in the code2amr.ipynb notebook in the end-to-end-rest/notebooks directory.

This part of the workflow currently has two endpoints, one for code snippets and one for code archives (.zip files). At present, AMR extraction primarily supports PetriNets. The code-snippet workflow is accessed through the following endpoint:


The endpoint to take in a code archive is the following:


AMR Refinement

This is demonstrated in the MORAE_demo.ipynb notebook in the morae-demo/ directory.

Enrique, if you could explain how to run it here.