hoang-quoc-trung / sumen

Scaling Up Image-to-LaTeX Performance: Sumen An End-to-End Transformer Model With Large Dataset
Apache License 2.0

Translating Math Formula Images To LaTeX Sequences


Performance

Setup

Uses

Available Model Checkpoint

We provide the Sumen base model (349M parameters) on Hugging Face. It can be downloaded at hoang-quoc-trung/sumen-base.
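One way to fetch the checkpoint files is the `huggingface_hub` package. The sketch below is a minimal example, not part of this repository: the helper name `download_checkpoint` is hypothetical, and it assumes that the `--ckpt` option used by the commands in this README accepts a directory holding the downloaded Hugging Face files, which the README does not confirm.

```python
REPO_ID = "hoang-quoc-trung/sumen-base"  # model id given in this README
LOCAL_DIR = "src/checkpoints"            # assumption: same path the README passes to --ckpt


def download_checkpoint(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download every file of the model repository into local_dir.

    Hypothetical helper. Requires `pip install huggingface_hub`.
    """
    # Imported lazily so this module loads even when huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the path of the folder that holds the files.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (needs network access and downloads the full checkpoint):
#     path = download_checkpoint()
#     print(path)
```

If the scripts expect a different checkpoint layout, adjust `LOCAL_DIR` or the files you keep accordingly.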

Training

python train.py --config_path src/config/base_config.yaml --resume_from_checkpoint true

arguments:
    -h, --help                   Show this help message and exit
    --config_path                Path to configuration file
    --resume_from_checkpoint     Continue training from saved checkpoint (true/false)
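The `--resume_from_checkpoint` option takes the literal strings `true` or `false`. As an illustration of how such a flag can be parsed safely, here is a small `argparse` sketch. It is an assumption about one possible implementation, not the actual contents of `train.py`; the names `str_to_bool` and `build_parser` and the default values are hypothetical. The explicit converter matters because `type=bool` would treat the string `"false"` as `True`.

```python
import argparse


def str_to_bool(value):
    """Convert 'true'/'false' (case-insensitive) to a real boolean."""
    text = value.strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the two documented training options."""
    parser = argparse.ArgumentParser(description="Illustrative training CLI")
    parser.add_argument("--config_path", type=str,
                        default="src/config/base_config.yaml",  # assumed default
                        help="Path to configuration file")
    parser.add_argument("--resume_from_checkpoint", type=str_to_bool,
                        default=False,  # assumed default
                        help="Continue training from saved checkpoint (true/false)")
    return parser


# Parse the same arguments as the training command in this README.
args = build_parser().parse_args(
    ["--config_path", "src/config/base_config.yaml",
     "--resume_from_checkpoint", "true"]
)
print(args.config_path, args.resume_from_checkpoint)
```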

Inference

python inference.py --input_image assets/example_1.png --ckpt src/checkpoints

arguments:
    -h, --help                   Show this help message and exit
    --input_image                Path to image file
    --ckpt                       Path to the checkpoint model
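The inference command handles one image per call. To process a folder of images, a thin wrapper can invoke it once per file. The sketch below assumes only the two documented options; the helper names (`find_images`, `build_commands`, `run_all`) and the list of image extensions are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Assumed set of extensions, based on the file types shown in the dataset samples.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def find_images(image_dir):
    """Return the sorted paths of all image files directly inside image_dir."""
    return sorted(
        str(path) for path in Path(image_dir).iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES
    )


def build_commands(image_paths, ckpt="src/checkpoints"):
    """Build one inference.py command (as an argv list) per image path."""
    return [
        [sys.executable, "inference.py", "--input_image", str(path), "--ckpt", ckpt]
        for path in image_paths
    ]


def run_all(image_dir, ckpt="src/checkpoints"):
    """Run inference.py on every image in image_dir, stopping on the first failure."""
    for command in build_commands(find_images(image_dir), ckpt):
        subprocess.run(command, check=True)


# Usage (run from the repository root):
#     run_all("assets")
```

Each call starts a new process and therefore reloads the checkpoint, so this is convenient for a handful of images but slow for large batches.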

Test

python test.py --config_path src/config/base_config.yaml --ckpt src/checkpoints

arguments:
    -h, --help                   Show this help message and exit
    --config_path                Path to configuration file
    --ckpt                       Path to the checkpoint model

Web Demo

streamlit run streamlit_app.py -- --ckpt src/checkpoints

arguments:
    -h, --help                   Show this help message and exit
    --ckpt                       Path to the checkpoint model

or

python gradio_app.py --ckpt src/checkpoints

arguments:
    -h, --help                   Show this help message and exit
    --ckpt                       Path to the checkpoint model

Dataset

The dataset is available here: Fusion Image To Latex Datasets

The directory data structure can look as follows:

Samples:

image_filename                              latex
200922-1017-140.bmp                         \sqrt { \frac { c } { N } }
78cd39ce-71fc-4c86-838a-defa185e0020.jpg    \lim_{w\to1}\cos{w}
KME2G3_19_sub_30.bmp                        \sum _ { i = 2 n + 3 m } ^ { 1 0 } i x
1d801f89870fb81_basic.png                   \sqrt { \varepsilon _ { \mathrm { L J } } / m \sigma ^ { 2 } }
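Each sample row pairs an image filename with its LaTeX string. A minimal way to read such rows, assuming the fields are separated by whitespace and filenames contain no spaces (the README does not state the on-disk file format), is to split each line once. The function name `parse_samples` is hypothetical.

```python
# Two of the sample rows above, preceded by the header row.
SAMPLE_TEXT = r"""image_filename latex
200922-1017-140.bmp \sqrt { \frac { c } { N } }
78cd39ce-71fc-4c86-838a-defa185e0020.jpg \lim_{w\to1}\cos{w}
"""


def parse_samples(text):
    """Return a list of (image_filename, latex) tuples from whitespace-separated rows."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only once: everything after the filename is the LaTeX string.
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or parts[0] == "image_filename":
            continue  # skip the header row and malformed rows
        samples.append((parts[0], parts[1]))
    return samples


samples = parse_samples(SAMPLE_TEXT)
for filename, latex in samples:
    print(filename, "->", latex)
```

Note that some labels are space-tokenized (`\sqrt { \frac ...`) while others are not (`\lim_{w\to1}`); the parser keeps the LaTeX string exactly as written.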