chungkwong / mathocr-tap

Offline handwritten mathematical expression recognition via stroke extraction and TAP

This repository provides a trainable online handwritten mathematical expression recognition system. Combined with a stroke extractor, it can also be used for offline handwritten mathematical expression recognition, that is, recognition from images.

Usage

  1. Ensure that Perl, Java, pandoc, Theano and libgpuarray are installed
  2. Clone this repository: git clone 'https://github.com/chungkwong/mathocr-tap.git'
  3. Change directory: cd mathocr-tap/work/src
  4. Train a model (optional; a pretrained model is included): ./train.sh && ./train_weightnoise.sh
  5. Test: ./test.sh
  6. Recognize your images: ./recognize.sh IMAGE_FILE IMAGE_FILE...
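
Step 1 can be checked before training or recognition. The following is a minimal sketch and not part of this repository: it assumes the command-line tools are on PATH under the names perl, java and pandoc, and that Theano and the Python bindings of libgpuarray are importable as theano and pygpu. The helper name missing_prerequisites is hypothetical.

```python
import importlib.util
import shutil

# Assumed executable names for the command-line prerequisites.
REQUIRED_COMMANDS = ("perl", "java", "pandoc")
# Assumed import names: Theano, and pygpu (the Python bindings of libgpuarray).
REQUIRED_MODULES = ("theano", "pygpu")


def missing_prerequisites(commands=REQUIRED_COMMANDS, modules=REQUIRED_MODULES):
    """Return the names of commands and modules that cannot be found."""
    # shutil.which returns None when no executable of that name is on PATH.
    missing = [c for c in commands if shutil.which(c) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing += [m for m in modules if importlib.util.find_spec(m) is None]
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The check only confirms that each tool can be located; it does not verify versions or that Theano is configured to use a GPU.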

Structure

Accuracy

The table below compares the accuracy of several offline handwritten mathematical expression recognition systems on the CROHME 2016 test set.

| System | Exact | <= 1 error | <= 2 errors | Structurally correct | Remark |
|---|---|---|---|---|---|
| USTC, WAP | 42.0% | 55.1% | 59.3% | - | Ensemble modeling is applied (5 models) |
| Stroke extractor + TAP | 43.07% | 56.67% | 62.95% | 64.95% | |
| TDTU, CNN-BLSTM-LSTM | 45.60% | 59.29% | 65.65% | - | Data augmentation is applied (36.27% before data augmentation) |
| USTC, MSD | 50.1% | 63.8% | 67.4% | - | Ensemble modeling is applied (5 models) |

Note that the online accuracy of this version of TAP is 43.68%, which is close to the 43.07% exact-match accuracy of its offline counterpart. The point is that training an online recognizer on extracted strokes gives an offline recognizer that is nearly as accurate as the online one.
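
To illustrate how the columns for exact matches and for at most one or two errors relate to each other, here is a rough sketch. It assumes that errors are counted as the token-level edit distance (Levenshtein) between the recognized and the reference LaTeX strings, split on whitespace; the scoring performed by ./test.sh may differ. The function names and the sample data are illustrative only and are not taken from this repository or from CROHME.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rates_within(pairs, max_errors=(0, 1, 2)):
    """Fraction of (hypothesis, reference) pairs with at most k token errors."""
    distances = [edit_distance(h.split(), r.split()) for h, r in pairs]
    return {k: sum(d <= k for d in distances) / len(distances) for k in max_errors}


# Made-up sample: an exact match, one substitution, two substitutions, many errors.
samples = [
    ("x ^ 2 + 1", "x ^ 2 + 1"),
    ("x ^ 2 + 1", "x ^ 2 - 1"),
    ("a + b", "a - 6"),
    ("\\frac { a } { b }", "a b c"),
]
print(rates_within(samples))
```

On this made-up sample the script prints {0: 0.25, 1: 0.5, 2: 0.75}. Each rate includes the previous one, which is why the values in a table row increase from the exact-match column to the two-error column.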

References

If you are interested in online mathematical expression recognition, see "Track, Attend and Parse (TAP): An End-to-end Framework for Online Handwritten Mathematical Expression Recognition":

@article{zhang2018track,
  title={Track, Attend and Parse (TAP): An End-to-end Framework for Online Handwritten Mathematical Expression Recognition},
  author={Zhang, Jianshu and Du, Jun and Dai, Lirong},
  journal={IEEE Transactions on Multimedia},
  year={2018},
  publisher={IEEE}
}

If you are interested in stroke extraction, see "Stroke Extraction for Offline Handwritten Mathematical Expression Recognition":

@article{9051736,
  title={Stroke Extraction for Offline Handwritten Mathematical Expression Recognition},
  author={C. {Chan}},
  journal={IEEE Access},
  year={2020},
  volume={8},
  pages={61565-61575},
  doi={10.1109/ACCESS.2020.2984627}
}
