lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.
https://lukas-blecher.github.io/LaTeX-OCR/
MIT License
12.04k stars, 991 forks

How do I split the CROHME dataset (CROHME.zip in your Google Drive) into training, validation, and test sets? #65

Closed: aspnetcs closed this issue 2 years ago

aspnetcs commented 2 years ago

How do I split the CROHME dataset (CROHME.zip in your Google Drive) into training, validation, and test sets?

lukas-blecher commented 2 years ago

Just randomly split up the files into three directories. I've done it like this.
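
For anyone landing here later, a random three-way split could be sketched roughly as follows. This is not the maintainer's script. The function name `split_files`, the `train`/`val`/`test` directory names, the 80/10/10 ratios, and the fixed seed are all assumptions made for illustration. It also assumes the extracted files sit directly in one flat directory.

```python
import random
import shutil
from pathlib import Path


def split_files(src_dir, dst_dir, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly copy the files in src_dir into dst_dir/train, val, and test.

    The ratios and directory names are illustrative assumptions, not values
    taken from the pix2tex repository.
    """
    # Sort first so the shuffle is reproducible for a given seed.
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)

    n_train = int(len(files) * ratios[0])
    n_val = int(len(files) * ratios[1])
    splits = {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        # Everything left over goes to the test set.
        "test": files[n_train + n_val:],
    }

    for name, members in splits.items():
        out = Path(dst_dir) / name
        out.mkdir(parents=True, exist_ok=True)
        for f in members:
            # Copy rather than move, so the original directory stays intact.
            shutil.copy2(f, out / f.name)

    return {name: len(members) for name, members in splits.items()}
```

Calling `split_files("CROHME", "CROHME_split")` (both paths hypothetical) would return the number of files placed in each directory. If each image has a separate label file, the pairs would need to be split together by a shared ID, which this sketch does not handle.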