Can anyone share a trained model file that generalizes to any type of image, like Mathpix?

da03 commented 4 years ago

Hi, thanks for your interest in our code, but the short answer is that we don't have such models since we don't have access to such data.

The long answer is that end-to-end learning usually does not generalize to data unseen at training time. Mathpix used our code, but they have their own internal dataset, which is not open to the public. According to our preliminary experiments, we need at least 10K training images to get reasonable performance with our approach, so to train such a model we might need hundreds of thousands of images (with LaTeX) in various styles and fonts.

vyaslkv commented 4 years ago

Thanks for replying! Is there any way they (Mathpix) could release the dataset?

vyaslkv commented 4 years ago

Can I get your email ID so that we can discuss getting such a dataset? What I am thinking of might solve this.

harvardnlp / im2markup, issue #32