facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

Mmd file cannot be read in MarkText, mathematical formulas cannot be displayed correctly #76

Closed smclh closed 1 year ago

smclh commented 1 year ago

Hello, thank you very much for sharing the project.

I have a question. I am unable to open the generated mmd file in a markdown editor (I am using MarkText and have already added --markdown). Even when I copy the content into it, the mathematical formulas are still not recognized (they are not displayed correctly as formulas). How can I handle this issue? (I tried Mathpix, but I prefer to use Python to handle them.)

lukas-blecher commented 1 year ago
  1. Try to run with the --markdown flag.
  2. Try to replace the \[ ... \] and \( ... \) delimiters with $$ and $.

marwinsteiner commented 1 year ago

@smclh you can also get an old version of Typora, from back when they didn't require a software key. Then, as Lukas suggested, if you need the dollar-sign math delimiters you can do a simple regex substitution.
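
Since the original question asked for a Python approach, the substitution could be sketched roughly as follows. This is a minimal sketch, not part of nougat: it assumes block math is delimited by \[ ... \] and inline math by \( ... \), and the function name `to_dollar_delimiters` is made up for illustration.

```python
import re

# Assumption: block math looks like \[ ... \] and inline math like \( ... \).
# re.DOTALL lets a formula span several lines.
BLOCK_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
INLINE_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)


def to_dollar_delimiters(text: str) -> str:
    """Rewrite bracket-style math delimiters as dollar-sign delimiters."""
    # Function replacements are used so that backslashes inside the LaTeX
    # body are copied verbatim instead of being read as regex escapes.
    text = BLOCK_MATH.sub(lambda m: "$$" + m.group(1) + "$$", text)
    # Some editors reject whitespace directly inside single dollars,
    # so the inline body is stripped.
    text = INLINE_MATH.sub(lambda m: "$" + m.group(1).strip() + "$", text)
    return text


if __name__ == "__main__":
    sample = r"Energy \(E = mc^2\) and \[\int_0^1 x\,dx = \frac{1}{2}\]"
    print(to_dollar_delimiters(sample))
```

To convert a whole file, read it with `pathlib.Path(...).read_text(encoding="utf-8")`, pass the string through the function, and write the result back out. Note that a plain regex like this can misfire on LaTeX line breaks followed by a bracket (for example `\\[2pt]` inside a table or matrix), so it is worth checking the output by eye.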

uneetkumarsingh commented 1 year ago

You can use the Mathpix Markdown (MMD) plugin in VS Code: https://marketplace.visualstudio.com/items?itemName=mathpix.vscode-mathpix-markdown

marwinsteiner commented 1 year ago

@uneetkumarsingh too bad mathpix isn't free anymore.

uneetkumarsingh commented 1 year ago

@marwinsteiner Mathpix isn't free, true, and perhaps it never was. But its viewer is free. You can use that viewer to view/render the nougat mmd files. I have used it with nougat files myself.

marwinsteiner commented 1 year ago

Sure, but what's the added value vs. other viewers like Typora? No need for regex substitution?

lukas-blecher commented 1 year ago

For me it's just that VSCode is my default editor for anything text related, so +1 for the MMD extension. Also, tables: I'm not familiar with Typora, but nougat will output LaTeX tables only, and they are likely not supported by most other markdown editors. But there are some interfering features, like converting (c) to ©.

uneetkumarsingh commented 1 year ago

@marwinsteiner Yes, no regex substitution needed. nougat mmd is Mathpix compatible. Here, from the nougat readme:

the lightweight markup language, mostly compatible with Mathpix Markdown (we make use of the LaTeX tables).

Also, I just checked: isn't Typora also paid?

marwinsteiner commented 1 year ago

@uneetkumarsingh that's very cool, I'll have to try it out!

marwinsteiner commented 1 year ago

@lukas-blecher maybe you could add a flag like --delimeter-dollar so that all math, inline and block, gets dollar delimiters around it, while --delimeter-braces would simply mean using \[ \] as the math delimiter? That would eliminate the need to post-process the output with a regex substitution. As far as I can tell this shouldn't be too difficult to implement?