Open: nikedi opened 4 years ago
Similar case with slightly different settings, but the main idea is the same: the bug happens when there's a line break:
Image 1. What I OCR'd
Image 2. Tex Editor
Doesn't seem to happen anymore
Try on longer, similar pieces as in the image
This is the main reason I don't use LaTeX that often -Naess
I'll try the second PDF, but the test I shared was executed on the formula of the first PDF that you shared.
Also we're running different versions of the PDF. I don't recall exactly all the changes, but I think I changed something in the OCR a while back.
Ah wait, I see what you're doing here. You're OCRing everything, not just the formula. It's not meant to work like that. MathPix returns the LaTeX code for the whole image, meaning it thinks you also want the text to be part of the LaTeX document.
You need to include text not only the equation
Yes
It's inefficient to go one equation at a time, especially in math-heavy papers that jump between words and equations. It usually works fine, but not when there is a line break like in the images.
It's inefficient to go one equation at a time, especially in math-heavy papers that jump between words and equations
I agree. It's not a bug though, it's a feature request as current functionality is working as expected.
Done, will be in next version. Look out for any bugs.
BEAUTIFUL
New description
When OCRing both text and formulas (math, chemistry, ...), add a feature to automatically convert the LaTeX math delimiters \[ and \] into the SMA tags [$] and [/$]. See below for illustration.
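The requested conversion can be sketched roughly as follows. This is only a minimal illustration in Python, not the plugin's actual implementation: the helper name convert_display_math is hypothetical, and trimming the whitespace inside the delimiters is an assumption about the desired output.

```python
import re

# Matches one display-math block, from the opening backslash-bracket to the
# closing one. DOTALL lets the block span line breaks, which is the layout
# that triggered this issue.
DISPLAY_MATH = re.compile(r"\\\[(.*?)\\\]", re.DOTALL)


def convert_display_math(ocr_text: str) -> str:
    # Hypothetical helper: wrap each display-math block in [$] ... [/$].
    # Assumption: whitespace just inside the delimiters is not wanted.
    return DISPLAY_MATH.sub(
        lambda m: "[$]" + m.group(1).strip() + "[/$]", ocr_text
    )


if __name__ == "__main__":
    sample = (
        "The sample mean is\n"
        "\\[\n\\bar{x} = \\frac{1}{n}\\sum x_i\n\\]\n"
        "where n is the sample size."
    )
    print(convert_display_math(sample))
```

Text without any \[ ... \] block passes through unchanged, so a step like this could run on every OCR result regardless of whether it contains an isolated equation.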
Old description
When OCR'ing a chunk with a certain layout containing several LaTeX symbols or equations, an equation that is isolated between two paragraphs will be missing its [$] and [/$] tags.
This is what I OCR'd (the whole thing)
This has happened with multiple similar layouts when I OCR such a chunk. Note: if I OCR this equation individually, it works fine.
Here's the page of this PDF (the bug occurs with the equation at the bottom)
apstats2.pdf