olmobaldoni / logseq-formula-ocr-plugin

This Logseq plugin is designed to transform LaTeX formula images from the clipboard into LaTeX code using Transformers.
MIT License
10 stars 0 forks

Not giving the correct/ complete LaTeX #1

Closed Thiago-Assis-T closed 2 months ago

Thiago-Assis-T commented 5 months ago

Hello, I installed the plugin, but it didn't get a single LaTeX formula right. Examples:

image

output:

$$j(\varepsilon)={\frac{1}{\sigma\sqrt{$$

image

output:

$$v2(g)={\frac{(2-3)^{2$$

The same thing happens with both simpler and more complex examples...

I installed the plugin from inside Logseq, on nixos-unstable with Logseq 0.10.8. Feel free to ask more questions or request more info.

Edit: I forgot to put the outputs in code blocks, so they didn't render properly.

olmobaldoni commented 5 months ago

Hi! I think the problem is that Nougat is a model obtained from training on classic LaTeX documents, where the mathematical formulas are on a white background. Since you are inferring images with a black background, this could cause problems.

Do you also have this problem with formula images on a white background? When I can't get the correct LaTeX code, I generally try screenshots of different sizes.

If you have any other questions, don't hesitate to write.
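For anyone who wants to test the background hypothesis without retaking screenshots, here is a minimal sketch of the idea: invert the image only when it is mostly dark. It is not part of the plugin. The function name `ensure_light_background`, the threshold of 128, and the representation of the image as rows of 8-bit grayscale values are all assumptions made for illustration; in practice an imaging library such as Pillow would do the loading and conversion.

```python
def ensure_light_background(pixels, threshold=128):
    """Return a copy of an 8-bit grayscale image with a light background.

    `pixels` is a list of rows, each row a list of ints in 0..255.
    If the mean brightness is below `threshold`, the image is assumed
    to be light text on a dark background and every pixel is inverted.
    """
    flat = [p for row in pixels for p in row]
    if not flat:
        return []
    mean = sum(flat) / len(flat)
    if mean >= threshold:
        # Already mostly light: return an unmodified copy.
        return [row[:] for row in pixels]
    # Mostly dark: invert so the formula becomes dark on light.
    return [[255 - p for p in row] for row in pixels]


# Hypothetical 2x3 image: a single white stroke on a black background.
dark = [[0, 0, 255], [0, 0, 0]]
print(ensure_light_background(dark))
```

The mean-brightness test is a crude heuristic; it would misjudge an image where the formula itself covers most of the area.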

Thiago-Assis-T commented 5 months ago

I didn't think of that. Hold up, let me test the hypothesis.

Thiago-Assis-T commented 5 months ago

Yeah, kinda the same thing...

image

$$\Delta f=\nabla^{2}f=\nabla\cdot\nabla$$

image

$$\int_{a}^{b}f^{\prime}(t$$

Now they are just incomplete...

Also, the image size doesn't make a difference at all: the result for a given formula is always the same, regardless of image size.

Thiago-Assis-T commented 5 months ago

I tried looking on Hugging Face for some logs and found nothing. Is there such a thing?

zhuo121 commented 5 months ago

I have the same problem. When I copy a long formula, only the first part of it is output; the rest is missing.

image

$${\mathfrak{Z}}{i}(x{i}) : =\left{\sum{x{i}^{\$$

olmobaldoni commented 5 months ago

The author of the model has opened an issue here in which he reports the problem. It seems that Hugging Face's inference API cuts off the answer. The problem is also reported here
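Until the cut-off is fixed upstream, a truncated answer can at least be flagged before it is pasted into a page. The sketch below is a rough heuristic and not something the plugin does: it reports a formula as suspicious when braces or parentheses are left open, or when the string ends in a lone backslash. The function name `looks_truncated` is made up for this example.

```python
def looks_truncated(latex: str) -> bool:
    """Heuristic check for LaTeX output that was cut off mid-formula."""
    body = latex.strip().strip("$")
    braces = 0
    parens = 0
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                # A lone trailing backslash: cut off in the middle of a command.
                return True
            # Skip the character after the backslash so that escaped
            # delimiters such as \{ or \( are not counted.
            i += 2
            continue
        if ch == "{":
            braces += 1
        elif ch == "}":
            braces -= 1
        elif ch == "(":
            parens += 1
        elif ch == ")":
            parens -= 1
        i += 1
    # Anything still open at the end suggests a truncated answer.
    return braces > 0 or parens > 0


# Outputs quoted earlier in this thread.
print(looks_truncated(r"$$j(\varepsilon)={\frac{1}{\sigma\sqrt{$$"))
print(looks_truncated(r"$$\int_{a}^{b}f^{\prime}(t$$"))
print(looks_truncated(r"$$\Delta f=\nabla^{2}f=\nabla\cdot\nabla$$"))
```

The first two outputs are flagged, but the third is not: it stops after a complete `\nabla` with every delimiter closed, so a delimiter count cannot tell that the operand is missing. The check catches only some truncations.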

zhuo121 commented 5 months ago

Thanks a lot. It seems that this problem hasn't been solved yet and will still need some time.

Thiago-Assis-T commented 5 months ago

Thanks a lot for the reply, olmobaldoni. I'll check the issues you cited.

olmobaldoni commented 2 months ago

After a few months, I found the time to implement a workaround using Docker. It is now possible to run a Docker container that uses the original model without relying on the Hugging Face inference API. I have also made some changes to the commands used in Logseq.

Since the issue with Hugging Face does not appear to have been resolved yet, I will be closing this issue here.