breezedeus / Pix2Text-Mac

Pix2Text macOS App: A Mac Desktop App for Mathematical Formula Recognition and Text Recognition.
https://p2t.breezedeus.com
MIT License
 
[![Discord](https://img.shields.io/discord/1200765964434821260?label=Discord)](https://discord.gg/GgD87WM8Tf) [![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Fbreezedeus%2FPix2Text-Mac&label=Visitors&countColor=%23ff8a65&style=flat&labelStyle=none)](https://visitorbadge.io/status?path=https%3A%2F%2Fgithub.com%2Fbreezedeus%2FPix2Text-Mac) [![license](https://img.shields.io/github/license/breezedeus/pix2text)](./LICENSE) [![stars](https://img.shields.io/github/stars/breezedeus/pix2text-mac)](https://github.com/breezedeus/Pix2Text-Mac) ![last-commit](https://img.shields.io/github/last-commit/breezedeus/Pix2Text-Mac) [![Twitter](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2Fbreezedeus)](https://twitter.com/breezedeus)

[👩🏻‍💻 Pix2Text Online Service](https://p2t.breezedeus.com) | [👨🏻‍💻 Pix2Text Online Demo](https://huggingface.co/spaces/breezedeus/Pix2Text-Demo) | [📖 Online Doc](https://pix2text.readthedocs.io) | [💬 Contact](https://www.breezedeus.com/article/join-group)

[中文](./README_cn.md) | English

Pix2Text-Mac: A Mac desktop application for recognizing mathematical formulas

This project is a local OCR application for Mac, built on Pix2Text; it runs without an internet connection. It recognizes mathematical formula images from the clipboard and converts them to LaTeX, which can then be copied back to the clipboard. It also supports text recognition (Text OCR) on general images.

Note ⚠️: This application is only available for macOS.

The initial code of this project was forked from: horennel/LaTex-OCR_for_macOS. Special thanks to the author of this project.

Features

After you open the application, the Pix2Text icon appears in the Mac menu bar. It offers four OCR modes.

1. Text_Formula OCR: Recognizing images with both formulas and text

This mode can recognize images containing both mathematical formulas and text. The recognition result is in Markdown format, which can be pasted into the Pix2Text Online Service to view the rendered result.

For example, it can recognize the following image (assets/mixed-en.jpg):

![English mixed image](assets/mixed-en.jpg)

2. Formula OCR: Recognizing images with pure formulas

This mode can recognize images containing only mathematical formulas. The recognition result is in LaTeX format, which can be pasted into the Pix2Text Online Service to view the rendered result.

For example, it can recognize the following image (assets/math-formula-42.png):

![Pure formula image](assets/math-formula-42.png)

3. Text OCR: Recognizing images with pure text

This mode can recognize images containing only text. The recognition result is in plain text.

For example, it can recognize the following image (assets/text.jpg):

![Pure text image](assets/text.jpg)

4. Page OCR: Recognizing page screenshots with complex layouts

If an image has a complex layout, such as multiple columns, tables, or other structured content, use this mode. It additionally loads the Layout Analysis and Table Recognition models from pix2text~=1.1 to recognize all information in the image and merges the results into Markdown format. You can paste the result into the Pix2Text Online Service to view it rendered.

The recognition results are also saved to a local folder, set by the output_md_root_dir variable in the configuration file config.yaml (default: /tmp/output_mds). The parsing results are saved to a separate local folder, set by the output_debug_dir variable in config.yaml (default: /tmp/output_debugs). Edit these two variables to change the storage locations.
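To find the most recent Page OCR result on disk, a small helper like the one below may be handy. It is only a sketch: `latest_markdown` is a hypothetical helper that is not part of the app, and it assumes results are stored as `.md` files somewhere under the output folder, which the text above does not spell out. The default path is the documented default of output_md_root_dir.

```python
from pathlib import Path
from typing import Optional

# Documented default of output_md_root_dir; change it if you edited config.yaml.
DEFAULT_MD_ROOT = "/tmp/output_mds"


def latest_markdown(root: str = DEFAULT_MD_ROOT) -> Optional[Path]:
    """Return the most recently modified .md file under `root`, or None."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    # Assumption: results are saved as .md files, possibly inside subfolders.
    candidates = [p for p in root_path.rglob("*.md") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    newest = latest_markdown()
    print(newest if newest else f"No Markdown output found under {DEFAULT_MD_ROOT}")
```

If you changed output_md_root_dir, pass the new folder as the `root` argument.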

For example, it can recognize the following image (assets/page.png):

![Page screenshot](assets/page.png)
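For readers who want to reproduce the four modes outside the menu bar app, the sketch below shows how they could map onto the Pix2Text Python library. It is not the app's own code, which is not shown here. The method names follow the pix2text 1.1 Python API as I understand it, so check them against the Online Doc for your installed version. `MODE_TO_METHOD` and `recognize_image` are hypothetical names introduced for illustration.

```python
# Hypothetical mapping from the four app modes to pix2text 1.1 method names.
MODE_TO_METHOD = {
    "text_formula": "recognize_text_formula",  # mixed text and formulas
    "formula": "recognize_formula",            # pure formulas -> LaTeX
    "text": "recognize_text",                  # pure text
    "page": "recognize_page",                  # complex page layouts
}


def recognize_image(image_path: str, mode: str = "text_formula", **kwargs):
    """Run one of the four OCR modes on an image file.

    Return types differ by mode, so the result is passed through unchanged.
    """
    if mode not in MODE_TO_METHOD:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(MODE_TO_METHOD)}"
        )
    # Imported lazily so that a bad mode fails fast, before pix2text is loaded.
    from pix2text import Pix2Text

    p2t = Pix2Text.from_config()
    return getattr(p2t, MODE_TO_METHOD[mode])(image_path, **kwargs)


# Example usage (requires pix2text to be installed):
#   latex = recognize_image("assets/math-formula-42.png", mode="formula")
#   page = recognize_image("assets/page.png", mode="page")
```

Keeping the mode names in one dictionary means a new mode needs only one new entry.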

Installation

1. Clone the repository:

git clone https://github.com/breezedeus/Pix2Text-Mac

2. Install dependencies:

pip install -r requirements.txt

If you want to recognize text images in languages other than Simplified Chinese and English, please run the following command to install additional dependencies:

pip install "pix2text[multilingual]>=1.1.0.1"

3. Verify the installation is working correctly

Run the following command to check that the installed Pix2Text works:

p2t predict -l en,ch_sim --resized-shape 768 --file-type page -i assets/page.png -o output-page --save-debug-res output-debug-page
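As a complementary check, the installed pix2text version can be inspected from Python. The sketch below assumes the package is distributed under the name `pix2text` and tests it against the `~=1.1` compatible-release range mentioned for Page OCR. The helper names are hypothetical, and pre-release suffixes such as `rc1` are ignored for simplicity.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '1.1.0.1' -> (1, 1, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_compatible_with_1_1(version: str) -> bool:
    """True if `version` satisfies ~=1.1, i.e. >=1.1 and ==1.*."""
    release = parse_release(version)
    return release[0] == 1 and release >= (1, 1)


def installed_version(package: str = "pix2text") -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pix2text is not installed in this environment")
    else:
        print(f"pix2text {found}; compatible with ~=1.1: {is_compatible_with_1_1(found)}")
```

This only inspects package metadata. It does not load any models, so the `p2t predict` command above remains the real end-to-end test.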

4. Package the application:

python setup.py py2app -A

How to Use

Notes

Acknowledgments