UppuluriKalyani / ML-Nexus

ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
https://ml-nexus.vercel.app/
MIT License
67 stars, 122 forks

Text Extractor from images #843

Open Somyajain2004 opened 6 days ago

Somyajain2004 commented 6 days ago

Is your feature request related to a problem? Please describe.
Extracting text from images by hand is time-consuming and error-prone. Users need an efficient, accurate way to convert printed or handwritten text in images (such as receipts, documents, or notes) into editable, searchable digital text.

Describe the solution you'd like
Users should be able to upload an image and have the system automatically recognize and extract the text, converting it into an editable format such as plain text or PDF. The extracted text should be accurate and retain the original formatting where possible, so that it is usable for further processing or data entry.

Describe alternatives you've considered
- Third-party OCR tools: External OCR (Optical Character Recognition) tools such as Tesseract or the Google Vision API are available, but they require additional setup and API integration and, in some cases, incur costs. A built-in solution would streamline the process and improve the user experience.

Approach to be followed (optional)
- Use a pre-trained OCR model (e.g., Tesseract OCR) to recognize text from uploaded images.
- Build a simple user interface for users to upload images and view extracted text results.
- Implement text-editing and export options (e.g., save as .txt, .pdf, or .csv).
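As a rough illustration of the first and third steps above, here is a minimal Python sketch. It assumes the `pytesseract` and `Pillow` packages plus a local Tesseract install; the helper names (`ocr_with_tesseract`, `normalize_text`, `extract_to_txt`) are hypothetical and not part of ML-Nexus. The OCR backend is passed in as a function so it could be swapped for another engine, and only the .txt export is shown.

```python
from pathlib import Path
from typing import Callable


def ocr_with_tesseract(image_path: str) -> str:
    """Run Tesseract on one image file.

    Assumption: the `pytesseract` and `Pillow` packages and the Tesseract
    binary are installed. Imports are local so the rest of this module
    can be used without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def normalize_text(raw: str) -> str:
    """Strip trailing whitespace and collapse runs of blank lines."""
    cleaned = []
    for line in (l.rstrip() for l in raw.splitlines()):
        # Skip leading blank lines and repeated blank lines.
        if line == "" and (not cleaned or cleaned[-1] == ""):
            continue
        cleaned.append(line)
    # Drop a trailing blank line, if any.
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return "\n".join(cleaned)


def extract_to_txt(
    image_path: str,
    out_path: str,
    ocr: Callable[[str], str] = ocr_with_tesseract,
) -> str:
    """OCR an image, tidy the result, and save it as a UTF-8 .txt file."""
    text = normalize_text(ocr(image_path))
    Path(out_path).write_text(text + "\n", encoding="utf-8")
    return text
```

Calling `extract_to_txt("slide.png", "slide.txt")` (file names are placeholders) would write the recognized text to disk and return it, so a UI layer could show it in an editable text box before the user exports it.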

Additional context
Expected results:

Input: image

Output:

Textual Conventions (I)
MediumType, MediumAddress
ethernet(7), tokenring(9), fddi(15)
PeerType, PeerAddress
ipv4(1), ipv6(2), nsap(3), ipx(11), appletalk(12), decnet(13)
AdjacentType, AdjacentAddress
A superset of MediumType and PeerType
RTFM WG 3 The University of Auckland

github-actions[bot] commented 6 days ago

Thanks for creating the issue in ML-Nexus! 🎉 Before you start working on your PR, pull the latest changes to avoid any merge conflicts.

Somyajain2004 commented 6 days ago

Could you add a level label to this issue?