microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License
21.26k stars 3.12k forks source link

Image to Text (OCR) - Kernel Content #5846

Closed sophialagerkranspandey closed 1 month ago

sophialagerkranspandey commented 4 months ago

User requirements: For Kernel Content modalities (Image to Text, Audio to Text, etc) customer identified their top priority of Image to Text, specifically for business PDF's, graphs and having the ability to have detailed text extracted from the image.

Models/ Examples:

  1. OCR for images - Azure AI Vision - Azure AI services | Microsoft Learn
  2. Automatically extract text and structured data from documents with Amazon Textract | AWS Machine Learning Blog
  3. Donut (huggingface.co) o jinhybr/OCR-Donut-CORD · Hugging Face

Introduce new model types

  1. Service that takes an Image and supports OCR (name to be determined)
  2. Introduce content type with text for content types (name to be determined)
  3. Connectors to Azure OCR, AWS Textract (name to be determined)
github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 14 days since being marked as stale.