Authors, editors and contributors
Isabelle Gribomont, Liz Fischer, Ryan Cordell, Clemens Neudecker
Topics (keywords)
DH, Open Education, Open Access, Python
Learning outcomes
After completing this lesson, you will be able to:
Combine Google Vision’s character recognition with Tesseract’s layout detection to generate high-quality OCR outputs for a wide range of documents
Accurately convert PDF files into plain text
Understand the main considerations to keep in mind when converting a PDF to plain text
Abstract
Google Vision and Tesseract are both popular and powerful OCR tools, but each has its weaknesses. In this lesson, you will learn how to combine the two to make the most of their individual strengths and achieve more accurate OCR results than either tool produces alone.
Title of the resource
OCR with Google Vision API and Tesseract
Resource type
External Resource