Hi, first of all the library is really good.
I tried to run this library on Windows 10 and it doesn't work. I believe I did everything right: I installed Tesseract and ran the following code:
    from multilingual_pdf2text.pdf2text import PDF2Text
    from multilingual_pdf2text.models.document_model.document import Document
    import logging
    from utils import write_txt

    logging.basicConfig(level=logging.INFO)


    def main():
        ## create document for extraction with configurations
        pdf_document = Document(document_path="./pdfs_samples/page1.pdf", language="por")
        pdf2text = PDF2Text(document=pdf_document)
        content = pdf2text.extract()

        for page in content:
            print(page["text"])
            write_txt(page["text"], filename="output_multilingual_pdf2text1.txt")


    if __name__ == "__main__":
        main()
I ran this same code on Linux (Ubuntu 20.04) and it worked perfectly, so I was wondering whether the library supports Windows?
@richecr As long as you are able to install Tesseract on Windows, this library should work fine. You can take a look at this article: "Installing and using Tesseract 4 on windows 10".
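Since the reply hinges on Tesseract being installed correctly, a quick way to check that from Python might look like the sketch below. It only tests whether a `tesseract` executable can be found. The helper name `find_tesseract` and the two Windows install folders are assumptions for illustration and are not part of multilingual_pdf2text, so adjust the paths to wherever the installer put Tesseract on your machine.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed typical install locations on Windows (hypothetical defaults);
# change these if Tesseract was installed somewhere else.
CANDIDATE_PATHS = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
)


def find_tesseract(candidates: Iterable[str] = CANDIDATE_PATHS) -> Optional[str]:
    """Return a path to the tesseract executable, or None if it is not found."""
    # First check whether the executable is reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the assumed install locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = find_tesseract()
    if found:
        print(f"Tesseract found at: {found}")
    else:
        print("Tesseract not found; add its install folder to PATH.")
```

If the script reports that Tesseract was found only through one of the fallback folders and not through PATH, adding that folder to the PATH environment variable and reopening the terminal is a reasonable first thing to try.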