buda-base / py-tiblegenc

Python script to convert PDFs using non-Unicode Tibetan fonts into Unicode text
Apache License 2.0

How to use #2

Open wykananda opened 1 year ago

wykananda commented 1 year ago

I've installed Python and pip, and installed your code. However, I can't figure out how to use it to convert some PDF files we have that use the Tibetan Machine font to Unicode. Which Python scripts do we run (pdfminer_text_converter, char_converter, deduff_pdf), and what are the parameters?

eroux commented 3 months ago

Hello, I'm only reading your issue now, one year late! Sorry for that.

What I usually do is put the PDFs I need to convert in an input/ folder and then run python3 deduff_pdf.py. It creates an output/ folder with the converted texts.
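
For anyone who wants to script that workflow, here is a minimal sketch. It assumes deduff_pdf.py takes no arguments, is run from the repository root, and reads input/ and writes output/ relative to that directory, as the comment above suggests; the helper names stage_pdfs and run_converter are hypothetical and not part of py-tiblegenc.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def stage_pdfs(source_dir, repo_dir):
    """Copy every *.pdf from source_dir into repo_dir/input; return the copies."""
    input_dir = Path(repo_dir) / "input"
    input_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # Note: "*.pdf" is matched case-sensitively on most Linux filesystems.
    for pdf in sorted(Path(source_dir).glob("*.pdf")):
        copied.append(Path(shutil.copy2(pdf, input_dir / pdf.name)))
    return copied


def run_converter(repo_dir):
    """Run deduff_pdf.py from the repository root and list what it produced."""
    # Assumption: the script needs no command-line parameters.
    subprocess.run([sys.executable, "deduff_pdf.py"], cwd=repo_dir, check=True)
    # Assumption: converted texts land in repo_dir/output.
    return sorted((Path(repo_dir) / "output").iterdir())


# Example usage (paths are placeholders):
# stage_pdfs("/path/to/my/pdfs", "/path/to/py-tiblegenc")
# for result in run_converter("/path/to/py-tiblegenc"):
#     print(result)
```

Doing the same by hand amounts to creating input/ inside the checkout, copying the PDFs there, and running python3 deduff_pdf.py from that directory.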