codeforsanjose / city-agenda-scraper


pdf to text.py doesn't work for San Jose city documents #6

Open krammy19 opened 3 years ago

krammy19 commented 3 years ago

There's a bug in our script that extracts text from PDFs. Both San Jose test documents (SanJose_Legistar and SanJose_Legistar2) produce blank text output.

Here's the error log:

SanJose_Legistar.pdf unsuccessful <class 'UnicodeEncodeError'> ('charmap'

wall of text!!!!

character maps to <undefined>
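For context, a `'charmap'` `UnicodeEncodeError` in Python usually means a string was encoded with a legacy code page (such as cp1252, a common default on Windows) that has no mapping for one of its characters. The script's code is not shown in this thread, so the following is only a minimal sketch under that assumption: the `write_text` helper and the sample string are hypothetical, not taken from the repository. It reproduces the same class of error and shows that writing with an explicit UTF-8 encoding avoids it.

```python
import os
import tempfile

# Hypothetical sample: PDFs often contain ligatures such as U+FB01 ("fi"),
# which cp1252 cannot represent.
sample = "Agenda item 1: \ufb01nal budget"

# Reproduce the failure: encoding with a legacy code page raises
# UnicodeEncodeError for characters it has no mapping for.
try:
    sample.encode("cp1252")
    reproduced = False
except UnicodeEncodeError:
    reproduced = True


def write_text(path, text, encoding="utf-8"):
    """Write text with an explicit encoding instead of the platform default.

    Hypothetical helper for illustration; not part of the repository.
    """
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


# Writing as UTF-8 succeeds and the text round-trips unchanged.
out_path = os.path.join(tempfile.mkdtemp(), "SanJose_sample.txt")
write_text(out_path, sample)
with open(out_path, encoding="utf-8") as f:
    round_tripped = f.read()
```

If the script opens its output file without an `encoding` argument, passing `encoding="utf-8"` there may be enough; if the error instead comes from printing to a Windows console, the fix would need to be applied at that point.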

jbdundas commented 3 years ago

I will take this up. Thanks for the heads up!