TKFRvisionOfficial / bszet_substitution_plan

Parsing and Image creation service of the BSZET substitution plan bot
GNU Affero General Public License v3.0

Combination of image and text in pdf #35

Open PBahner opened 2 years ago

PBahner commented 2 years ago

Attachment: vertretungsplan-bgy06122021.pdf

The first page of this PDF contains both text and the table as an image. Camelot runs and gives this warning because it can't find any table:

UserWarning: No tables found in table area 1 [stream.py:365]

The image-to-text recognition isn't executed, because a warning is not an error and Camelot did find text on the page...

We need to change the error handling in _convert_pdf_todataframes in util.py
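
One possible direction, as a rough sketch only: record the warnings raised while parsing and treat a "No tables found" warning the same way as a failed parse, so the image-to-text step still runs. The function name `tables_or_ocr_fallback` and both callables are hypothetical stand-ins, not the actual code in util.py; `parse_tables` would wrap the existing Camelot call and `run_ocr` the existing image recognition.

```python
import warnings
from typing import Any, Callable, Sequence


def tables_or_ocr_fallback(
    parse_tables: Callable[[], Sequence[Any]],
    run_ocr: Callable[[], Sequence[Any]],
) -> Sequence[Any]:
    """Return the parsed tables, or the OCR result if the parser found none.

    Both callables are hypothetical placeholders for the real parsing
    and image-recognition steps.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of letting it be printed or filtered.
        warnings.simplefilter("always")
        tables = parse_tables()

    # The parser reports a missing table as a UserWarning, not an exception.
    no_tables_warned = any(
        issubclass(w.category, UserWarning) and "No tables found" in str(w.message)
        for w in caught
    )

    if no_tables_warned or len(tables) == 0:
        return run_ocr()
    return tables


if __name__ == "__main__":
    def parse_text_only_page():
        # Simulates the situation from this issue: text is found, the table is not.
        warnings.warn("No tables found in table area 1", UserWarning)
        return ["text-only frame"]

    print(tables_or_ocr_fallback(parse_text_only_page, lambda: ["ocr frame"]))
```

Note that this sketch falls back for the whole call. If several pages are parsed at once, a warning on one page would discard the tables found on the others, so the check would probably have to run per page.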