dmacko232 closed this issue 1 year ago
I do not have time to fix this. fname[-6:-4] is likely wrong as well, since triple-digit page counts are reasonable. Better to split on the "-" and the ".".
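A minimal sketch of the split-based approach suggested above, not the fix that was actually merged. It assumes the OCR page images are named like image-06.png (inferred from the error message in this issue); the function name page_number_from_filename and the example paths are hypothetical.

```python
from pathlib import Path


def page_number_from_filename(fname: str) -> int:
    """Extract the page number from a name like '<dir>/image-06.png'.

    Hypothetical helper: assumes the page number is the text between the
    last "-" and the file extension, with any number of digits.
    """
    # Path(...).stem drops the directory and the extension: "image-06"
    stem = Path(fname).stem
    # rpartition splits on the LAST "-", so dashes earlier in the name are harmless
    _, _, page = stem.rpartition("-")
    return int(page)


# Two-digit and three-digit page numbers both parse, regardless of path length.
print(page_number_from_filename("/tmp/tmpzvoklker/image-06.png"))
print(page_number_from_filename("/tmp/tmpzvoklker/image-123.png"))
```

Unlike a fixed slice, this does not depend on the length of the temporary directory name or on how many digits the page number has. A name without a numeric suffix still raises ValueError, so the caller's try-except remains relevant.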
Thanks for noticing, I'll try to fix that. When called via the convert_certification_report() function (and others), it should be wrapped in a try-except block, so I don't see it as critical. But I'll look into it.
I drafted a fix:
works on my machine :)
When OCR is used to convert a PDF to text, it fails with the error message:
Error during OCR of, using garbage: invalid literal for int() with base 10: 'mpzvoklker/image-06'
The issue is most likely caused by incorrect string slicing (fname[6:-4] instead of fname[-6:-4]): https://github.com/crocs-muni/sec-certs/blob/9d1d44d04532609524fd862697179e179a6ea92c/src/sec_certs/utils/pdf.py#L64