Closed: summarizepaper closed this issue 1 year ago
Looks like the extract_pages function doesn't support async files.
How do we then extract text from a pdf with aiofiles?
I don't know, that's somewhat out of scope for aiofiles. You might want to try asking the authors of your pdf library though ;)
Hello, I'm really struggling to read my pdf files asynchronously with aiofiles. I want to extract the text from pdfs.
The routine that works is:
with open(pdf_filename, 'rb') as file:
But if I replace "with open(pdf_filename, 'rb') as file" by "async with aiofiles.open(pdf_filename, 'rb') as file", then the line "async for page in extract_pages(file)" fails and it says:
async for page in extract_pages(file):
TypeError: 'async for' requires an object with __aiter__ method, got generator
So how do I get the file returned by aiofiles to be like a normal file with __aiter__?
Many thanks if you can tell me what is going on.
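For anyone landing here with the same error: the TypeError says that extract_pages returned a plain generator, which works with "for" but not with "async for". One possible workaround, sketched below under some assumptions, is to read the bytes asynchronously, wrap them in io.BytesIO, and run the blocking parser in a worker thread with asyncio.to_thread (Python 3.9+). To keep the sketch self-contained, fake_extract_pages is a hypothetical stand-in for the PDF library's function, extract_pages_async is a made-up helper name, and the aiofiles read is shown only in a comment. Whether your PDF library accepts an in-memory file object is something to check in its documentation.

```python
import asyncio
import io
from typing import BinaryIO, Callable, Iterable, List


def fake_extract_pages(stream: BinaryIO) -> Iterable[str]:
    """Hypothetical stand-in for a synchronous, generator-based PDF parser.

    It is a plain generator, so it supports `for` but not `async for`.
    """
    # Assumption for the demo only: a form feed byte separates "pages".
    for chunk in stream.read().split(b"\f"):
        yield chunk.decode("utf-8")


async def extract_pages_async(
    data: bytes,
    parser: Callable[[BinaryIO], Iterable[str]] = fake_extract_pages,
) -> List[str]:
    """Consume a blocking generator in a worker thread and return its items."""

    def consume() -> List[str]:
        # A normal `for`-style iteration happens here, off the event loop.
        return list(parser(io.BytesIO(data)))

    return await asyncio.to_thread(consume)


async def main() -> List[str]:
    # With aiofiles, the bytes would come from something like:
    #     async with aiofiles.open(pdf_filename, 'rb') as file:
    #         data = await file.read()
    data = b"page one\fpage two"  # placeholder bytes, not a real PDF
    return await extract_pages_async(data)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The parsing itself stays synchronous; only the waiting is moved off the event loop, so other coroutines can keep running while a large file is processed.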