venveo / craft-documentsearch

Automatically extract keywords from documents in Craft CMS and inject them into the search index

no keywords found #9

Closed koobadev closed 2 years ago

koobadev commented 2 years ago

Perhaps I've configured this incorrectly, but I can't seem to get any PDFs returned in my search results. I can see my PDF assets listed in the 'searchindex' table, with the filename in the 'keywords' column. Should I also be seeing extracted text in that same column in the database? Thanks!

Mosnar commented 2 years ago

Have you confirmed the pdftotext executable path is correct and PHP has permission to access it? If so, the next thing to verify is that your PDF actually has real text embedded in it, not images of text (e.g. photocopies).

If all that looks correct, could you check your logs for any errors produced by PHP when a PDF asset is saved?
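The first two checks above can also be run by hand, outside of Craft. Here's a rough sketch in Python, not something the plugin ships: the function name `check_pdftotext` and the example paths are made up for illustration, and it assumes a Poppler/Xpdf-style `pdftotext` that writes to stdout when the output file is given as `-`. To say anything about PHP's permissions, it would need to be run as the same user your web server runs PHP as.

```python
import os
import subprocess


def check_pdftotext(binary_path, pdf_path, timeout=30):
    """Return a short diagnosis string for a pdftotext setup (hypothetical helper)."""
    # 1. Does the configured path point at a real file?
    if not os.path.isfile(binary_path):
        return "binary not found"
    # 2. Can the current user execute it?
    if not os.access(binary_path, os.X_OK):
        return "binary not executable by this user"
    if not os.path.isfile(pdf_path):
        return "pdf not found"
    # 3. Run the extraction; "-" as the output file sends the text to stdout.
    result = subprocess.run(
        [binary_path, pdf_path, "-"],
        capture_output=True,
        text=True,
        errors="replace",
        timeout=timeout,
    )
    if result.returncode != 0:
        return "pdftotext failed: " + result.stderr.strip()
    # 4. An empty result usually means there is no embedded text to extract.
    if not result.stdout.strip():
        return "no text extracted (the PDF may contain only images)"
    return "ok: extracted %d characters" % len(result.stdout)
```

For example, `print(check_pdftotext("/usr/bin/pdftotext", "/path/to/sample.pdf"))` (both paths are placeholders for your own) would report which of the steps fails, if any.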

koobadev commented 2 years ago

Thanks for your reply, @Mosnar. I believe the path is correct, as I'm no longer seeing the associated error message on the settings interface. I can confirm I've been testing PDFs with real text. I do see that 'Updating search index' has failed in the queue manager within the control panel when I save a PDF to assets; however, I'm not seeing any further details on why this occurred, and there's no obvious issue reported in web.logs.