kwalcock closed this issue 5 years ago
@kwalcock 8e85eba8fd2b5fd607142ce0a6e5ffdf.pdf is a scanned image of a document, and scanned images are excluded from text extraction at this time.
Thanks for checking. I've mapped the extracted text to "" if the value is missing.
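For illustration only, the fallback described above could look roughly like the sketch below. It is not the code used in the project: the function name and the `extracted_text` field name are hypothetical, and the record is assumed to be a plain dictionary such as a parsed search hit.

```python
def extracted_text_or_empty(record: dict, field: str = "extracted_text") -> str:
    """Return the record's extracted text, or "" when it is missing.

    The field name "extracted_text" is an assumption made for this sketch;
    the real record layout may use a different key.
    """
    value = record.get(field)  # None when the key is absent
    # Treat an absent key, a null value, or a non-string value as "no text".
    return value if isinstance(value, str) else ""


# A record for a scanned PDF might carry metadata but no extracted text.
scanned = {"file_name": "8e85eba8fd2b5fd607142ce0a6e5ffdf.pdf"}
print(repr(extracted_text_or_empty(scanned)))
```

Returning an empty string rather than raising keeps downstream consumers working for documents, such as image scans, that have no text to extract.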
@kwalcock Thanks for the update. Can you please close the issue?
The metadata is returned by Elasticsearch, but the extracted text is not in the record. The file is not in the zip collection, so I am not able to look at the raw file to see what might be wrong.