WorldModelers / DART

Two Six Labs Data Acquisition & Reasoning Toolkit

Document 8e85eba8fd2b5fd607142ce0a6e5ffdf has no extracted text #12

Closed kwalcock closed 5 years ago

kwalcock commented 5 years ago

The metadata is returned from Elasticsearch, but the extracted text is not in the record. The file is not in the zip collection either, so I am not able to look at the raw file to see what might be wrong.

yanzv commented 5 years ago

@kwalcock 8e85eba8fd2b5fd607142ce0a6e5ffdf.pdf is an image scan of a document. Image scans are excluded from text extraction at this time.

kwalcock commented 5 years ago

Thanks for checking. I've mapped the extracted text to "" if the value is missing.
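
A minimal sketch of that kind of fallback, not the actual code used here. It assumes the record arrives as a plain dictionary and that the text lives under a field named `extracted_text`; both the field name and the helper `extracted_text_or_empty` are hypothetical.

```python
from typing import Any, Mapping


def extracted_text_or_empty(
    record: Mapping[str, Any],
    field: str = "extracted_text",  # assumed field name, adjust to the real schema
) -> str:
    """Return the record's extracted text, or "" when it is missing or not a string."""
    value = record.get(field)
    # Treat an absent key, None, and non-string values the same way: no text.
    return value if isinstance(value, str) else ""


# A scanned document comes back with metadata only, so the text maps to "".
scanned = {"document_id": "8e85eba8fd2b5fd607142ce0a6e5ffdf"}
print(repr(extracted_text_or_empty(scanned)))
```

With this in place, downstream readers can treat an empty string as "no text available" instead of checking for a missing key.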

yanzv commented 5 years ago

@kwalcock Thanks for the update. Can you please close the issue?