equalitie / open-corroborator

Data analysis and fact corroboration
http://equalitie.github.io/open-corroborator/

Index and search within attachments #59

Closed by graphiclunarkid 8 years ago

graphiclunarkid commented 8 years ago

Index and search within attachments: *.docx, *.xlsx, *.rtf, *.txt, *.pdf.

ggaughan commented 8 years ago

The contents of documents attached to bulletins should now be indexed.

(This can be switched off via the INDEX_MEDIA_CONTENT setting.)

Deployed to demo2 and demo3.
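
For illustration only, here is a rough sketch of how a flag like INDEX_MEDIA_CONTENT could gate attachment indexing. This is not the project's actual code: the function name `build_index_text`, its parameters, and the default value of the flag are all assumptions made for the example.

```python
# Hypothetical sketch, not Open Corroborator's actual implementation.
# Only the setting name INDEX_MEDIA_CONTENT comes from the comment above;
# its default value here is assumed.
INDEX_MEDIA_CONTENT = True


def build_index_text(description, attachment_texts,
                     index_media_content=INDEX_MEDIA_CONTENT):
    """Return the text that would be indexed for one bulletin.

    description: the bulletin's own text.
    attachment_texts: text already extracted from each attached document.
    """
    parts = [description]
    if index_media_content:
        # Include attachment text only when the flag is on; skip empty extracts.
        parts.extend(text for text in attachment_texts if text)
    return "\n".join(parts)
```

With the flag off, only the bulletin's own text is indexed, so a string that appears only inside an attachment would no longer match.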

florianap commented 8 years ago

I tried to look for strings contained in a PDF and in a doc file uploaded to the media. Perhaps the search worked, but what I see afterwards is just a list of bulletins, with nothing telling me where the string appears. If the match is in a file, I should be able to identify which file contains it.

graphiclunarkid commented 8 years ago

If the previous situation was that the bulletins weren't returned at all because the contents of the documents weren't being indexed, I think we can close this issue, since the search does now index attachments. We should also raise a separate issue to improve the presentation of search results.
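
As a starting point for that follow-up issue, here is a minimal sketch of results that name the matching file, as florianap requested. It uses a plain in-memory substring search rather than the project's real search backend, and the function name `find_matching_attachments` and the data layout are hypothetical.

```python
# Hypothetical sketch: a naive substring search standing in for the real index.
def find_matching_attachments(query, bulletins):
    """Map each matching bulletin title to the attachment filenames that matched.

    bulletins: dict of bulletin title -> dict of filename -> extracted text.
    """
    needle = query.lower()
    results = {}
    for title, attachments in bulletins.items():
        # Case-insensitive match against each attachment's extracted text.
        hits = [name for name, text in attachments.items()
                if needle in text.lower()]
        if hits:
            results[title] = hits
    return results


bulletins = {
    "Bulletin A": {"report.pdf": "Witness statement from March",
                   "notes.txt": "nothing relevant"},
    "Bulletin B": {"list.docx": "Inventory only"},
}
matches = find_matching_attachments("witness", bulletins)
```

Here `matches` lists `report.pdf` under "Bulletin A", so the results page could show the file name next to the bulletin instead of the bulletin alone.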