elastic / elasticsearch-mapper-attachments

Mapper Attachments Type plugin for Elasticsearch
https://www.elastic.co
Apache License 2.0

Searching iWork files #210

Closed kirka121 closed 8 years ago

kirka121 commented 8 years ago

I can parse and search Microsoft, plain text, HTML, and a plethora of other files without issues, but when I try to index any of the iWork files, it fails.

I am using the elasticsearch::rails and elasticsearch::models plugins for my Rails website. My ES version is 1.7, mapper attachments is at 2.7.1, and Tika is at 1.10. The docs say Tika dropped support for iWork between versions 1.0 and 1.5; however, 1.10 should have it.

The same problem persists on localhost as well as on Heroku with Bonsai ElasticSearch (I contacted them, and they said everything should be fine on their end if .docx works perfectly, and told me to talk to you guys as well as the Tika mailing list).

More info here: http://stackoverflow.com/questions/36500827/elasticsaerch-rails-cant-search-for-mac-extensions

dadoonet commented 8 years ago

We reduced the number of dependencies, which also reduced the number of formats that Tika normally exposes. See #163.

We won't modify it in the mapper attachments plugin, as it is now deprecated.

But I encourage you to submit a feature request (and you can link to this one if you wish) in the elasticsearch repo, where the ingest-attachment plugin now lives. Ideally, if you could provide a sample document you wish to support, that would make a nice integration test for this plugin.

Thanks!