benjamin-kett closed 10 years ago
Could you share a TNEF document I can use for testing? Can I reuse that document in tests?
I would like to check if Tika 1.5 solved it.
If you agree, could you sign the CLA? http://www.elasticsearch.org/contributor-agreement/
No news on this. Closing. Feel free to reopen with any new information.
When indexing a TNEF document, the contents of the winmail.dat attachment are not searchable, even though Tika 1.4 has a TnefParser and (I think) is being run on the document. The index returns a content type of message/rfc822 when indexing an email containing a winmail.dat attachment, or text/plain; charset=windows-1252 when indexing the raw attachment.
Both of these types are incorrect: the content type should be application/ms-tnef. The attachment content also remains in base64 and isn't searchable. I have tested this on plugin version 1.9.0 with es 0.9.11.
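
For anyone trying to reproduce this, a quick way to confirm that an attachment really is TNEF before indexing it is to look for the TNEF signature at the start of the decoded bytes. The following is only a minimal Python sketch, independent of Tika and of the plugin. It assumes the 32-bit little-endian signature 0x223E9F78 at offset 0, and the function names and the sample bytes are hypothetical, made up for illustration.

```python
import base64
import struct

# Assumption: a TNEF stream starts with this 32-bit little-endian signature.
TNEF_SIGNATURE = 0x223E9F78


def looks_like_tnef(data: bytes) -> bool:
    """Return True if the raw bytes start with the TNEF signature."""
    if len(data) < 4:
        return False
    (signature,) = struct.unpack("<I", data[:4])
    return signature == TNEF_SIGNATURE


def base64_looks_like_tnef(encoded: str) -> bool:
    """Decode a base64 attachment body and check it for the TNEF signature."""
    return looks_like_tnef(base64.b64decode(encoded))


# Hypothetical sample: the signature plus a 16-bit key, not a real winmail.dat.
sample = struct.pack("<IH", TNEF_SIGNATURE, 1)
encoded_sample = base64.b64encode(sample).decode("ascii")
```

If `base64_looks_like_tnef` returns True for the attachment body but the index still reports message/rfc822 or text/plain, the detection step rather than the file itself is the likely place to look.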