For Tika to parse embedded documents recursively, the embedded parser has to be set in the parse context. If my reading of SolrCellBuilder is correct, it does not do that, so Tika will only extract the contents of the container document and will miss attachments.
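For reference, this is roughly what "setting the embedded parser in the parse context" looks like when calling Tika directly from Java. It is only a sketch of standalone Tika usage, not of the morphline/SolrCell configuration; the class name `RecursiveTikaExample` and its helper methods are made up for illustration, and it assumes the Tika parser modules are on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of Tika or Solr.
public class RecursiveTikaExample {

    // Build a ParseContext that registers the given parser under Parser.class,
    // which is where Tika looks up the parser to use for embedded documents.
    static ParseContext recursiveContext(Parser parser) {
        ParseContext context = new ParseContext();
        context.set(Parser.class, parser);
        return context;
    }

    // Extract text from the container and, via the context, its attachments.
    static String extractText(InputStream stream)
            throws IOException, SAXException, TikaException {
        Parser parser = new AutoDetectParser();
        // A write limit of -1 disables the default character limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        parser.parse(stream, handler, new Metadata(), recursiveContext(parser));
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a container file, e.g. an email with attachments.
        try (InputStream stream = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(extractText(stream));
        }
    }
}
```

The relevant line is `context.set(Parser.class, parser)`. Whether the same thing can be achieved through the SolrCell morphline configuration is what the links below discuss.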
See: https://issues.apache.org/jira/browse/SOLR-7189 and http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201507.mbox/%3CCAN4YXve24W++MKK1U-n0rp6JKNf-FQB10_ggRw4W4-Xy8dgP-w@mail.gmail.com%3E