10up / ElasticPress

A fast and flexible search and query engine for WordPress.
https://elasticpress.io
GNU General Public License v2.0

Text files not ingested #3874

Closed (jerasokcm closed this issue 5 months ago)

jerasokcm commented 5 months ago

Describe your question

I'm using ElasticPress 5.0.0 with WordPress 6.4.3. The Documents feature is enabled and works well when the attached document is a PDF or a Microsoft Word file. However, when I use a plain TXT file, the document seems to be ignored by the Elasticsearch index. I tested with the same content in both formats, including some distinctive words. If the text is inside a Word document, those words are found. If I use a TXT document with the same content, there are no matches.

Any suggestions would be appreciated.

Best regards.

felipeelia commented 5 months ago

Hi @jerasokcm,

You can get it working by adding this snippet to your codebase:

add_filter(
    'ep_allowed_documents_ingest_mime_types',
    function ( $mime_types ) {
        $mime_types['txt'] = 'text/plain';
        return $mime_types;
    }
);
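
For anyone who prefers a named callback (for example in a small custom plugin), here is a minimal sketch of the same filter. The function name `my_prefix_allow_txt_ingest` is hypothetical, and the `function_exists` guard is an assumption added only so the file can be loaded outside WordPress; the filter name and the `txt` entry are taken from the snippet above.

```php
<?php
/**
 * Add plain-text files to the MIME types that the
 * 'ep_allowed_documents_ingest_mime_types' filter exposes.
 * The function name is hypothetical; only the filter name comes from this thread.
 *
 * @param array $mime_types Map of file extension => MIME type.
 * @return array The same map with a 'txt' entry added.
 */
function my_prefix_allow_txt_ingest( $mime_types ) {
    $mime_types['txt'] = 'text/plain';
    return $mime_types;
}

// Register the callback only when WordPress is loaded (assumed guard).
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'ep_allowed_documents_ingest_mime_types', 'my_prefix_allow_txt_ingest' );
}
```

TXT attachments that were indexed before the filter was added will presumably need to be re-synced before their content becomes searchable.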

jerasokcm commented 5 months ago

Thanks, @felipeelia. I will give it a try.