salyh / elasticsearch-imap

IMAP and POP3 email importer for Elasticsearch (no river anymore)
Apache License 2.0

Add river configuration property headers_to_fields #8

Closed hansjorg closed 10 years ago

hansjorg commented 10 years ago

I ran into a small problem when indexing mailing list archives. Since the e-mail headers are stored as an array of inner objects, the names and values get flattened when indexing. This means that while it's possible to filter on a header like "Message-ID" by using the field name "header.value", it's not possible to know whether the match came from that header or from another one, like "References" or "In-Reply-To".
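To illustrate the flattening problem, here is a rough Python sketch that simulates it outside of Elasticsearch. The field names "header.name" and "header.value" follow the description above; the helper functions and the sample message are hypothetical and only model the behavior, they are not part of the river.

```python
# Sketch only: models how an array of inner objects loses the pairing
# between each header's name and value once it is flattened.

def flatten_headers(headers):
    """Collapse a list of {name, value} objects into two multi-valued fields."""
    return {
        "header.name": [h["name"] for h in headers],
        "header.value": [h["value"] for h in headers],
    }


def matches_value(flat_doc, value):
    """Model a filter on header.value: it only sees the flat list of values."""
    return value in flat_doc["header.value"]


# Hypothetical reply message: its own Message-ID plus a reference to another mail.
reply_headers = [
    {"name": "Message-ID", "value": "<b@example.org>"},
    {"name": "In-Reply-To", "value": "<a@example.org>"},
]

flat_reply = flatten_headers(reply_headers)

# Filtering for the Message-ID "<a@example.org>" also hits this reply,
# because the value appears under In-Reply-To and the pairing is gone.
hit = matches_value(flat_reply, "<a@example.org>")
```

In this model `hit` is true even though the reply's own Message-ID is a different value, which is the ambiguity described above.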

Since one probably doesn't want to create fields for all headers, I've added a new river configuration property (headers_to_fields) where one can list headers that should be copied to fields.

A header name like "Message-ID" creates a field with name "header_message_id".
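As a rough sketch of the idea (in Python rather than the river's own code), copying the listed headers to fields could look like the following. The naming rule is an assumption generalized from the single "Message-ID" example above (lowercase, hyphens to underscores, "header_" prefix). The function names, the document layout, and the case-insensitive matching of header names are also assumptions made for illustration.

```python
# Sketch only: not the river's actual implementation.

def header_field_name(header_name):
    """Assumed rule: "Message-ID" -> "header_message_id"."""
    return "header_" + header_name.lower().replace("-", "_")


def copy_headers_to_fields(doc, headers_to_fields):
    """Return a copy of doc with the listed headers copied to top-level fields.

    `doc["header"]` is assumed to be a list of {"name": ..., "value": ...}
    objects. Header names are compared case-insensitively (an assumption).
    """
    wanted = {name.lower() for name in headers_to_fields}
    result = dict(doc)
    for header in doc.get("header", []):
        if header["name"].lower() in wanted:
            field = header_field_name(header["name"])
            # A header may occur more than once, so collect values in a list.
            result.setdefault(field, []).append(header["value"])
    return result


# Hypothetical message and configuration value.
message = {
    "subject": "Re: hello",
    "header": [
        {"name": "Message-ID", "value": "<b@example.org>"},
        {"name": "In-Reply-To", "value": "<a@example.org>"},
        {"name": "X-Mailer", "value": "example"},
    ],
}

indexed = copy_headers_to_fields(message, ["Message-ID", "In-Reply-To"])
```

With the headers copied out like this, a filter on "header_message_id" only matches the message's own Message-ID, and headers that are not listed (here "X-Mailer") do not get a field of their own.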

salyh commented 10 years ago

Hi, thanks a lot for this, much appreciated.