logstash-plugins / logstash-input-jdbc

Logstash Plugin for JDBC Inputs
Apache License 2.0

Support for Nested Documents and Arrays #42

Open ghayes opened 9 years ago

ghayes commented 9 years ago

In other implementations of JDBC input tools for ElasticSearch, you can specify Nested Documents or Arrays based on the naming convention of the column. A column named "contact.phone" would create a nested document with a phone field, and "contact[phone]" would create an array of phone numbers. So, a result set like:

uid, contact[phone]
1, 8001111111
1, 8002222222

would result in a document in ElasticSearch like 1, [8001111111,8002222222].
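
To make the requested behavior concrete, here is a minimal Python sketch of the grouping step: rows that share a uid are merged into one document, and values from bracketed columns are collected into an array. The function name `rows_to_documents`, the `key` parameter, and the exact output shape (the array stored under contact.phone) are assumptions for illustration only; they are not taken from Logstash or from any existing JDBC tool.

```python
import re

# Hypothetical convention: a column named "parent[child]" marks an array-valued field.
BRACKET = re.compile(r"^(\w+)\[(\w+)\]$")

def rows_to_documents(columns, rows, key="uid"):
    """Merge rows sharing the same key value into one document each."""
    docs = {}  # key value -> document under construction
    for row in rows:
        record = dict(zip(columns, row))
        doc = docs.setdefault(record[key], {key: record[key]})
        for name, value in record.items():
            if name == key:
                continue
            match = BRACKET.match(name)
            if match:
                # Bracketed column: append the value to the array at parent.child.
                parent, child = match.groups()
                doc.setdefault(parent, {}).setdefault(child, []).append(value)
            else:
                # Plain column: the last row seen for this key wins.
                doc[name] = value
    return list(docs.values())

documents = rows_to_documents(
    ["uid", "contact[phone]"],
    [(1, "8001111111"), (1, "8002222222")],
)
```

With the two rows above, `documents` holds a single document for uid 1 whose contact.phone field is a list of both numbers.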

This issue was also brought up in Elastic Discuss: https://discuss.elastic.co/t/logstash-jdbc-input-for-multi-fields-and-nested-objects/27020

Thanks.

suyograo commented 8 years ago

@ghayes this is a tricky feature to implement in LS. Mapping the relation in DB to ES is not straightforward and cannot be done in a generic way in LS. In addition, logstash-output-elasticsearch already supports parent-child relationships. Can you use that?

Also, to reliably map relationships from your tables, you'd have to run multiple queries, store state, and join them in LS. You would then map the result to a doc structure so ES can index it.

loganbhardy commented 8 years ago

@suyograo I think you may have misinterpreted this request. The JDBC River Plugin allowed you to (SELECT column AS 'object1.object2') and it would index the column as a nested object using the dot as the delimiter. There is no tricky relationship mapping involved at all.

When doing the same select with the logstash jdbc input, a single flat field named "object1.object2" is created. This is the only issue preventing me from moving away from the river plugin and onto the latest version of elasticsearch.
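
As a rough illustration of the dot-delimiter behavior described above, the following Python sketch turns a flat row with dotted column names into a nested object. The function name `nest_dotted_keys` is hypothetical, and the sketch assumes no column is both a leaf and a parent (for example "a" alongside "a.b"); it is not how the river plugin or Logstash is actually implemented.

```python
def nest_dotted_keys(row, delimiter="."):
    """Expand keys like "object1.object2" into nested dictionaries."""
    nested = {}
    for column, value in row.items():
        parts = column.split(delimiter)
        target = nested
        # Walk (or create) one dictionary level per path segment except the last.
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        # The final segment holds the column value.
        target[parts[-1]] = value
    return nested

flat_row = {"id": 1, "object1.object2": "value"}
nested_row = nest_dotted_keys(flat_row)
```

Here `nested_row` keeps id at the top level and places "value" under object1, then object2, instead of in a single flat field named "object1.object2".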

Jrizzi1 commented 8 years ago

This is the same issue as the one I opened, https://github.com/logstash-plugins/logstash-input-jdbc/issues/80 and again, I think too much is being read into tricky relationships.

All that's being requested here is the ability for logstash-jdbc-input to structure database query results as json for bulk entry.

girishla commented 8 years ago

Please note that although Rivers are no longer available in elasticsearch, we can still use elasticsearch-jdbc, which used to be the jdbc river but is now a standalone tool. It supports the bracket and dot notation for nesting and arrays. I've used it without any issues in elasticsearch v2.3.