logstash-plugins / logstash-filter-csv

Apache License 2.0

autodetect_column_names takes header from second row #67

Closed lyeesim988 closed 6 years ago

lyeesim988 commented 6 years ago

autodetect_column_names in the csv filter plugin does not reliably take the first row of the CSV file as the header; instead it sometimes takes the second row of data as the header. On the first run it picked up the first row as the header, but after re-running a few more times, the values from the second row were taken as the header.

Similar issues:
https://discuss.elastic.co/t/autodetect-columns-from-header-row-csv-ingest/92861
https://discuss.elastic.co/t/autodetect-column-is-taking-data-row-as-column-name-instead-of-header/126383

siben168 commented 6 years ago

I have the same issue. I was debugging my conf, and in many cases it loads the second row as the header. Weirdly, whenever I add the "convert" option, it always messes up the headers.

siben168 commented 6 years ago

I've learned from another thread that this is caused by a race condition between multiple pipeline workers. The current workaround is to set "pipeline.workers: 1" in logstash.yml; it works for me.

It seems that with multiple workers, the order of lines cannot be guaranteed, so a data row can reach the filter before the header row. This is definitely a bug.
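
To illustrate why event order matters here, this is a minimal Python sketch of the behaviour, not the plugin's actual Ruby code. It assumes the filter simply treats the first event it sees as the header; the function name `autodetect_columns` and the sample data are made up for illustration.

```python
def autodetect_columns(events):
    """Simplified model (an assumption): the first event seen becomes the header."""
    header = None
    rows = []
    for line in events:
        fields = line.split(",")
        if header is None:
            # Whichever line arrives first is taken as the column names.
            header = fields
            continue
        rows.append(dict(zip(header, fields)))
    return header, rows


# One worker: events reach the filter in file order.
file_order = ["name,age", "alice,30", "bob,25"]
header_single, rows_single = autodetect_columns(file_order)

# Several workers: a hypothetical arrival order where the second line wins the race.
raced_order = ["alice,30", "name,age", "bob,25"]
header_raced, rows_raced = autodetect_columns(raced_order)

print(header_single)  # ['name', 'age']
print(header_raced)   # ['alice', '30']
```

With a single worker the arrival order matches the file order, which would explain why "pipeline.workers: 1" makes the problem go away.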

lyeesim988 commented 6 years ago

Hi siben168,

Thank you very much. Setting the worker value to 1 solved my issue.

pemontto commented 6 years ago

Duplicate of #65

jplew commented 6 years ago

In case you don't have a logstash.yml, create a new one at settings/logstash.yml and paste this inside: pipeline.workers: 1

Reference: https://www.elastic.co/guide/en/logstash/6.3/logstash-settings-file.html

guyboertje commented 6 years ago

Duplicate, closing.