Open kakoni opened 8 years ago
Is this planned?
No plan for now. Currently, Fluentd's parsers don't have a mechanism for re-initializing their configuration. So to support this kind of metadata handling, we would need to re-design the parser APIs.
Ok. Thanks
@kakoni Is it hard to set the keys parameter in the configuration?
@repeatedly No, but then I need to filter out the header rows. I wrote my own CSV parser where I do something like:
def parse(text)
  row = CSV.parse_line(text, col_sep: @delimiter)
  if @keys.empty?
    @keys = row
  elsif (@keys - row).empty?
    return
  else
    yield values_map(row)
  end
end
Obviously this assumes that you use read_from_head and that the input files are "immutable", that is, they are written only once with no appends.
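The header-skipping idea can be tried outside Fluentd with plain Ruby. This is only a minimal sketch: the class name HeaderAwareCsv and the sample lines are made up for illustration, and it compares rows to the header with exact equality instead of the array difference used in the snippet above, which is a choice of this sketch rather than the original parser's behavior.

```ruby
require 'csv'

# Hypothetical stand-alone class (not a Fluentd plugin): the first line
# seen becomes the key list, and later lines identical to it are skipped.
class HeaderAwareCsv
  def initialize(delimiter: ',')
    @delimiter = delimiter
    @keys = []
  end

  # Yields a Hash for each data row; yields nothing for header rows.
  def parse(text)
    row = CSV.parse_line(text, col_sep: @delimiter)
    return if row.nil?            # blank line: nothing to parse
    if @keys.empty?
      @keys = row                 # first line: remember it as the header
    elsif row == @keys
      nil                         # repeated header row: skip it
    else
      yield @keys.zip(row).to_h   # data row: map header names to values
    end
  end
end

parser = HeaderAwareCsv.new
records = []
["name,age", "alice,30", "name,age", "bob,25"].each do |line|
  parser.parse(line) { |record| records << record }
end
```

As in the original, this only behaves sensibly when the header really is the first line the parser sees, which is why read_from_head matters.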
I see. We will consider it, but we need more time to re-design the Parser API because in_tail shares a parser instance between target files. So using your own parser is better for now.
In the v0.14 parser API design, the <parse> section can take arguments for many purposes. For example, an argument can be used as a filename pattern.
@type tail
path /my/dir/*.csv
<parse> # default pattern
  @type csv
</parse>
<parse myfile.with.header.*.csv>
  @type csv
  csv_with_header true
</parse>
Sharing parsers across all files comes from the design of the in_tail plugin, not of the parsers.
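To make the shared-instance point concrete, here is a small plain-Ruby sketch of what per-file header state could look like when a single parser object receives lines from several files. Everything in it is hypothetical: the class PerFileHeaderCsv, the extra path argument (the parse method shown in this thread receives only the text), and the file names are assumptions for illustration, not Fluentd APIs.

```ruby
require 'csv'

# Hypothetical illustration, not Fluentd code: one parser object is reused
# for lines coming from several files, so header state is keyed by path.
class PerFileHeaderCsv
  def initialize
    @keys_by_path = {}
  end

  # `path` is an assumed extra argument identifying the source file.
  def parse(path, text)
    row = CSV.parse_line(text)
    return if row.nil?                         # blank line
    if @keys_by_path.key?(path)
      yield @keys_by_path[path].zip(row).to_h  # data row for a known file
    else
      @keys_by_path[path] = row                # first line of this file
    end
  end
end

shared = PerFileHeaderCsv.new
out = []
[["a.csv", "id,name"], ["b.csv", "host,status"],
 ["a.csv", "1,alice"], ["b.csv", "web1,up"]].each do |path, line|
  shared.parse(path, line) { |record| out << record }
end
```

With a single @keys array instead of the per-path Hash, the header of b.csv would be treated as a data row of a.csv, which is the kind of problem a shared parser instance runs into.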
@kakoni could you share the whole file (parser) and how to implement it please? @tagomoris is csv_with_header already implemented? can't find anything... :(
I showed just API capability, but it's not implemented yet.
@Ninir Here's an example: https://gist.github.com/kakoni/b0ef238e630e65e860c83bfe55ffb53a
But obviously this would only work if you always read_from_head (which is exactly the case in my situation)
@tagomoris got it! @kakoni Thank you good sir, perfect :)
Hi, I'm using Fluentd version 0.14.23 and I want to parse CSV with a header. I see this issue has been open for a long time. The csv2 plugin seems not to work after the Fluentd upgrade, so I modified the csv2 plugin source to:
require 'fluent/plugin/parser'
require 'csv'

module Fluent
  module Plugin
    class CSV2Parser < Parser
      Plugin.register_parser('csv2', self)

      config_param :keys, :array, value_type: :string
      config_param :delimiter, :string, default: ','

      def parse(text, &block)
        values = CSV.parse_line(text, col_sep: @delimiter)
        if @keys.empty?
          @keys = values
        elsif (@keys - values).empty?
          return
        else
          r = Hash[@keys.zip(values)]
          time, record = convert_values(parse_time(r), r)
          yield time, record
        end
      end
    end
  end
end
It works! An example configuration is below:
<source>
  @type tail
  path /my/path/to/csv
  tag hello
  format csv2
  keys
</source>
<match **>
  @type stdout
</match>
Note: the keys parameter must stay in the configuration, but its value should be left empty.
Is this supported now?
I guess the in_tail/csv parser currently doesn't support CSV files with headers? Is this planned?