Open jszwedko opened 3 years ago
A coworker came up with a possible idea to handle CSVs (credit goes to him, not me, for this idea and these words). We haven't tried this yet, but it seems like it could work for small files:
parse_csv
the first line in `.metadata` as a `.metadata.header`
The unnest will create new messages downstream, one per line, each with a `.metadata.header` and a `.message`, which can be further parsed/split and matched with the header values (how?).
The main limitation here is file size: to do this, the whole CSV has to be held in memory, so that's probably a showstopper for many use cases. If anyone has ideas for improvements that would get around that, then it may be a feasible idea.
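For illustration only, here is an untested VRL sketch of how the split-and-unnest idea might look. It assumes the whole file has already arrived as a single event with the full CSV text in `.message`; how the source delivers that single event is not covered here.

```vrl
# Assumption: .message holds the entire CSV file as one string.
lines = split(string!(.message), "\n")

# Keep the header line so it travels with every row emitted below.
.metadata.header = lines[0]

# Everything after the header becomes the array of rows to fan out.
.message = slice!(lines, 1)

# unnest yields one event per array element; each copy keeps .metadata.header.
. = unnest!(.message)
```

A file that ends with a trailing newline would produce an empty final row, so a downstream filter would probably be needed. The memory limitation described above still applies.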
Could a `line_number` prop be introduced in `read_from`? That way you could start at the second line and hardcode your headers as a local variable, then reference them against the array returned from `parse_csv`.
Edit: a "beginning-skip-first-line" option would also work.
A user in Discord was trying to figure out how to use `remap` to handle CSV files: https://discord.com/channels/742820443487993987/746070591097798688/821038938424344626
They are able to work with it by specifying the headers statically in their remap script like:
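The snippet from that conversation is not preserved here. As a rough sketch of what statically specified headers could look like in a `remap` script, assuming three hypothetical columns (`timestamp`, `level`, `text`) and one CSV data line per event in `.message`:

```vrl
# Hypothetical column names; the real headers are not shown in this thread.
# Assumption: .message holds a single CSV data line.
row = parse_csv!(.message)

# Map each position in the parsed array to a hardcoded field name.
.timestamp = row[0]
.level = row[1]
.text = row[2]

# Drop the raw line once its columns have been mapped to named fields.
del(.message)
```

With this approach the header line of the file is parsed like any other row, so it would need to be skipped or filtered out separately.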
But it seems like it'd be nice for the `file` source to support reading CSV files natively, where it would generate events using the CSV header to name the fields of each line.