AgileWorksOrg / elasticsearch-river-csv

CSV river for ElasticSearch
Apache License 2.0

script_before_file #18

Closed btray77 closed 10 years ago

btray77 commented 10 years ago

Is there something special I need to do to get this to print out to the screen what the file is doing?

I'm getting an error and can't tell whether it's because a variable isn't being set or something else.

vtajzich commented 10 years ago

I don't know what you mean. Take a look at https://github.com/xxBedy/elasticsearch-river-csv/blob/master/src/test/resources/shell_scripts/before_file.sh

The file name is passed as an argument to the script.

btray77 commented 10 years ago
[2014-03-29 05:29:39,550][INFO ][http                     ] [cowboy] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/107.170.114.148:9200]}
[2014-03-29 05:29:40,510][INFO ][gateway                  ] [cowboy] recovered [2] indices into cluster_state
[2014-03-29 05:29:40,511][INFO ][node                     ] [cowboy] started
[2014-03-29 05:29:42,169][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] starting csv stream
[2014-03-29 05:29:42,282][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] Using configuration: org.agileworks.elasticsearch.river.csv.Configuration(/ftp/cj, .*\.txt$, true, [], 1h, cj, csv_type, 1000, \, ", ,, 100, 2, id, /ftp/clean.sh, /ftp/clean.sh, /ftp/cj/before.sh, null)
[2014-03-29 05:29:42,282][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] Going to process files /ftp/cj/Jigsaw_Health-JigsawHealth_Product_Catalog.txt
[2014-03-29 05:29:42,321][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] 
[2014-03-29 05:29:42,322][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] Processing file Jigsaw_Health-JigsawHealth_Product_Catalog.txt
[2014-03-29 05:29:42,518][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] File has been processed Jigsaw_Health-JigsawHealth_Product_Catalog.txt.processing
[2014-03-29 05:29:42,547][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] File Jigsaw_Health-JigsawHealth_Product_Catalog.txt.processing, processed lines 29
[2014-03-29 05:29:42,547][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] Going to execute new bulk composed of 28 actions
[2014-03-29 05:29:42,552][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] next run waiting for 1h
[2014-03-29 05:29:42,659][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [cowboy] [csv][cj] Executed bulk composed of 28 actions

before.sh

#!/bin/bash
echo before file $1

Nowhere in the log does it show the output from before.sh (before file filename.txt). My before script works great when I run it manually, but when the river runs it, it errors out as if it's not getting the file name passed to it.

Does it rename the file before it runs the before script?

vtajzich commented 10 years ago

The correct script should look like this:

#!/bin/bash
echo "before file $1"

The file is renamed during processing, so its name changes from the original. The before-file script will therefore receive the *.processing name, and so on. I think getting the actual file name is the desired behaviour, so you don't have to guess what the current file name is.

You must specify the script paths in the request. See the README file.
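
As a rough illustration of the renaming described above, here is a minimal sketch of a before-file script that copes with the suffix. It assumes the river passes the file name as the first argument and that a renamed file ends in ".processing"; the helper name original_name is made up for this example and is not part of the river.

```shell
#!/bin/bash
# Sketch only. Assumption: the river passes the file name as $1, and the
# name may already carry a ".processing" suffix.

# Hypothetical helper: strip a trailing ".processing" (if present) to
# recover the original file name.
original_name() {
  printf '%s\n' "${1%.processing}"
}

file="${1:-}"
echo "before file: received '${file}', original '$(original_name "$file")'"
```

Quoting the arguments keeps file names with spaces intact.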

btray77 commented 10 years ago

I may be looking for my information in the wrong place. Is the output from the script going to be reflected in the log? The words "before file" are nowhere to be found in the logs.

vtajzich commented 10 years ago

Yes, it is logged using EsLogger, so it should be printed in logs. Send me your scripts and request.
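
If the output still does not show up in the Elasticsearch log, one way to check whether the script is being invoked at all is to have it record each call in a file of its own. This is only a debugging sketch: the LOG_FILE path and the log_invocation helper are hypothetical, and the path must be writable by the user that runs Elasticsearch.

```shell
#!/bin/bash
# Debugging sketch: record every invocation in a separate file, so you can
# tell whether the script ran independently of the Elasticsearch log.
# LOG_FILE is a hypothetical location; override it as needed.
LOG_FILE="${LOG_FILE:-/tmp/before_file.log}"

# Hypothetical helper: append one line per call with a timestamp, the
# argument count, and the arguments themselves.
log_invocation() {
  printf '%s args=%d: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$#" "$*" >> "$LOG_FILE"
}

log_invocation "$@"
echo "before file $1"
```

An entry with args=0 would mean the script ran but received no file name.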

btray77 commented 10 years ago

Typo in bash file. Thank you for the help.