Export CSV To Influx: Process CSV data and write it to InfluxDB
Important Note: InfluxDB 2.x has a built-in CSV write feature, which is more powerful: https://docs.influxdata.com/influxdb/v2.1/write-data/developer-tools/csv/
Use pip to install the library. The export_csv_to_influx binary is then ready to use.
pip install ExportCsvToInflux
Run export_csv_to_influx -h to see the help guide.
Note:
- You can pass * to --field_columns to match all the fields: --field_columns=* or --field_columns '*'
- CSV data is not inserted into InfluxDB again if the file has not been updated. Use --force_insert_even_csv_no_update to force the insert (default True): --force_insert_even_csv_no_update=True or --force_insert_even_csv_no_update True
- If some CSV cells have no value, they are filled automatically based on the column data type: int: -999, float: -999.0, string: -
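The fill rule in the last note can be pictured with a minimal sketch. The helper name `fill_missing`, the `column_types` mapping, and the `status` column are hypothetical and not part of the library's API; only the fill values (-999, -999.0, -) come from the note above.

```python
# Hypothetical illustration of the documented fill values; not the library's code.
FILL_VALUES = {"int": -999, "float": -999.0, "string": "-"}


def fill_missing(row, column_types):
    """Return a copy of row with empty cells replaced by the per-type fill value."""
    filled = {}
    for column, value in row.items():
        if value is None or str(value).strip() == "":
            # Assumption: columns without a known type fall back to the string filler.
            filled[column] = FILL_VALUES[column_types.get(column, "string")]
        else:
            filled[column] = value
    return filled


row = {"url": "", "response_time": "", "status": None}
types = {"url": "string", "response_time": "float", "status": "int"}
print(fill_missing(row, types))
```

Cells that already hold a value pass through unchanged, so the sketch only affects the gaps.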
# | Option | Mandatory | Default | Description |
---|---|---|---|---|
1 | -c, --csv | Yes | | CSV file path, or the folder path |
2 | -db, --dbname | For 0.x, 1.x only: Yes | | InfluxDB Database name |
3 | -u, --user | For 0.x, 1.x only: No | admin | InfluxDB User name |
4 | -p, --password | For 0.x, 1.x only: No | admin | InfluxDB Password |
5 | -org, --org | For 2.x only: No | my-org | For 2.x only, my org |
6 | -bucket, --bucket | For 2.x only: No | my-bucket | For 2.x only, my bucket |
7 | -http_schema, --http_schema | For 2.x only: No | http | For 2.x only, InfluxDB HTTP schema, either http or https |
8 | -token, --token | For 2.x only: Yes | | For 2.x only, n |
9 | -m, --measurement | Yes | | Measurement name |
10 | -fc, --field_columns | Yes | | List of CSV columns to use as fields, separated by comma |
11 | -tc, --tag_columns | No | None | List of CSV columns to use as tags, separated by comma |
12 | -d, --delimiter | No | , | CSV delimiter |
13 | -lt, --lineterminator | No | \n | CSV lineterminator |
14 | -s, --server | No | localhost:8086 | InfluxDB Server address |
15 | -t, --time_column | No | timestamp | Timestamp column name. If there is no timestamp column, the file's last modified time is used as the timestamp for all CSV rows. Pure timestamps such as 1517587275 are also supported and detected automatically |
16 | -tf, --time_format | No | %Y-%m-%d %H:%M:%S | Timestamp format, see more: https://strftime.org/ |
17 | -tz, --time_zone | No | UTC | Timezone of supplied data |
18 | -b, --batch_size | No | 500 | Batch size when inserting data to influx |
19 | -lslc, --limit_string_length_columns | No | None | Columns whose string length is limited, separated by comma |
20 | -ls, --limit_length | No | 20 | Limit length |
21 | -dd, --drop_database | Compatible with 2.x: No | False | Drop database or bucket before inserting data |
22 | -dm, --drop_measurement | No | False | Drop measurement before inserting data |
23 | -mc, --match_columns | No | None | Columns to match the data you want to get, separated by comma. Match rule: a row matches only if all columns match |
24 | -mbs, --match_by_string | No | None | Match by string, separated by comma |
25 | -mbr, --match_by_regex | No | None | Match by regex, separated by comma |
26 | -fic, --filter_columns | No | None | Columns to filter the data you want to filter out, separated by comma. Filter rule: a row is filtered if any one column matches |
27 | -fibs, --filter_by_string | No | None | Filter by string, separated by comma |
28 | -fibr, --filter_by_regex | No | None | Filter by regex, separated by comma |
29 | -ecm, --enable_count_measurement | No | False | Enable count measurement |
30 | -fi, --force_insert_even_csv_no_update | No | True | Force insert data to influx, even if the CSV has no update |
31 | -fsc, --force_string_columns | No | None | Force columns as string type, separated by comma |
32 | -fintc, --force_int_columns | No | None | Force columns as int type, separated by comma |
33 | -ffc, --force_float_columns | No | None | Force columns as float type, separated by comma |
34 | -uniq, --unique | No | False | Write duplicated points |
35 | --csv_charset, --csv_charset | No | None | The CSV charset. Default: None, which will auto detect |
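The match and filter rules from the option table can be sketched as follows. This is only an illustration of the stated rules, not the library's implementation: the function name `row_is_kept` and the column-to-pattern dictionaries are hypothetical, and it assumes that a "filtered" row is one that gets excluded.

```python
import re


def row_is_kept(row, match_rules=None, filter_rules=None):
    """Apply match rules (all must hit) and filter rules (any hit drops the row)."""
    match_rules = match_rules or {}
    filter_rules = filter_rules or {}
    # Match rule: the row qualifies only if every listed column matches its pattern.
    matched = all(re.search(p, row.get(c, "")) for c, p in match_rules.items())
    # Filter rule: the row is dropped as soon as one listed column matches.
    filtered = any(re.search(p, row.get(c, "")) for c, p in filter_rules.items())
    return matched and not filtered


rows = [
    {"timestamp": "2022-03-08 02:04:05", "url": "https://jmeter.apache.org/"},
    {"timestamp": "2022-03-08 02:04:10", "url": "https://sample-demo.org/"},
    {"timestamp": "2022-03-07 04:06:00", "url": "https://sample-7.org/"},
]
match_rules = {"timestamp": r"2022-03-08", "url": r"sample"}
kept = [r["url"] for r in rows if row_is_kept(r, match_rules=match_rules)]
print(kept)
```

With no rules at all, every row is kept, which mirrors the None defaults in the table.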
The exporter can also be run programmatically.
from ExportCsvToInflux import ExporterObject
exporter = ExporterObject()
exporter.export_csv_to_influx(...)
# Print the export_csv_to_influx parameter details:
print(exporter.export_csv_to_influx.__doc__)
timestamp,url,response_time
2022-03-08 02:04:05,https://jmeter.apache.org/,1.434
2022-03-08 02:04:06,https://jmeter.apache.org/,2.434
2022-03-08 02:04:07,https://jmeter.apache.org/,1.200
2022-03-08 02:04:08,https://jmeter.apache.org/,1.675
2022-03-08 02:04:09,https://jmeter.apache.org/,2.265
2022-03-08 02:04:10,https://sample-demo.org/,1.430
2022-03-08 03:54:13,https://sample-show.org/,1.300
2022-03-07 04:06:00,https://sample-7.org/,1.289
2022-03-07 05:45:34,https://sample-8.org/,2.876
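The timestamp column in the sample data above uses the default --time_format (%Y-%m-%d %H:%M:%S). As a rough sketch of what converting such a cell involves, the hypothetical helper below parses either a formatted string or a pure timestamp into epoch seconds. It assumes the data is in UTC (the default --time_zone) and is not taken from the library.

```python
from datetime import datetime, timezone


def to_epoch_seconds(value, time_format="%Y-%m-%d %H:%M:%S"):
    """Convert a CSV timestamp cell to epoch seconds, assuming the data is in UTC."""
    text = str(value).strip()
    if text.isdigit():
        # Pure timestamp such as 1517587275: use it as-is.
        return int(text)
    parsed = datetime.strptime(text, time_format)
    # Attach UTC explicitly so the result does not depend on the local timezone.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(to_epoch_seconds("2022-03-08 02:04:05"))
print(to_epoch_seconds("1517587275"))
```

For data recorded in another timezone, the `tzinfo` attached before calling `timestamp()` would need to change accordingly.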
Command samples
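As one hedged illustration, a command line for InfluxDB 0.x/1.x can be assembled from Python using only flags listed in the option table. The file name demo.csv and the database and measurement name demo are placeholders, and `build_command` is a hypothetical helper, not part of the library.

```python
import shlex


def build_command(csv_path, dbname, measurement, field_columns, tag_columns=None,
                  server="localhost:8086"):
    """Assemble an export_csv_to_influx command for InfluxDB 0.x/1.x."""
    command = [
        "export_csv_to_influx",
        "--csv", csv_path,
        "--dbname", dbname,
        "--measurement", measurement,
        "--field_columns", ",".join(field_columns),
        "--server", server,
    ]
    if tag_columns:
        command += ["--tag_columns", ",".join(tag_columns)]
    return command


command = build_command("demo.csv", "demo", "demo", ["response_time"], ["url"])
print(shlex.join(command))
# To actually run it (requires the package and a reachable InfluxDB):
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting issues, for example when a field list contains `*`.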
If the count measurement is enabled (--enable_count_measurement), the count measurement looks like this:
// Influx 0.x, 1.x
select * from "demo.count"
name: demo.count
time match_timestamp match_url total
---- --------------- --------- -----
1562957134000000000 3 2 9
// Influx 2.x: For more info about Flux, see https://docs.influxdata.com/influxdb/v2.1/query-data/flux/
influx query 'from(bucket:"my-bucket") |> range(start:-100h) |> filter(fn: (r) => r._measurement == "demo.count")' --raw
#group,false,false,true,true,false,false,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,long,string,string
#default,_result,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement
,,2,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_timestamp,demo.count
,,3,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_url,demo.count
,,4,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,9,total,demo.count
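The raw 2.x output above is annotated CSV: lines starting with # are annotations, followed by a header row and the data rows. A minimal sketch for reading the counts back in Python, using only the standard library, might look like this (the function name `read_counts` is hypothetical):

```python
import csv

RAW = """\
#group,false,false,true,true,false,false,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,long,string,string
#default,_result,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement
,,2,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_timestamp,demo.count
,,3,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_url,demo.count
,,4,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,9,total,demo.count
"""


def read_counts(raw):
    """Map each _field to its integer _value, skipping '#' annotation rows."""
    lines = [line for line in raw.splitlines() if line and not line.startswith("#")]
    # The first remaining line is the header row, which DictReader uses for keys.
    reader = csv.DictReader(lines)
    return {record["_field"]: int(record["_value"]) for record in reader}


print(read_counts(RAW))
```

This simple approach assumes a single result table; output with several tables repeats the annotation and header rows and would need to be split first.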
The library is inspired by: https://github.com/fabio-miranda/csv-to-influxdb