turbot / steampipe

Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
https://steampipe.io
GNU Affero General Public License v3.0

CSV plugin is not reliable in loading tables in `steampipe query` with v0.20.x #3483

Closed (e-gineer closed 1 year ago)

e-gineer commented 1 year ago

I have a directory with a few CSV files, one of which is not valid (in fact it's JSON):

/tmp/crap $ ls
country.csv      json-not-csv.csv state.csv
/tmp/crap $ cat country.csv 
name,country
nathan,US
bob,US
/tmp/crap $ cat state.csv 
name,state
nathan,NJ
bob,PA
/tmp/crap $ cat json-not-csv.csv 
{
  "I'm": "json",
  "not": "valid CSV"
}
/tmp/crap $ 

I used JSON with a bad file extension just as an example. I'm not sure it matters that it is JSON. I'm not sure it even matters to have a bad CSV file for this bug report ... it's just what I was testing.

Now ... I'm finding that when I switch between directories, Steampipe is unreliable at loading the CSV plugin and its tables. I see different results on different tries.

Here is a dump of the flow & attempts. Notably once it settles (which is quick) it doesn't change for that session. Sometimes it takes a moment in a session to refresh, which is understandable.

/tmp/crap $ cd ..
/tmp $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:51:53-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:51:53-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> 
/tmp $ cd crap
/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> .inspect csv
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
| connection | plugin                                     | schema mode | state    | error | state updated             |
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | updating |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:03-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> 

/tmp/crap $ 
/tmp/crap $ 
/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
> 

/tmp/crap $ 
/tmp/crap $ cd ..
/tmp $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:23-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:52:23-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> 

/tmp $ 
/tmp $ cd crap
/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> .inspect csv
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
| connection | plugin                                     | schema mode | state    | error | state updated             |
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | updating |       | 2023-05-26T14:52:32-04:00 |
+------------+--------------------------------------------+-------------+----------+-------+---------------------------+
> .inspect csv
+----------------+-------------+
| table          | description |
+----------------+-------------+
| country        |             |
| json-not-csv-2 |             |
| state          |             |
+----------------+-------------+
> .inspect csv
+----------------+-------------+
| table          | description |
+----------------+-------------+
| country        |             |
| json-not-csv-2 |             |
| state          |             |
+----------------+-------------+
> 
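For anyone comparing runs, the two shapes that `.inspect csv` returns above (a table list vs. a connection-state row) can be tallied from a pasted session. This is only a rough sketch: the helper names are made up for illustration, the sample is an abbreviated version of the output above with columns trimmed, and it inspects pasted text only, not Steampipe itself.

```python
from collections import Counter

# Abbreviated sample of a pasted session (columns trimmed for brevity).
SAMPLE = """\
> .inspect csv
+---------+-----------------------------------+
| table   | description                       |
+---------+-----------------------------------+
| country | CSV file at /tmp/crap/country.csv |
+---------+-----------------------------------+
> .inspect csv
+------------+-------------+-------+
| connection | schema mode | state |
+------------+-------------+-------+
| csv        | dynamic     | ready |
+------------+-------------+-------+
>
"""


def split_session(transcript):
    """Return the output block that follows each '> .inspect' prompt."""
    blocks, current = [], None
    for line in transcript.splitlines():
        if line.startswith(">"):
            # Any prompt line ends the block that was being collected.
            if current is not None:
                blocks.append("\n".join(current))
            current = [] if line.startswith("> .inspect") else None
        elif current is not None:
            current.append(line)
    if current is not None:
        blocks.append("\n".join(current))
    return blocks


def classify_inspect_output(block):
    """Classify a block by its header row: 'tables', 'connection' or 'unknown'."""
    for line in block.splitlines():
        if line.startswith("|"):
            cells = [c.strip() for c in line.strip().strip("|").split("|")]
            if cells[:2] == ["table", "description"]:
                return "tables"
            if cells and cells[0] == "connection":
                return "connection"
            return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(Counter(classify_inspect_output(b) for b in split_session(SAMPLE)))
```

Running this over several pasted sessions gives a quick count of how often each shape came back.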

Even more interesting is the connection state. Notice how in the last two sessions below the schema_hash doesn't change but the `.inspect csv` output does?

/tmp $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> select * from steampipe_connection_state where name = 'csv'
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| name | state | type | import_schema | error  | plugin                                     | schema_mode | schema_hash                      | comments_set | connection_mod_time       | plugin_mod_time           |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| csv  | ready |      | enabled       | <null> | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | d41d8cd98f00b204e9800998ecf8427e | true         | 2023-05-26T14:57:06-04:00 | 2023-05-15T09:04:32-04:00 |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:57:06-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> 
/tmp $ cd crap
/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> select * from steampipe_connection_state where name = 'csv'
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| name | state | type | import_schema | error  | plugin                                     | schema_mode | schema_hash                      | comments_set | connection_mod_time       | plugin_mod_time           |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| csv  | ready |      | enabled       | <null> | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | 6c368b2c8e93381d7c35de58620b72ea | true         | 2023-05-26T14:57:20-04:00 | 2023-05-15T09:04:32-04:00 |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
> .inspect csv
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| connection | plugin                                     | schema mode | state | error | state updated             |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
| csv        | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | ready |       | 2023-05-26T14:57:20-04:00 |
+------------+--------------------------------------------+-------------+-------+-------+---------------------------+
> 
/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> select * from steampipe_connection_state where name = 'csv'
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| name | state | type | import_schema | error  | plugin                                     | schema_mode | schema_hash                      | comments_set | connection_mod_time       | plugin_mod_time           |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| csv  | ready |      | enabled       | <null> | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | 6c368b2c8e93381d7c35de58620b72ea | true         | 2023-05-26T14:57:43-04:00 | 2023-05-15T09:04:32-04:00 |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
> 
/tmp/crap $ 
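One observation on the hashes above: the `schema_hash` reported from `/tmp` (`d41d8cd98f00b204e9800998ecf8427e`) is the well-known MD5 digest of empty input. Whether Steampipe actually computes `schema_hash` with MD5 is an assumption here, but the match suggests the hash in that session may have been taken over an empty schema. A minimal check (the function name is just for illustration):

```python
import hashlib

# MD5 digest of zero bytes of input.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def is_empty_input_md5(schema_hash):
    """True if the given hex string equals the MD5 of empty input."""
    return schema_hash.strip().lower() == EMPTY_MD5


if __name__ == "__main__":
    # schema_hash values copied from the query results above.
    observed = {
        "/tmp": "d41d8cd98f00b204e9800998ecf8427e",
        "/tmp/crap": "6c368b2c8e93381d7c35de58620b72ea",
    }
    for directory, value in observed.items():
        print(directory, value, is_empty_input_md5(value))
```

This only compares strings; it says nothing about what Steampipe feeds into the hash.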

e-gineer commented 1 year ago

This issue may be related to issue #3482, and is perhaps an easier way to reproduce it.

e-gineer commented 1 year ago

Noting that using an empty.csv file (zero bytes) also makes this problem worse / more reproducible.
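To make the setup easy to recreate, here is a small sketch that writes the files from this thread into a fresh directory: the two valid CSVs, the JSON file with a `.csv` extension, and the zero-byte `empty.csv`. The function name and the temp-directory prefix are arbitrary choices for illustration; the file contents are copied from the report above.

```python
import tempfile
from pathlib import Path

# File contents copied from the issue; empty.csv is the zero-byte file
# mentioned in the follow-up comment.
FIXTURES = {
    "country.csv": "name,country\nnathan,US\nbob,US\n",
    "state.csv": "name,state\nnathan,NJ\nbob,PA\n",
    "json-not-csv.csv": '{\n  "I\'m": "json",\n  "not": "valid CSV"\n}\n',
    "empty.csv": "",
}


def make_fixture_dir(root):
    """Write the fixture files under root (created if missing) and return it."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for name, content in FIXTURES.items():
        (root / name).write_text(content, encoding="utf-8")
    return root


if __name__ == "__main__":
    target = make_fixture_dir(tempfile.mkdtemp(prefix="csv-repro-"))
    print(f"fixture directory: {target}")
    for path in sorted(target.iterdir()):
        print(f"{path.name}: {path.stat().st_size} bytes")
```

After running it, `cd` into the printed directory and start `steampipe query` from there and from its parent, as in the transcripts above.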