turbot / steampipe-plugin-googlesheets

Use SQL to instantly query spreadsheets, sheets, and cell data from Google Sheets. Open source CLI. No DB required.
https://hub.steampipe.io/plugins/turbot/googlesheets
Apache License 2.0
31 stars 0 forks source link

Automatically create each sheet into a table when no sheet is specified in the spc does not auto-populate #13

Closed bob-bot closed 1 year ago

bob-bot commented 2 years ago

Describe the bug Using the GoogleSheets plugin, when I do not specify sheets in the googlesheets.spc file, I expect all sheets to be retrieved into their own table per the comment in the spc file that "If no sheets are specified, then all sheets will be retrieved".

When not specifying any sheets, Steampipe does not automatically create each sheet into a table. However when I specify a table(s), it does add them as tables as expected.

Steampipe version (steampipe -v) v0.11.0

Plugin version (steampipe plugin list) v0.1.0

To reproduce I have tested this on my own sheet and the "Google Sheets Plugin - Sample School Data" sheet by Steampipe.

Add the Sample School data as your spreadsheet ID Keep 'sheets' variable left commented out Restart your service, .inspect the tables, you will see not tables created from the sheets. Expected to see Students, Marks, Books, etc

If you specify those sheets directly, they will show up as tables.

judell commented 2 years ago

Hmm. I suppose this could arguably be considered a feature, not a bug, given that sheets are often used in ways that don't involve well-defined tabular data?

bob-bot commented 2 years ago

Could be my misunderstanding of the comment in the googlesheets.spc file https://hub.steampipe.io/plugins/turbot/googlesheets#:~:text=%23%20If%20no%20sheets,Books%22%2C%20%22Marks%22%2C%20%22Employees%22%5D

I was coming off the CSV plugin which nicely creates each .csv into their own table within the path(s) specified. So when I tried the googlesheets plugin, i took the retrieval commentary in the spc to be a similar feature in the CSV plugin to auto-create tables.

I can refactor the issue into a feature request if more appropriate

judell commented 2 years ago

Ah, I think you're correct, this may be a doc bug vs a software bug, @cody / @misraved? Or do we really intend to map all sheets to tables?

rajlearner17 commented 2 years ago

@judell @bob-bot Thanks for the suggestion, I have tested this out in a massive spreadsheet (with more than 50 sheets with a lot of data), when I specify sheets = ["tf-aws-compliance-Queries", "tf_azure_compliance_Queries"] as applicable to me, it generates the table I am looking for, however, I am thinking if we auto-convert them to all tables, it might be hitting some timeout/API limit issue. I guess this is more how we handle them.

In CSV, as it's a single file, if we provide the path as below, it takes all the CSVs and generates them into the table.

connection "csv" {
  plugin = "csv"
  paths  = [ "/path/to/your/files/*.csv" ]
}

Looking at this code, it seems this is intentionally designed. @cbruno10 can provide more comments.

judell commented 2 years ago

@rajlearner17 @bob-bot I am also thinking that not every sheet has tabular data that can flow seamlessly to a db table. People use sheets in all kinds of ways. So for that reason, in this case as distinct from CSV or other dynamic-schema plugins, it may make most sense to have the user be explicit about which sheets to convert.

cbruno10 commented 2 years ago

@bob-bot Yes, it seems like our comment in the default googlesheets.spc file is incorrect, as we actually only load the sheets specified in sheets.

In general, we usually take an opt-in approach to creating tables, especially when we need to query an API to get a list of resources, objects, etc., that we base the new tables off of. I also have concerns like @rajlearner17 and @judell mentioned about loading too many by default, some of which may not convertible into a CSV format (restrictions are noted in Table Restrictions and Notes).

An alternative would be to support wildcarding, e.g., "*" would match all sheets, while "prefix_*" would match any sheet starting with prefix_. So it doesn't change the default behaviour where if sheets is not specified or equal to [], we'd load all sheets, but it would provide an easy way to load all sheets with something like "*".

@bob-bot @judell Thoughts?

judell commented 2 years ago

An alternative would be to support wildcarding

Sounds good to me. @bob-bot? Tim Nelson?

tnelson-doghouse commented 2 years ago

Yeah, I agree with the wildcarding. If we wanted to be really complete, we'd:

github-actions[bot] commented 1 year ago

'This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.'

tnelson-doghouse commented 1 year ago

@github-action : But has it been fixed? :p

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

tnelson-doghouse commented 1 year ago

It is stale, but can still be eaten :p

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

tnelson-doghouse commented 1 year ago

The staler it gets, the tastier it is! :p