apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
61.69k stars 13.49k forks source link

[SIP-114] Proposal for Google Sheets Export #26390

Closed jessie-ross closed 3 months ago

jessie-ross commented 8 months ago

[SIP-114] Proposal for Google Sheets Export

Motivation

One-click export to Google Sheets is an important usability improvement that's desired at my company, as well as generally by Superset users:

Uploading the data directly to Google Sheets removes a barrier for less technical users, allowing them to work with data in a more familiar way, and is a major workflow improvement for users that have to frequently upload data to Google Sheets to share.

Proposed Change

I propose adding Google Sheets as an export option for SqlLab results and Reports, as well as providing a bit more of a framework for future export extensions, both for the UI and code.

See the working implementation in this PR.

New UI:

Screenshot 2024-01-02 at 4 25 34 pm

https://github.com/apache/superset/assets/971886/c495dcb4-84ee-4d5c-9353-decefd71cbf0

Screenshot 2024-01-12 at 3 35 18 pm

New or Changed Public Interfaces

New dependencies

New pypi packages:

Migration Plan and Compatibility

No migrations necessary, backwards compatible.

Slight UI change:

Rejected Alternatives

Alternate UI:

Front-end API:

betodealmeida commented 8 months ago

The existing gsheets package does not allow writing of data, only reading, making this package necessary.

Note that even though the optional dependency name in Superset is called gsheets, the actual dependency is on the shillelagh package, instead of the old gsheets package, which does allow writing data to Google Sheets.

How hard do you think would it be to reuse the GSheetsEngineSpec.df_to_sql method in the Google Sheets DB engine spec instead of adding the gspread dependency?

jessie-ross commented 8 months ago

Note that even though the optional dependency name in Superset is called gsheets, the actual dependency is on the shillelagh package, instead of the old gsheets package, which does allow writing data to Google Sheets.

Ah interesting, seems I got mixed up in my search, I think I did look into Shillelagh too and it seemed to unable to create a new sheet.

How hard do you think would it be to reuse the GSheetsEngineSpec.df_to_sql method in the Google Sheets DB engine spec instead of adding the gspread dependency?

It looks like we aren't actually using Shillelagh, and are instead using it's internal requests session to make authenticated calls to the Google API directly, I can look into using this method, I would likely need to do some re-factoring with this code, and look into how using the Google Sheets engine authentication secrets for export would work.

jessie-ross commented 8 months ago

Note there is also an ongoing discussion in the PR on the UX for loading the export (new page vs same page): https://github.com/apache/superset/pull/26391#discussion_r1439663162

Please discuss in the thread there, I will reply here after it has reached a conclusion.

jessie-ross commented 8 months ago

[...] look into how using the Google Sheets engine authentication secrets for export would work [...]

The question here is if we want to use the same authentication or separate authentication for export.

For our use-case we'd want separate authentication, as we wouldn't want an extra DB engine to show up (we don't use Google Sheets as an adaptor.)

I'm not really familiar with the UI based configuration system as we exclusively use superset_config.py, but from my perspective it seems simple enough to have a separate config variable for GOOGLE_SHEETS_EXPORT_SERVICE_ACCOUNT_JSON_PATH.

I could detect if a service account is available with a Google Sheets engine to automatically enable Google Sheets export, but this becomes complicated if setup has more than one Google Sheets engine, and auto-enable like this might lead to unintended issues if an admin isn't expecting it.

jessie-ross commented 8 months ago

Amendment to the SIP:

I would like to add a matching Google Sheets export option for Reports:

Screenshot 2024-01-12 at 3 35 18 pm

I've updated the issue to add this in.

jessie-ross commented 7 months ago

Hi @betodealmeida, just wondering if there was any feedback on this

rusackas commented 7 months ago

...just wondering if there was any feedback on this

Hi @jessie-ross - I think with a SIP, the goal is to get feedback from as many people as possible to iron out any wrinkles in the proposal. You'll need to kick off a discussion on the Superset dev mailing list to get the official process started. Let me know if you need any help with that. Thanks!

jessie-ross commented 7 months ago

Hi @rusackas, just in case it was missed, I replied here on slack: https://apache-superset.slack.com/archives/C015WAR4SDV/p1707473597565419?thread_ts=1704263733.082299&cid=C015WAR4SDV

rusackas commented 7 months ago

Ah, thanks... it was the mailing list thread I didn't find :)

betodealmeida commented 7 months ago

For our use-case we'd want separate authentication, as we wouldn't want an extra DB engine to show up (we don't use Google Sheets as an adaptor.)

Ah, makes sense.

I could detect if a service account is available with a Google Sheets engine to automatically enable Google Sheets export, but this becomes complicated if setup has more than one Google Sheets engine, and auto-enable like this might lead to unintended issues if an admin isn't expecting it.

Yeah, I agree.

rusackas commented 3 months ago

This issue is being closed since the vote passed, and the SIP has served its purpose. Anyone (especially @jessie-ross ) is now clear to move forward with any implementation.

jessie-ross commented 3 months ago

Thanks Evan, this is next on my list :)

kasiazjc commented 1 month ago

@jessie-ross thanks for the SIP! It looks great :) I think there is one thing that has changed in the meantime, which is reports/alerts modal (screenshot below). Instead of radio buttons we now use a simple select field, so not sure how it will affect your work

Image

rusackas commented 1 month ago

@kasiazjc I think there are a number of places where we save as or export csv/xls/pdf, whatever. I wonder if there's a need to audit and consolidate this as a pattern more globally. Or am I worrying too much?

kasiazjc commented 1 month ago

@kasiazjc I think there are a number of places where we save as or export csv/xls/pdf, whatever. I wonder if there's a need to audit and consolidate this as a pattern more globally. Or am I worrying too much?

I think the most noticeable/used cases were the ones in alerts and reports and they are already gone as the screenshot that I posted is already implemented :) so I would say - let's not worry about it now.