os-climate / os_c_data_commons

Repository for Data Commons platform architecture overview, as well as developer and user documentation
Apache License 2.0

Please stage Superset 3.1.0 for evaluation (updated) #171

Open MichaelTiemannOSC opened 2 years ago

MichaelTiemannOSC commented 2 years ago

This enhancement alone looks very nice: https://github.com/apache/superset/pull/17917

We are currently running 1.3.2 on CL2, which is now quite old. Given that we have other Superset problems to solve, would be nice to have a current 2022 baseline.

https://pypi.org/project/apache-superset/1.5.0/

HumairAK commented 2 years ago

Much like for Trino, where @erikerlandson created an os-climate image (see here), we can do something similar for Superset if @erikerlandson or someone else is willing to set this up. Then it should just be a matter of updating the image (and any configurations that are outdated as a result).

HeatherAck commented 2 years ago

adding @MightyNerdEric and @rynofinn to thread to see if one of them can support this request

MightyNerdEric commented 2 years ago

@HeatherAck We do not have the knowledge to help out with this at this time. In fact, if someone can help us fully understand this request (what's needed, who implements it, how it's implemented), it would be an excellent training opportunity for us.

MichaelTiemannOSC commented 2 years ago

I have set up Trino, SuperSet, and JupyterHub on my own desktop computer using standard images supplied by the respective projects. They are each packaged in a way that makes this easy for a developer, in part because the respective projects have a good idea of the parameters of a desktop (or Docker) based deployment (including the concept of authorization/authentication of well-known users, such as, gasp, admin/admin).

For OS-Climate, we need to deploy on cluster-based resources (which means OpenShift/Operate First) and we need to tie into OS-Climate authorization/authentication (which includes GitHub IDs and tokens, JWT tokens, etc).

However quickly each service is updated in the wild, there should be a straightforward process whereby a requested version can be staged, tested, and ultimately deployed as per the needs of the team. In this case, the latest version of superset (now 1.5.1, with 2.0.0rc0 released yesterday) is quite a step forward from the version selected 8 months ago.

With new versions come new configuration options (including the enablement of shiny new features using configuration parameters). The task of the update process is therefore not only staging the release onto OpenShift in the simplest possible way, but also tweaking the install so that the release is tuned to the expectations of users who requested the update. In this case, it means ultimately setting the correct feature flags (process described here: https://superset.apache.org/docs/installation/configuring-superset/).

Similarly, Trino has its own special config parameters, and JupyterHub has its own preferred set of libraries.

The particular features of SuperSet I care about are listed below as they appear in the default config (please set all to True):

    "DASHBOARD_NATIVE_FILTERS": True,
    "DASHBOARD_CROSS_FILTERS": False,
    # Feature is under active development and breaking changes are expected
    "DASHBOARD_NATIVE_FILTERS_SET": False,
    "DASHBOARD_FILTERS_EXPERIMENTAL": False,
    "GENERIC_CHART_AXES": False,
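A minimal sketch of how that request might be expressed in a `superset_config.py` override, assuming the deployment loads such a file and that these flag names are still recognized by the Superset version being staged (some may be renamed or removed in later releases). The `REQUESTED_FLAGS` name is only illustrative; `FEATURE_FLAGS` is the setting Superset reads.

```python
# superset_config.py (sketch, not the actual OS-Climate configuration)

# Flags named in the request above; the list name is hypothetical.
REQUESTED_FLAGS = [
    "DASHBOARD_NATIVE_FILTERS",
    "DASHBOARD_CROSS_FILTERS",
    "DASHBOARD_NATIVE_FILTERS_SET",
    "DASHBOARD_FILTERS_EXPERIMENTAL",
    "GENERIC_CHART_AXES",
]

# Superset layers FEATURE_FLAGS from this file over its built-in defaults,
# so only the flags that should differ from the defaults need to appear here.
FEATURE_FLAGS = {flag: True for flag in REQUESTED_FLAGS}
```

Keeping the requested flags in one list makes it easier to review them against each new release's configuration notes before an upgrade.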

MichaelTiemannOSC commented 2 years ago

TIL that SuperSet 2.0.0 will move away from the deprecated sqlalchemy-trino library. This will make it consistent with the rest of the OS-Climate standards and thus suggests that the 1.5.x migration would be more of a training mission, whereas 2.0.0 would bring true harmony to our world.

https://github.com/apache/superset/pull/19957

MichaelTiemannOSC commented 1 year ago

Updated request to stage SuperSet 2.0 for evaluation, which was released July 14th: https://apache-superset.slack.com/archives/CH307T4JG/p1657826078396289

eoriorda commented 1 year ago

Have to learn Superset in order to deploy it. Good to have, as there will be deprecation. Ryan will look at this and see what's involved.

eoriorda commented 1 year ago

@HumairAK volunteered to help Ryan

HeatherAck commented 1 year ago

Image (see link below) - also may be a breaking change (we can leverage smaug - let people know to back up data) https://github.com/opendatahub-io/odh-images/tree/main/superset

eoriorda commented 1 year ago

On Ryan's to-do list; in backlog.

HeatherAck commented 1 year ago

@rynofinn to work on this on 14-Nov; fork and verify it doesn't break anything in ODH

MichaelTiemannOSC commented 1 year ago

n.b. Superset 2.0.1 has reached rc3. It should be possible to sort out the large deployment issues (build, install, test, credential/key management, etc) now, but we should not lock in until 2.0.1 (mainly bugfix vs 2.0) has gone GA.

HeatherAck commented 1 year ago

researching deployment dependencies to determine other packages that need to be updated; reviewing config mgmt; CL1 smoke test, user validation. AICOE, Mutlu are only users.

HeatherAck commented 1 year ago

no new progress.

rynofinn commented 1 year ago
redmikhail commented 1 year ago

It appears that the old Superset version (or rather the version of the client) is not working with the new Trino version: https://operatefirst.slack.com/archives/C03LCPTPZ6J/p1676337680980699. This task may need to be prioritized. cl1 also has an older version of Trino than cl2; this needs to be corrected to test the deployment. Suggestion is to use https://github.com/opendatahub-io/odh-images/tree/main/superset mentioned above as a base (we may need to move it to the os-climate repo).

redmikhail commented 1 year ago

https://github.com/os-climate/os_c_data_commons/issues/266 is a prerequisite for this issue. Currently there is no easy way to reproduce the issue that is happening with Superset on cl2.

MichaelTiemannOSC commented 1 year ago

This new release candidate (2.1.0rc1) should not slow us down, but it is something we should track and either use as a better baseline (if waiting for #266 pushes us out too far) or, if not, use to exercise our upgradeability mechanisms: https://github.com/apache/superset/discussions/23164

HeatherAck commented 1 year ago

@redmikhail will complete wk of 13-Mar

HeatherAck commented 1 year ago

has dependency on #266

HeatherAck commented 1 year ago

@ryanaslett to focus on week of 17-apr

HeatherAck commented 1 year ago

@ryanaslett to install this week 24-apr

HeatherAck commented 1 year ago

@ryanaslett to do this week of 8-may on cluster 1

HeatherAck commented 1 year ago

per @ryanaslett "full discovery of how its currently running is complete" he has a plan for what to do next and will update the github issue with details on 15-May

HeatherAck commented 1 year ago

@redmikhail is owning this issue. Will complete week of 12-June (2.1.0)

HeatherAck commented 1 year ago

@redmikhail will update in this order: open metadata, superset, trino - will complete the week of 19-June

HeatherAck commented 1 year ago

upgrade on CL1 for open metadata will get done tomorrow and will promote to CL2 if no issues [Wed 28-Jun]. Matthew & Ryan to focus on Superset on Thurs/Fri [sync versions on CL1 and CL2 then upgrade to latest]; Eric to focus on Trino this week [sync versions on CL1 and CL2 then upgrade to latest].

HeatherAck commented 12 months ago

Dependencies:
(1) Trino version on CL1 (373) and CL2 (398) needs to be the same version, 398. @MightyNerdEric to contact @redmikhail to review steps.
(2) Trino's latest version, 420: pull from Data Mesh pattern. @MightyNerdEric to own.
(3) Open MetaData: @redmikhail to upgrade this week and own; version 1.0.5.
(4) Superset (Trino must be done first): @ryanaslett to own; version 2.0.1.

MichaelTiemannOSC commented 12 months ago

OM 1.1.0 just released: https://openmetadata.slack.com/archives/C02AZGN0WKY/p1688663302594159

Also, Superset 2.1.0 is actually a real release. Sorry for the confusion about 2.0.1.

MightyNerdEric commented 11 months ago

#266 is complete. The next listed step is to get Trino to v420 (421 is out now; should we go to that instead?). I'm trying to get Keycloak in ops-argocd, so the Trino upgrade will need to wait until that's complete. Since the 4 tasks that Heather laid out need to be done in order, it doesn't seem necessary to have them assigned to different people. If @redmikhail or @ryanaslett have cycles to do the Trino upgrade, that may help us keep moving on this issue.

HeatherAck commented 11 months ago

@ryanaslett - identified steps to get upgraded. will get done this week (24-July)

HeatherAck commented 11 months ago

@ryanaslett - to focus on upgrade for week of 31-Jul

HeatherAck commented 10 months ago

@ryanaslett to deploy independent of ODH - will install as a new version of superset. Will use custom login info from old ODH. Will start work on 14-Aug.

HeatherAck commented 10 months ago

needs to go under app of apps but not under ODH; namespace: call it OSC-Superset under CL1 (see keycloak on CL3 as a pattern). Will install this week.

HeatherAck commented 10 months ago

used new data mesh strategy for installing superset - created new yaml, need to file a PR; no change from last week. @ryanaslett to send yaml to @rynofinn for review

HeatherAck commented 9 months ago

need to convert vault to external secrets and then redeploy. will complete this week 8-Sep

MichaelTiemannOSC commented 9 months ago

n.b. SuperSet 2.1.1 was finally released last week, which should be purely bugfixes. 3.0.0 is still only a release candidate.

MichaelTiemannOSC commented 9 months ago

As mentioned in other channels, 2.1.1 also fixes a severe CVE present in 2.1.0.

HeatherAck commented 9 months ago

@ryanaslett and @rynofinn working on it together. goal to finish in next week.

HeatherAck commented 9 months ago

Currently modifying the helm chart that the datamesh pattern/superset docs use, to integrate some of the other patterns we have established for external secrets, before attempting to push it.

eharrison24 commented 9 months ago

Ryan has some questions regarding which dashboards are being used. Once he figures out which dashboards are in use, he will tag Michael Tiemann.

HeatherAck commented 4 months ago

adding @ModeSevenIndustrialSolutions and @strawberry-baked-alaska to schedule upgrade. cc'ing @MichaelTiemannOSC for awareness

MichaelTiemannOSC commented 4 months ago

Given how long this has been outstanding, and given that the Apache Superset project has released 3.1.0 (with 2.1.0 now being so old), I've updated the title of this Issue. Here's the PyPI package for the latest version: https://pypi.org/project/apache-superset/3.1.0/

ModeSevenIndustrialSolutions commented 4 months ago

Ryan and I will meet and discuss how we might be able to proceed with this...