Teradata / jupyter-demos


Make demo package available on Lake #352

Open DougEbel opened 1 year ago

DougEbel commented 1 year ago

We need to determine the packaging requirements for making the demos available on Lake. In general, this is the same work for any Vantage platform (cloud, Lake, VMware, IFX, ..), whether owned by Teradata or by a customer. This means:

  1. The platform needs release 17.20 or higher for compatibility with ClearScape functionality. (17.0 is sufficient if only the cloud READ_NOS capability is needed.) Releases earlier than 17.20 are not supportable for the demos.

  2. A platform is needed to run Docker with Jupyter. This would be either a server in the cloud (could be tiny) or a package people could use to install Docker + Jupyter on their laptop.

  3. Scripts are needed to set up the linkage to GitHub in Jupyter so the demo notebooks can be refreshed. The script used in provisioning the ClearScape platforms can serve as a reference. The "Local" connection will need to be defined with the connection/credentials to Lake. On ClearScape, the logon is "demo_user". For Teradata associates on a company platform, it should be their quicklook ID.

  4. The user must have a small amount of perm space (guessing 2-5 GB) for intermediate work tables.

  5. The user on the Vantage platform must be able to create databases, meaning they must have access to perm space. Demo databases are basically read-only. Writing is only done to the user's own space. We might want to modify the CSAE "get_data" procedure to make the table databases read-only and publicly accessible. For multi-user access (e.g. vantage-live), we should pre-create all of the demo databases. The "get_data" procedure already exits if the data already exists. The "cleanup" stored procedure could recognize that the demo databases are not the user's and exit. Another option is to prefix the databases created with the user's logon ID.

  6. We need to test how to get updated notebooks. On the current CSAE platform, after running a notebook the user might respond to the message on closing the notebook by saving it with the outputs. Or, they might be working on legitimate improvements to the notebook. If we automate a "git pull", it will fail when there are local changes to notebooks that GitHub is tracking. If the update is forced instead (e.g. with a "-f" checkout or a hard reset), it will wipe out the legitimate improvements being developed to existing notebooks. This is somehow handled on the current JupyterHub install for Vantage-Live.
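
The release requirement in item 1 could be checked by a provisioning script along these lines. This is only a sketch: the function names and the three result labels are made up for illustration, and it assumes the release string has already been read from the platform (for example from the DBC.DBCInfoV view) in the usual dotted form such as "17.20.03.15".

```python
from typing import Tuple


def parse_release(release: str) -> Tuple[int, int]:
    """Return (major, minor) from a dotted release string like '17.20.03.15'."""
    parts = release.strip().split(".")
    return int(parts[0]), int(parts[1])


def demo_support(release: str) -> str:
    """Classify a platform release against the thresholds in item 1.

    The labels ('full', 'read_nos_only', 'unsupported') are hypothetical
    names used only in this sketch.
    """
    major_minor = parse_release(release)
    if major_minor >= (17, 20):
        return "full"           # ClearScape functionality is available
    if major_minor >= (17, 0):
        return "read_nos_only"  # only the cloud READ_NOS capability
    return "unsupported"


if __name__ == "__main__":
    for rel in ("17.20.03.15", "17.10", "16.20"):
        print(rel, demo_support(rel))
```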
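
For item 6, one cautious option is to pull only when no tracked notebook has local changes, and otherwise skip the refresh and tell the user. The sketch below is an assumption about how that could look, not a description of what Vantage-Live does today. The function names and the `repo_dir` parameter are hypothetical; the git commands used (`git status --porcelain` and `git pull --ff-only`) are standard.

```python
import subprocess
from typing import List


def tracked_changes(porcelain: str) -> List[str]:
    """Return paths of tracked files with local changes.

    `porcelain` is the text printed by `git status --porcelain`, where each
    line is two status characters, a space, then the path.
    """
    changed = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        if status in ("??", "!!"):
            # Untracked or ignored files are not overwritten by a pull.
            continue
        changed.append(path)
    return changed


def refresh_demos(repo_dir: str) -> str:
    """Fast-forward the demo checkout unless tracked files were edited locally."""
    status = subprocess.run(
        ["git", "-C", repo_dir, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    dirty = tracked_changes(status)
    if dirty:
        # Leave the user's work alone; nothing is forced or reset.
        return "skipped: local changes in " + ", ".join(dirty)
    subprocess.run(["git", "-C", repo_dir, "pull", "--ff-only"], check=True)
    return "updated"
```

Skipping rather than forcing keeps saved outputs and in-progress improvements intact. The cost is that a user who has only saved outputs stays on old notebooks until those changes are committed, stashed, or discarded.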

DallasBowden commented 8 months ago

I have attempted to run a couple of use cases with Kevin's Lake environment.