Open shaofengshi opened 3 months ago
Another dependency of "gravitino-fileset-example.ipynb" is the "hdfs" Python library:
"pip install hdfs"
It also needs gravitino: "pip install gravitino"
I see that the Jupyter notebook uses the "jupyter/minimal-notebook" image, which we do not build ourselves, so we cannot pre-install our dependencies unless we build a new image.
Today, in the Jupyter notebook for Trino, the first step is to install the trino and pandas libraries; see https://github.com/apache/gravitino-playground/blob/main/init/jupyter/gravitino-trino-example.ipynb. This step needs internet access, so users with network problems may fail here, which blocks their evaluation. Besides, after this step runs, Jupyter reports that you may need to restart the kernel ("Note: you may need to restart the kernel to use updated packages."), which is confusing.
To improve the user experience, we can install these dependencies while building the Docker image.
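If the image ships with the dependencies, the install cell in each notebook could become a guard instead of an unconditional "pip install". Below is a minimal sketch of such a cell, not the playground's actual code. The helper names (missing_packages, ensure_installed) and the REQUIRED list are hypothetical, and the sketch assumes each package's import name matches its pip name.

```python
import importlib.util
import subprocess
import sys

# Packages mentioned in this issue (assumed list; adjust per notebook).
REQUIRED = ["trino", "pandas", "hdfs", "gravitino"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ensure_installed(names):
    """Install only the missing packages and return the list of what was missing.

    When everything is already present (for example, pre-installed in the
    image), pip is never invoked, so no network access is needed and no
    kernel-restart notice is printed.
    """
    missing = missing_packages(names)
    if missing:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *missing])
    return missing
```

In a notebook, the first cell would then call ensure_installed(REQUIRED). With a pre-built image this does nothing, and with the stock image it still falls back to installing over the network.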