elyra-ai / elyra

Elyra extends JupyterLab with an AI-centric approach.
https://elyra.readthedocs.io/en/stable/
Apache License 2.0

bootstrapper cannot handle input with wildcards #995

Open ptitzler opened 4 years ago

ptitzler commented 4 years ago

version 1.3

Result:

[I 23:08:27.715] 'test_load_viz-1023160715':'Data_Viz' - downloaded Data_Viz-2289c970-b214-418b-bf93-68d880326eb0.tar.gz from bucket: pipeline-artifacts, object: test_load_viz-1023160715/Data_Viz-2289c970-b214-418b-bf93-68d880326eb0.tar.gz (0.042 secs)
Traceback (most recent call last):
  File "bootstrapper.py", line 402, in <module>
    main()
  File "bootstrapper.py", line 393, in main
    file_op.process_dependencies()
  File "bootstrapper.py", line 97, in process_dependencies
    self.get_file_from_object_storage(file.strip())
  File "bootstrapper.py", line 137, in get_file_from_object_storage
    self.cos_client.fget_object(bucket_name=self.cos_bucket,
  File "/usr/local/lib/python3.8/site-packages/minio/api.py", line 719, in fget_object
    stat = self.stat_object(bucket_name, object_name, sse)
  File "/usr/local/lib/python3.8/site-packages/minio/api.py", line 1138, in stat_object
    response = self._url_open('HEAD', bucket_name=bucket_name,
  File "/usr/local/lib/python3.8/site-packages/minio/api.py", line 2017, in _url_open
    raise ResponseError(response,
minio.error.NoSuchKey: NoSuchKey: message: The specified key does not exist.

The corresponding lines of code are

        if inputs:
            input_list = inputs.split(INOUT_SEPARATOR)
            for file in input_list:
                self.get_file_from_object_storage(file.strip())  # <------ FAIL
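
The traceback above does not name the input that failed. A minimal sketch of how the loop could report it follows. This is not the bootstrapper's actual code: `fetch_inputs` and `fetch_one` are hypothetical names (`fetch_one` stands in for `self.get_file_from_object_storage`), and the value of `INOUT_SEPARATOR` is an assumption because it is not shown in this issue.

```python
import logging

logger = logging.getLogger("bootstrapper-sketch")

# Assumption: the real separator value is not shown in this issue.
INOUT_SEPARATOR = ";"


def fetch_inputs(inputs, fetch_one):
    """Fetch each declared input and name the failing entry if a download raises.

    `fetch_one` is a hypothetical stand-in for
    `self.get_file_from_object_storage`.
    """
    if not inputs:
        return []
    fetched = []
    for entry in inputs.split(INOUT_SEPARATOR):
        name = entry.strip()
        try:
            fetch_one(name)
        except Exception as ex:
            logger.error("Failed to download input '%s': %s", name, ex)
            # Re-raise with the input name attached so it appears in the traceback.
            raise RuntimeError(
                f"Could not retrieve input '{name}' from object storage"
            ) from ex
        fetched.append(name)
    return fetched
```

With a wrapper like this, the failing declaration would show up directly in the log instead of requiring a pipeline export.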

To figure out which input file couldn't be processed, I had to export the pipeline and inspect the generated bootstrapper invocation:

  - name: data-viz
    container:
      args: ['mkdir -p ./jupyter-work-dir/ && cd ./jupyter-work-dir/ && curl -H "Cache-Control:
          no-cache" -L https://raw.githubusercontent.com/elyra-ai/kfp-notebook/v0.13.0/etc/docker-scripts/bootstrapper.py
          --output bootstrapper.py && curl -H "Cache-Control: no-cache" -L https://raw.githubusercontent.com/elyra-ai/kfp-notebook/v0.13.0/etc/requirements-elyra.txt
          --output requirements-elyra.txt && python3 -m pip install  packaging &&
          python3 -m pip freeze > requirements-current.txt && python3 bootstrapper.py
          --cos-endpoint http://devises1.fyre.ibm.com:31323 --cos-bucket pipeline-artifacts
          --cos-directory "test_load_viz" --cos-dependencies-archive "Data_Viz-2289c970-b214-418b-bf93-68d880326eb0.tar.gz"
          --file "XAI/Elyra Build/Data_Viz.ipynb" --inputs "data/bank-additional/*" ']

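--inputs is set to data/bank-additional/*, which seems to cause the failure: the wildcard is not expanded before the name reaches fget_object, and no object with that literal key exists (hence NoSuchKey).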
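
One possible direction for a fix is to expand wildcard declarations against the bucket contents before downloading. The sketch below is only an illustration under stated assumptions, not the bootstrapper's implementation: `expand_input` is a hypothetical helper, the object names are made up, and it assumes inputs are stored under the `--cos-directory` prefix. The list of keys would have to come from a bucket listing (for example the Minio client's `list_objects` with a prefix), which is left out here so the example runs without a server.

```python
from fnmatch import fnmatchcase

WILDCARD_CHARS = ("*", "?")


def expand_input(pattern, object_names, cos_directory):
    """Return the names, relative to cos_directory, matched by an input declaration.

    Hypothetical helper. `object_names` is an iterable of keys already listed
    from the bucket. Note that fnmatch's `*` also matches `/`, so a pattern
    such as `data/*` matches objects in nested "directories" as well.
    """
    if not any(ch in pattern for ch in WILDCARD_CHARS):
        # Plain name: nothing to expand, keep the declaration as-is.
        return [pattern]
    prefix = cos_directory.rstrip("/") + "/"
    matches = []
    for key in object_names:
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        if fnmatchcase(relative, pattern):
            matches.append(relative)
    return sorted(matches)


# Usage with made-up object names:
keys = [
    "test_load_viz-1023160715/data/bank-additional/a.csv",
    "test_load_viz-1023160715/data/bank-additional/b.csv",
    "test_load_viz-1023160715/archive.tar.gz",
]
names = expand_input("data/bank-additional/*", keys, "test_load_viz-1023160715")
```

Each returned name could then be passed to the existing download call one at a time. An empty result would mean the pattern matched nothing, which is worth reporting explicitly rather than silently skipping.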

Issues:

ptitzler commented 4 years ago

Temporary workaround: don't use wildcards in output file declarations.