Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

Kubeflow pipeline example in Chapter 12 is not working #36

jazzsir closed this issue 3 years ago

jazzsir commented 3 years ago

WARNING:absl:Could not find matching artifact class for type 'Examples' (proto: 'name: "Examples"\nproperties {\n  key: "span"\n  value: INT\n}\nproperties {\n  key: "split_names"\n  value: STRING\n}\nproperties {\n  key: "version"\n  value: INT\n}\n'); generating an ephemeral artifact class on-the-fly. If this is not intended, please make sure that the artifact class for this type can be imported within your container or environment where a component is executed to consume this type.
INFO:absl:Running driver for CsvExampleGen
INFO:absl:MetadataStore with gRPC connection initialized
INFO:absl:Adding KFP pod name consumer-complaint-pipeline-kubeflow-cd8hd-1778682510 to execution
INFO:absl:Running executor for CsvExampleGen
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tfx-src/tfx to temp dir /tmp/tmpysdjeff4/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpysdjeff4/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpysdjeff4/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpysdjeff4/build/tfx/dist/tfx_ephemeral-0.22.0.tar.gz to beam args
INFO:absl:Generating examples.
INFO:absl:Using 10 process(es) for Beam pipeline execution.
Traceback (most recent call last):
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 205, in launch
    execution_decision.exec_properties)
  File "/tfx-src/tfx/orchestration/launcher/in_process_component_launcher.py", line 67, in _run_executor
    executor.Do(input_dict, output_dict, exec_properties)
  File "/tfx-src/tfx/components/example_gen/base_example_gen_executor.py", line 234, in Do
    exec_properties)
  File "/tfx-src/tfx/components/example_gen/base_example_gen_executor.py", line 193, in GenerateExamplesByBeam
    | 'SplitData' >> beam.Partition(_PartitionFn, len(buckets), buckets))
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 998, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 562, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 612, in apply
    return self.apply(transform, pvalueish)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 655, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 198, in apply
    return m(transform, input, options)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 228, in apply_PTransform
    return transform.expand(input)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 923, in expand
    return self._fn(pcoll, *args, **kwargs)
  File "/tfx-src/tfx/components/example_gen/base_example_gen_executor.py", line 86, in _InputToSerializedExample
    | 'SerializeDeterministically' >>
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pvalue.py", line 140, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 602, in apply
    transform.transform, pvalueish, label or transform.label)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 612, in apply
    return self.apply(transform, pvalueish)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 655, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 198, in apply
    return m(transform, input, options)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 228, in apply_PTransform
    return transform.expand(input)
  File "/opt/venv/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 923, in expand
    return self._fn(pcoll, *args, **kwargs)
  File "/tfx-src/tfx/components/example_gen/csv_example_gen/executor.py", line 118, in _CsvToExample
    input_base_uri = artifact_utils.get_single_uri(input_dict[INPUT_KEY])
KeyError: 'input'


jazzsir commented 3 years ago

It works after removing the output_config argument from CsvExampleGen().
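
For anyone hitting the same KeyError, the workaround might look roughly like the sketch below. It assumes the TFX 0.22 API (the version shown in the log above), where CsvExampleGen takes an `input` channel built with `external_input`. The function name `build_example_gen` and the `data_dir` parameter are hypothetical and not taken from the book's code. The thread does not establish why dropping output_config avoids the error, so treat this as a workaround rather than a root-cause fix.

```python
def build_example_gen(data_dir):
    """Return a CsvExampleGen component without an output_config argument.

    data_dir is a hypothetical path to the folder holding the CSV file(s).
    """
    # Imported inside the function so the module can be loaded even where
    # TFX is not installed (assumption: TFX 0.22 module layout).
    from tfx.components import CsvExampleGen
    from tfx.utils.dsl_utils import external_input

    # No output_config is passed, so ExampleGen falls back to its default
    # split configuration (train and eval).
    return CsvExampleGen(input=external_input(data_dir))
```

In a pipeline definition this would be called as `example_gen = build_example_gen(data_dir)` and the result passed along with the other components. Note that without output_config any custom split ratios are lost, so this only fits if the default train/eval split is acceptable.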