Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

Kubeflow pipeline example in Chapter 12 is not working #37

Closed (jazzsir closed this issue 3 years ago)

jazzsir commented 3 years ago

2020-10-09 13:45:14.925525: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib
2020-10-09 13:45:14.925569: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO:absl:Running driver for StatisticsGen
INFO:absl:MetadataStore with gRPC connection initialized
INFO:absl:Adding KFP pod name consumer-complaint-pipeline-kubeflow-hzrjh-887385430 to execution
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 171, in _call_method
    response.CopyFrom(grpc_method(request))
  File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 826, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.ALREADY_EXISTS
    details = "Type already exists with different properties."
    debug_error_string = "{"created":"@1602251118.007945649","description":"Error received from peer ipv4:10.106.131.168:8080","file":"src/core/lib/surface/call.cc","file_line":1061,"grpc_message":"Type already exists with different properties.","grpc_status":6}"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 197, in launch
    self._exec_properties)
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 166, in _run_driver
    component_info=self._component_info)
  File "/tfx-src/tfx/components/base/base_driver.py", line 330, in pre_execution
    contexts=contexts)
  File "/tfx-src/tfx/orchestration/metadata.py", line 599, in update_execution
    registered_artifacts_ids=registered_output_artifact_ids))
  File "/tfx-src/tfx/orchestration/metadata.py", line 538, in _artifact_and_event_pairs
    a.set_mlmd_artifact_type(self._prepare_artifact_type(a.artifact_type))
  File "/tfx-src/tfx/orchestration/metadata.py", line 185, in _prepare_artifact_type
    artifact_type=artifact_type, can_add_fields=True)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 282, in put_artifact_type
    self._call('PutArtifactType', request, response)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 146, in _call
    return self._call_method(method_name, request, response)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 176, in _call_method
    raise _make_exception(e.details(), e.code().value[0])  # pytype: disable=attribute-error
ml_metadata.errors.AlreadyExistsError: Type already exists with different properties.

jazzsir commented 3 years ago

This happens because MLMD does not support mixing different TFX versions: the metadata store already held types registered by another TFX version. I resolved it by reinstalling Kubeflow Metadata.
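
For anyone hitting the same `AlreadyExistsError`, the sketch below illustrates the kind of mismatch that likely triggers it. It is a simplified model under stated assumptions, not MLMD's actual implementation: it assumes that a type update may add new properties but may not drop a stored property or change its value type. The function name `find_type_conflicts` and the property sets are hypothetical and were not taken from the failing pipeline.

```python
from typing import Dict, List


def find_type_conflicts(stored: Dict[str, str], incoming: Dict[str, str]) -> List[str]:
    """Return reasons why `incoming` cannot update the `stored` type.

    Simplified model (an assumption, not MLMD's code): new properties may be
    added, but a stored property may not be dropped or change its value type.
    """
    conflicts = []
    for name, value_type in sorted(stored.items()):
        if name not in incoming:
            conflicts.append(f"property '{name}' is missing from the incoming type")
        elif incoming[name] != value_type:
            conflicts.append(
                f"property '{name}' changed from {value_type} to {incoming[name]}"
            )
    return conflicts


# Hypothetical property sets for one artifact type name, as two different
# TFX versions might register it. The names and value types are made up.
stored_type = {"span": "INT", "split_names": "STRING"}
incoming_type = {"span": "STRING", "split_names": "STRING", "version": "INT"}

for reason in find_type_conflicts(stored_type, incoming_type):
    print(reason)
```

In this model the added `version` property is accepted, while the changed value type of `span` is reported as a conflict. A store populated by one TFX version can therefore reject a registration from another, which is consistent with a fresh metadata store (or a matching TFX version) making the error go away.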

hanneshapke commented 3 years ago

Hi @jazzsir, thank you for pointing out the dependency on the exact same TFX version!