In your dsl.pipeline, is metadata_connection_config still set?
Here is an example we have
I'm facing the same error using a variant of the bigquery_ml example. In my variant, the import statements and most of the code in kubeflow_dag_runner.py and the pipeline-creation function are the same as in the example. There are a few differences for the ExampleGen component, which reads an existing BQ table rather than the usual Taxi upload. I've also changed the file structure a bit, based on structures used in the 0.26.0 documentation. I also ran "pip freeze > requirements.txt" from within the tensorflow/tfx:1.0.0 image and made sure my environment matched its versions of TFX, tensorboard, requests, and kfp-pipeline-spec.
(I'm trying to verify which other packages have substantial differences, but those seemed like the main ones.)
Perhaps #4179 is relevant here too.
Thanks to @1025KB's linked example I was able to get it working. I believe the fix was to use:
tfx.orchestration.experimental.kubeflow...
rather than:
tfx.orchestration.kubeflow...
So I now have:
metadata_config = tfx.orchestration.experimental.get_default_kubeflow_metadata_config()
runner_config = tfx.orchestration.experimental.KubeflowDagRunnerConfig(
    kubeflow_metadata_config=metadata_config,
    tfx_image=BASE_IMAGE)
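For completeness, the runner invocation at the bottom of the file then looks roughly like this (just a sketch; create_pipeline() is a placeholder for your own pipeline factory):

# create_pipeline() stands in for your own function returning a tfx.dsl.Pipeline;
# running the module emits the KFP pipeline package for upload.
tfx.orchestration.experimental.KubeflowDagRunner(config=runner_config).run(
    create_pipeline())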
I'm not convinced this doesn't indicate a deeper problem, since I'd expect the non-experimental version to work, but this did get it running. Hope that helps @SMesser
Thanks @nroberts1 , but I still have the same errors. The first variant I tried was:
from tfx import v1 as tfx
# Then reference the following
tfx.orchestration.experimental.get_default_kubeflow_metadata_config
tfx.orchestration.experimental.KubeflowDagRunner
tfx.orchestration.experimental.KubeflowDagRunnerConfig
The above and the first of the following both still gave the "metadata_connection_config is expected to be in type ..." error. Other things I tried:
from tfx.v1.orchestration.experimental import get_default_kubeflow_metadata_config, KubeflowDagRunner, KubeflowDagRunnerConfig
# Source code also mentions a couple of "V2" classes, and their setup had to be tweaked in a couple of obvious ways,
# but I eventually got the error "Cannot find KubeflowDagRunner.run() in kubeflow_dag_runner.py",
# so I assume these are not yet part of the public API:
from tfx.v1.orchestration.experimental import get_default_kubeflow_metadata_config, KubeflowV2DagRunner, KubeflowV2DagRunnerConfig
# The following are all import errors
from tfx.orchestration.experimental import get_default_kubeflow_metadata_config
from tfx.orchestration.experimental.kubeflow import get_detault_kubeflow_metadata_config
from tfx.v1.orchestration.experimental.kubeflow import get_default_kubeflow_metadata_config, KubeflowDagRunner, KubeflowDagRunnerConfig
I also tried replacing tfx.orchestration.pipeline.Pipeline with tfx.v1.dsl.Pipeline, as per @1025KB's suggestion. No joy.
I have used this approach since TFX 0.24 and it still works in TFX 1.2.0:
from tfx.orchestration.kubeflow import kubeflow_dag_runner
...
metadata_config = kubeflow_dag_runner.get_default_kubeflow_metadata_config()
# Default values when running KFP in KubeFlow
metadata_config.grpc_config.grpc_service_host.value = 'metadata-grpc-service.kubeflow'
metadata_config.grpc_config.grpc_service_port.value = '8080'
runner_config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    kubeflow_metadata_config=metadata_config,
    tfx_image=<IMAGE>,  # your pipeline container image
    pipeline_operator_funcs=(...)  # your pod customizations
)
kubeflow_dag_runner.KubeflowDagRunner(config=runner_config).run(p.create_pipeline())
@ConverJens How are you running the pipeline? I've been using a Bash script to set a bunch of environment variables and combine the "tfx pipeline create" and "tfx run create" commands from the TFX CLI.

I've tried a bunch of alternate ways of setting up the pipeline itself. In some cases I can exchange the metadata-config-is-wrong-type error for a missing-required-arguments error, but both fail at container launch no matter what other changes I make to the code. I've tried your customization of the metadata config; alternate import statements for Pipeline, TrainArgs, EvalArgs, SplitConfig, KubeflowDagRunner, KubeflowDagRunnerConfig, and other classes and functions; and yet the only working example I've found was running on another VM and bypassing the TFX CLI. I've made attempts in two different GCP environments. I've also tried a couple of TFX's own examples, but none of those have worked for me either, despite the pipeline working fine under TFX 0.26.0.
My invocation is:
tfx pipeline create --pipeline_path kubeflow_dag_runner.py --endpoint deadbeef0123456-hash-east1.pipelines.googleusercontent.com --build_image --engine kubeflow && tfx run create --pipeline_name test_run_666 --engine kubeflow --endpoint deadbeef0123456-hash-east1.pipelines.googleusercontent.com
@SMesser That's probably the reason: I don't use the TFX CLI, I never liked it. Instead, I have automated the compile and upload to KubeFlow in our CI/CD flow along with building a base image to use for the pipeline. Once the pipeline is available in KF, I usually trigger it remotely by calling the KFP API via REST.
Other than that, I have no idea why it doesn't work for you. Can you post the error message you get? It seems you are running in GCP, right? Maybe your metadata gRPC service has a different path/port?
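For reference, a minimal sketch of that flow with the kfp SDK instead of the TFX CLI (the host, package name, and run name are placeholders; this assumes KubeflowDagRunner has already written its .tar.gz package as in the snippet above):

import kfp

# Connect to the KFP endpoint (placeholder host).
client = kfp.Client(host='https://<your-kfp-endpoint>.pipelines.googleusercontent.com')

# Upload and run the package produced by KubeflowDagRunner().run(...).
client.create_run_from_pipeline_package(
    pipeline_file='my_pipeline.tar.gz',  # placeholder package name
    arguments={},
    run_name='manual-trigger-1')         # placeholder run name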
When I run a slightly modified version of the bigquery_ml example, I get this error. Note this is the entire log from the Kubeflow UI. It appears at each component if I have multiple independent initial components:
2021-08-24 20:56:19.571778: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 470, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 428, in main
    deployment_config = runner_utils.extract_local_deployment_config(tfx_ir)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/local/runner_utils.py", line 39, in extract_local_deployment_config
    return _to_local_deployment(result)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/local/runner_utils.py", line 99, in _to_local_deployment
    input_config.metadata_connection_config.type_url))
ValueError: metadata_connection_config is expected to be in type ml_metadata.ConnectionConfig, but got type type.googleapis.com/tfx.orchestration.kubeflow.proto.KubeflowMetadataConfig
That's the error message which got me involved in this issue. I've gotten other error messages under different conditions, but am only now trying alternate ways of running the pipeline. (Nothing explicitly says it's the CLI itself which is failing, so I just assumed my code was bad...) The following are less relevant to the specific error message, but maybe they're relevant to debugging the CLI or identifying oddities in my setup.
If I go for a direct implementation of the bigquery_ml example, instead of making changes to reference our DB and file structure for things like the preprocessing_fn(), I get this instead:
WARNING:absl:metadata_connection_config is not provided by IR.
INFO:absl:tensorflow_ranking is not available: No module named 'tensorflow_ranking'
INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text'
INFO:apache_beam.typehints.native_type_compatibility:Using Any for unsupported type: typing.MutableMapping[str, typing.Any]
[previous line repeated 71 more times]
INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text'
INFO:root:Component BigQueryExampleGen is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.extensions.google_cloud_big_query.example_gen.component.BigQueryExampleGen"
  }
  id: "BigQueryExampleGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "chicago_taxi_pipeline_kubeflow_gcp"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "chicago-taxi-pipeline-kubeflow-gcp-p4pfw"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "chicago_taxi_pipeline_kubeflow_gcp.BigQueryExampleGen"
      }
    }
  }
}
outputs {
  outputs {
    key: "examples"
    value {
      artifact_spec {
        type {
          name: "Examples"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
          properties {
            key: "version"
            value: INT
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "input_config"
    value {
      field_value {
        string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"\\n SELECT\\n IFNULL(pickup_community_area, 0) as pickup_community_area,\\n fare,\\n EXTRACT(MONTH FROM trip_start_timestamp) AS trip_start_month,\\n EXTRACT(HOUR FROM trip_start_timestamp) AS trip_start_hour,\\n EXTRACT(DAYOFWEEK FROM trip_start_timestamp) AS trip_start_day,\\n UNIX_SECONDS(trip_start_timestamp) AS trip_start_timestamp,\\n IFNULL(pickup_latitude, 0) as pickup_latitude,\\n IFNULL(pickup_longitude, 0) as pickup_longitude,\\n IFNULL(dropoff_latitude, 0) as dropoff_latitude,\\n IFNULL(dropoff_longitude, 0) as dropoff_longitude,\\n trip_miles,\\n IFNULL(pickup_census_tract, 0) as pickup_census_tract,\\n IFNULL(dropoff_census_tract, 0) as dropoff_census_tract,\\n payment_type,\\n IFNULL(company, \'NA\') as company,\\n IFNULL(trip_seconds, 0) as trip_seconds,\\n IFNULL(dropoff_community_area, 0) as dropoff_community_area,\\n tips\\n FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips`\\n WHERE (ABS(FARM_FINGERPRINT(unique_key)) / 0x7FFFFFFFFFFFFFFF)\\n < 0.001\"\n }\n ]\n}"
      }
    }
  }
  parameters {
    key: "output_config"
    value {
      field_value {
        string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}"
      }
    }
  }
  parameters {
    key: "output_data_format"
    value {
      field_value {
        int_value: 6
      }
    }
  }
}
downstream_nodes: "Evaluator"
downstream_nodes: "ModelValidator"
downstream_nodes: "StatisticsGen"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with gRPC connection initialized
WARNING:absl:Conflicting properties comparing with existing metadata type with the same type name. Existing type: id: 2
name: "pipeline"
properties {
  key: "pipeline_name"
  value: STRING
}
, New type: name: "pipeline"

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py", line 199, in _call_method
    response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
  File "/opt/conda/lib/python3.7/site-packages/grpc/_channel.py", line 923, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/opt/conda/lib/python3.7/site-packages/grpc/_channel.py", line 826, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
  status = StatusCode.ALREADY_EXISTS
  details = "Type already exists with different properties."
  debug_error_string = "{"created":"@1629927950.896293052","description":"Error received from peer ipv4:10.83.251.115:8080","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"Type already exists with different properties.","grpc_status":6}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/common_utils.py", line 76, in register_type_if_not_exist
    metadata_type, can_add_fields=True, can_omit_fields=True)
  File "/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py", line 506, in put_context_type
    self._call('PutContextType', request, response)
  File "/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py", line 174, in _call
    return self._call_method(method_name, request, response)
  File "/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py", line 204, in _call_method
    raise _make_exception(e.details(), e.code().value[0])  # pytype: disable=attribute-error
ml_metadata.errors.AlreadyExistsError: Type already exists with different properties.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 470, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 462, in main
    execution_info = component_launcher.launch()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py", line 465, in launch
    execution_preparation_result = self._prepare_execution()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py", line 251, in _prepare_execution
    metadata_handler=m, node_contexts=self._pipeline_node.contexts)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/context_lib.py", line 153, in prepare_contexts
    for context_spec in node_contexts.contexts
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/context_lib.py", line 153, in <listcomp>
    for context_spec in node_contexts.contexts
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/context_lib.py", line 91, in _register_context_if_not_exist
    metadata_handler=metadata_handler, context_spec=context_spec)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/context_lib.py", line 47, in _generate_context_proto
    metadata_handler, context_spec.type)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/portable/mlmd/common_utils.py", line 89, in register_type_if_not_exist
    raise RuntimeError(warning_str)
RuntimeError: Conflicting properties comparing with existing metadata type with the same type name. Existing type: id: 2
name: "pipeline"
properties {
  key: "pipeline_name"
  value: STRING
}
, New type: name: "pipeline"
If I run the code we had working in TFX 0.26, I get the following:
2021-08-26 17:17:47.330922: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
usage: container_entrypoint.py [-h] --pipeline_name PIPELINE_NAME
                               --pipeline_root PIPELINE_ROOT
                               --kubeflow_metadata_config KUBEFLOW_METADATA_CONFIG
                               --beam_pipeline_args BEAM_PIPELINE_ARGS
                               --additional_pipeline_args ADDITIONAL_PIPELINE_ARGS
                               --component_launcher_class_path COMPONENT_LAUNCHER_CLASS_PATH
                               [--enable_cache]
                               --serialized_component SERIALIZED_COMPONENT
                               --component_config COMPONENT_CONFIG
container_entrypoint.py: error: the following arguments are required: --pipeline_name, --beam_pipeline_args, --additional_pipeline_args, --component_launcher_class_path, --component_config
The last was an outgrowth of #2180 , and my post announcing victory there seems to have been premature: I got one working run, then it failed a few days later with no changes to the code or input parameters. So at this point I'm not even sure whether I was doing a proper test of TFX 1.0, whether I ran a 0.26 version of the code by mistake, or whether some versioning changed in what we have deployed for GCP / Kubeflow / ...

These errors have persisted through a lot of changes, though I can't yet account for why I get different errors under different conditions. Other contextual bits I'm having trouble controlling for or avoiding: I'm running from within a GCP AI Platform notebook, and I'm using Poetry to manage Python dependencies.
@SMesser I have never experienced the first error myself but from the example pipeline posted by @1025KB , I can see a difference compared to my code. The posted example uses:
tfx.dsl.Pipeline(...
while I use:
...
pipeline.Pipeline(...
to create the pipeline object that is later compiled. Apart from that, I have no idea.
The second issue, on the other hand, has to do with conflicting entries in MLMD, which can occur for several reasons; one of them is running an older TFX version against an MLMD database that a newer version has already migrated.
I recently did exactly that and ended up with that error message. It appears there was a breaking change in the MLMD schema version with the 1.0.0 release, causing rollback to prior versions to fail. I cleaned my MLMD DB and re-ran the pre-1.0.0 version, and it worked again.
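If you want to confirm the conflict before wiping anything, you can inspect the registered type directly in MLMD. A minimal sketch, assuming the in-cluster gRPC service host/port used earlier in this thread (from outside the cluster you would need a port-forward):

from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2

# gRPC connection to the MLMD service (host/port as in the runner snippet above).
config = metadata_store_pb2.MetadataStoreClientConfig(
    host='metadata-grpc-service.kubeflow', port=8080)
store = metadata_store.MetadataStore(config)

# The failing type in the traceback is the 'pipeline' context type;
# compare its properties against what the new TFX version tries to register.
print(store.get_context_type('pipeline'))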
If you are running from within a notebook, I would first try the components using the interactive runner, which just runs them in the notebook (see the sketch below). By default it uses a SQLite DB, but you can pass an MLMD config and access the real thing. This would let you run small BigQuery data sets until the components work as expected, and then move on to the TFX CLI.
That's how I have ruled out bad code on my own side when the issue was really in KFP or TFX's Kubeflow-specific code. You might also want to try the local or Beam runner.
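A minimal sketch of that interactive approach (the query and Beam args are placeholders, and I believe context.run() accepts beam_pipeline_args in recent TFX versions, but check yours):

from tfx import v1 as tfx
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

# Uses a throwaway SQLite-backed MLMD store by default.
context = InteractiveContext()

# Placeholder query: keep the data set small while debugging.
example_gen = tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
    query='SELECT * FROM `my-project.my_dataset.my_table` LIMIT 1000')

context.run(example_gen, beam_pipeline_args=[
    '--project=my-gcp-project',            # placeholder GCP project
    '--temp_location=gs://my-bucket/tmp',  # placeholder staging bucket
])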
I think I've tried all those things except running from something other than the TFX CLI... working on that now.
The Chicago Taxi example runs fine in the Colab notebook with TFX 1.2.0, but its code also seems to be updated, whereas the examples here still have code like "from tfx.proto import pusher_pb2". So I think you might be able to reverse-engineer a working example from the interactive notebook.
@1025KB @rcrowe-google Can you help to coordinate relevant imports?
Given that the last comment is over a year old I'm closing this bug.
Using:
When I run the pipeline in TFX 0.30.0, the pipeline runs fine, but once I update to 1.0.0 it fails with:
ValueError: metadata_connection_config is expected to be in type ml_metadata.ConnectionConfig, but got type type.googleapis.com/tfx.orchestration.kubeflow.proto.KubeflowMetadataConfig
Stacktrace from pod's log:
Kubeflow is running on GCP
0.30.0:
1.0.0:
Obviously the BeamDagRunner uses a metadata_connection_config of type ml_metadata.ConnectionConfig, as that works fine in both 0.30.0 and 1.0.0.
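For anyone landing here, a minimal sketch of that Beam path, which passes a real ml_metadata.ConnectionConfig (here a SQLite one; the names and paths are placeholders):

from tfx.orchestration import metadata, pipeline
from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner

p = pipeline.Pipeline(
    pipeline_name='debug_pipeline',            # placeholder name
    pipeline_root='/tmp/debug_pipeline_root',  # placeholder root
    components=[],  # add your components here
    metadata_connection_config=metadata.sqlite_metadata_connection_config(
        '/tmp/debug_pipeline_root/metadata.db'))  # plain ml_metadata.ConnectionConfig

BeamDagRunner().run(p)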