Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

context.run(statistics_gen) interactive pipeline throwing error #48

Status: Closed (shazamkash closed this issue 3 years ago)

shazamkash commented 3 years ago

Environment: Docker container running the image tensorflow/tensorflow:2.2.1-gpu-py3-jupyter, with tensorflow==2.2.1 and tfx==0.22.0.

Running the interactive pipeline notebook throws the following error at context.run(statistics_gen):

ERROR

TypeCheckError: Type hint violation for 'ToTopKTuples': requires FrozenSet[FeaturePath] but got Set[Any] for bytes_features

Full type hint:
IOTypeHints[inputs=((Tuple[Union[NoneType, bytes, str], RecordBatch], FrozenSet[FeaturePath], FrozenSet[FeaturePath], Union[NoneType, str]), {}), outputs=((Tuple[Tuple[Union[NoneType, bytes, str], Tuple[Union[bytes, str], ...], Any], Union[Tuple[int, Union[float, int]], int]],), {})]
strip_iterable()

based on:
IOTypeHints[inputs=((Tuple[Union[NoneType, bytes, str], RecordBatch], FrozenSet[FeaturePath], FrozenSet[FeaturePath], Union[NoneType, str]), {}), outputs=((Iterable[Tuple[Tuple[Union[NoneType, bytes, str], Tuple[Union[bytes, str], ...], Any], Union[Tuple[int, Union[float, int]], int]]],), {})]
from_callable(_to_topk_tuples)
signature: (sliced_record_batch:Tuple[Union[str, bytes, NoneType], pyarrow.lib.RecordBatch], bytes_features:FrozenSet[tensorflow_data_validation.types.FeaturePath], categorical_features:FrozenSet[tensorflow_data_validation.types.FeaturePath], weight_feature:Union[str, NoneType]) -> Iterable[Tuple[Tuple[Union[str, bytes, NoneType], Tuple[Union[bytes, str], ...], Any], Union[int, Tuple[int, Union[int, float]]]]]
File "/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/statistics/generators/top_k_uniques_stats_generator.py", line 202
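
For readers puzzling over the message: it reports that the argument bound to bytes_features was seen as Set[Any], while the annotation on _to_topk_tuples asks for FrozenSet[FeaturePath]. The sketch below uses plain Python only (no Beam or TFDV) to show that a set does not satisfy a frozenset requirement. The function require_frozenset and the feature names are hypothetical and are not part of either library; this is an illustration of the type distinction, not a reproduction of the checker.

```python
from typing import FrozenSet


def require_frozenset(features: FrozenSet[str]) -> FrozenSet[str]:
    """Hypothetical strict check: accept only an immutable frozenset."""
    if not isinstance(features, frozenset):
        raise TypeError(
            f"requires frozenset but got {type(features).__name__}"
        )
    return features


# A mutable set is a different type from frozenset, so the check rejects it.
mutable_features = {"feature_a", "feature_b"}
try:
    require_frozenset(mutable_features)
except TypeError as err:
    print(err)

# Converting to frozenset satisfies the check.
checked = require_frozenset(frozenset(mutable_features))
print(sorted(checked))
```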

shazamkash commented 3 years ago

I was able to resolve the error above by upgrading to the latest versions of tensorflow and tfx: tensorflow==2.4.1 and tfx==0.27.0.
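
A quick way to confirm that a container actually has the versions mentioned above is to compare the installed distributions against those pins before opening the notebook. The following is a minimal sketch, not part of the book's repository: the helper names installed_version and find_mismatches and the KNOWN_GOOD mapping are made up for illustration, and the distribution name may differ (for example tensorflow-gpu) depending on how TensorFlow was installed.

```python
from typing import Dict, List, Optional

# Version pins reported as working in this thread.
KNOWN_GOOD = {"tensorflow": "2.4.1", "tfx": "0.27.0"}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for older interpreters such as 3.6

    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def find_mismatches(
    installed: Dict[str, Optional[str]],
    expected: Dict[str, str] = KNOWN_GOOD,
) -> List[str]:
    """List every package whose installed version differs from the pin."""
    problems = []
    for name, want in expected.items():
        have = installed.get(name)
        if have != want:
            problems.append(
                f"{name}: expected {want}, found {have or 'not installed'}"
            )
    return problems


if __name__ == "__main__":
    current = {name: installed_version(name) for name in KNOWN_GOOD}
    for line in find_mismatches(current) or ["all pinned versions match"]:
        print(line)
```

Run inside the container, this prints one line per mismatched package, or a single confirmation line when both pins match.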