Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

Chapter 5 Data preprocessing #20

Closed by tommy2k0 2 years ago

tommy2k0 commented 4 years ago

Operating system: Windows 10, TensorFlow v2.2.0, TFX v0.22.0

When I run the interactive_pipeline.ipynb Jupyter notebook from the repo unmodified, it fails in the data transform section, at the point where the preprocessing_fn in module.py is executed, with the following error:

RuntimeError: FileNotFoundError: [Errno 2] No such file or directory: 'C:\blmp\tfx\Transform\transform_graph\13\.temp_path\tftransform_tmp\beam-temp-vocab_compute_and_apply_vocabulary_vocabulary-47fb6d70f05611eab03c7ce9d3b592e5\007c4e2d-3739-412a-ae5a-808d1283096e.vocab_compute_and_apply_vocabulary_vocabulary' [while running 'Analyze/VocabularyOrderAndWrite[compute_and_apply_vocabulary/vocabulary]/WriteToFile/Write/WriteImpl/WriteBundles']

The file 007c4e2d-3739-412a-ae5a-808d1283096e.vocab_compute_and_apply_vocabulary_vocabulary is never generated, which is probably why the step fails.
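To narrow down whether only the vocabulary file is missing or one of the temporary directories above it was never created, a small check like the following may help. This is a diagnostic sketch, not part of the notebook or of TFX; the helper name `deepest_existing_ancestor` is hypothetical.

```python
from pathlib import Path


def deepest_existing_ancestor(path):
    """Return the longest leading portion of `path` that exists on disk.

    Hypothetical helper for diagnosis only; it is not part of TFX or Beam.
    """
    current = Path(path)
    # Walk upward one directory at a time until something exists
    # or the top of the path is reached.
    while not current.exists() and current != current.parent:
        current = current.parent
    return current


# Example (path prefix copied from the error message above):
# print(deepest_existing_ancestor(r"C:\blmp\tfx\Transform\transform_graph\13\.temp_path\tftransform_tmp"))
```

If the result stops at a directory higher than the beam-temp folder, the temporary directory itself is missing rather than just the file inside it.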

catherinenelson1 commented 3 years ago

Hi @tommy2k0, I am unable to reproduce this error. If you try running the pipeline again, does the error persist? Occasionally the InteractiveContext becomes corrupted and you will need to delete the tfx folder and restart the pipeline (this behavior doesn't happen when using an orchestrator such as Beam, Airflow or Kubeflow Pipelines).
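A minimal sketch of that cleanup step, assuming the InteractiveContext was created with a pipeline root folder named `tfx` (as the path in the error suggests). The helper name `reset_pipeline_root` is hypothetical, and the TFX lines are left commented so the snippet runs without TFX installed.

```python
import shutil
from pathlib import Path


def reset_pipeline_root(pipeline_root):
    """Delete the artifact folder so the next run starts from a clean state.

    Hypothetical helper; returns True once the folder no longer exists.
    """
    root = Path(pipeline_root)
    if root.exists():
        # Removes every stored component output under the folder.
        shutil.rmtree(root)
    return not root.exists()


# Assumed usage in the notebook (folder name "tfx" is an assumption):
# reset_pipeline_root("tfx")
# from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext
# context = InteractiveContext(pipeline_root="tfx")
# ...then re-run every component cell from the top.
```

Restarting the notebook kernel before re-creating the context is probably also worthwhile, so that no stale component objects point at the deleted folder.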

If you are still getting the error, it may be related to using Windows. I don't have access to a Windows machine to test, so please could you check that you can run this example locally: https://www.tensorflow.org/tfx/tutorials/tfx/components_keras

MERCHA commented 3 years ago

I get the same error as @tommy2k0, and it may be related to using Windows, because the same notebook works fine on Linux.