tensorflow / tfx

TFX is an end-to-end platform for deploying production ML pipelines
https://tensorflow.org/tfx
Apache License 2.0

Building a TFX Pipeline Locally - Failure to create directory #3993

Closed DuncBegg closed 3 years ago

DuncBegg commented 3 years ago

System information

Environment: Windows 10 Home
TensorFlow version: 2.4.2
TFX version: 0.30.0
Python version: 3.8.10
IDE: Visual Studio Code

C:\Users\dunca>pip freeze absl-py==0.12.0 apache-beam==2.30.0 argon2-cffi==20.1.0 astunparse==1.6.3 async-generator==1.10 attrs==20.3.0 avro-python3==1.9.2.1 backcall==0.2.0 bleach==3.3.0 cachetools==4.2.2 certifi==2021.5.30 cffi==1.14.5 chardet==4.0.0 click==7.1.2 colorama==0.4.4 crcmod==1.7 cycler==0.10.0 decorator==5.0.9 defusedxml==0.7.1 dill==0.3.1.1 docker==4.4.4 docopt==0.6.2 entrypoints==0.3 fastavro==1.4.1 fasteners==0.16.3 flatbuffers==1.12 future==0.18.2 gast==0.3.3 google-api-core==1.30.0 google-api-python-client==1.12.8 google-apitools==0.5.31 google-auth==1.32.0 google-auth-httplib2==0.1.0 google-auth-oauthlib==0.4.4 google-cloud-aiplatform==0.7.1 google-cloud-bigquery==2.20.0 google-cloud-bigtable==1.7.0 google-cloud-core==1.7.0 google-cloud-datastore==1.15.3 google-cloud-dlp==1.0.0 google-cloud-language==1.3.0 google-cloud-pubsub==1.7.0 google-cloud-spanner==1.19.1 google-cloud-storage==1.39.0 google-cloud-videointelligence==1.16.1 google-cloud-vision==1.0.0 google-crc32c==1.1.2 google-pasta==0.2.0 google-resumable-media==1.3.1 googleapis-common-protos==1.53.0 grpc-google-iam-v1==0.12.3 grpcio==1.32.0 grpcio-gcp==0.2.2 h5py==2.10.0 hdfs==2.6.0 httplib2==0.19.1 idna==2.10 ipykernel==5.5.5 ipython==7.24.1 ipython-genutils==0.2.0 ipywidgets==7.6.3 jedi==0.18.0 Jinja2==2.11.3 joblib==0.14.1 jsonschema==3.2.0 jupyter-client==6.1.12 jupyter-core==4.7.1 jupyterlab-pygments==0.1.2 jupyterlab-widgets==1.0.0 keras-nightly==2.5.0.dev2021032900 Keras-Preprocessing==1.1.2 keras-tuner==1.0.1 kiwisolver==1.3.1 kubernetes==11.0.0 Markdown==3.3.4 MarkupSafe==2.0.1 matplotlib==3.4.2 matplotlib-inline==0.1.2 mistune==0.8.4 ml-metadata==0.30.0 ml-pipelines-sdk==0.30.0 nbclient==0.5.3 nbconvert==6.1.0 nbformat==5.1.3 nest-asyncio==1.5.1 notebook==6.4.0 numpy==1.19.5 oauth2client==4.1.3 oauthlib==3.1.1 opt-einsum==3.3.0 packaging==20.9 pandas==1.2.4 pandocfilters==1.4.3 parso==0.8.2 pathlib==1.0.1 pickleshare==0.7.5 Pillow==8.2.0 portpicker==1.4.0 
prometheus-client==0.11.0 prompt-toolkit==3.0.19 proto-plus==1.18.1 protobuf==3.17.3 pyarrow==2.0.0 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycparser==2.20 pydot==1.4.2 Pygments==2.9.0 pymongo==3.11.4 pyparsing==2.4.7 pyrsistent==0.17.3 python-dateutil==2.8.1 pytz==2021.1 pywin32==227 pywinpty==1.1.3 PyYAML==5.4.1 pyzmq==22.1.0 requests==2.25.1 requests-oauthlib==1.3.0 rsa==4.7.2 scikit-learn==0.24.2 scipy==1.7.0 seaborn==0.11.1 Send2Trash==1.7.1 six==1.15.0 sklearn==0.0 tabulate==0.8.9 tensorboard==2.5.0 tensorboard-data-server==0.6.1 tensorboard-plugin-wit==1.8.0 tensorflow==2.4.2 tensorflow-data-validation==0.30.0 tensorflow-estimator==2.4.0 tensorflow-hub==0.9.0 tensorflow-metadata==0.30.0 tensorflow-model-analysis==0.30.0 tensorflow-serving-api==2.4.1 tensorflow-transform==0.30.0 termcolor==1.1.0 terminado==0.10.1 terminaltables==3.1.0 testpath==0.5.0 tfx==0.30.0 tfx-bsl==0.30.0 threadpoolctl==2.1.0 tornado==6.1 tqdm==4.61.1 traitlets==5.0.5 typing-extensions==3.7.4.3 uritemplate==3.0.1 urllib3==1.26.5 wcwidth==0.2.5 webencodings==0.5.1 websocket-client==1.1.0 Werkzeug==2.0.1 widgetsnbextension==3.5.1 wrapt==1.12.1

Describe the current behavior

I have been stepping through the TFX guide: https://www.tensorflow.org/tfx/guide/build_local_pipeline

Per the guide, I run this command: tfx run create --pipeline_name pipelinename

An exception is then thrown:

ERROR:absl:Failed to make stateful working dir: .\tfx_pipeline_output\asx_pipeline\CsvExampleGen.system\stateful_working_dir\2021-07-01T16:10:04.035708
Traceback (most recent call last):
  File "C:\Users\dunca\AppData\Local\Programs\Python\Python38\lib\site-packages\tfx\orchestration\portable\outputs_utils.py", line 211, in get_stateful_working_directory
    fileio.makedirs(stateful_working_dir)
  File "C:\Users\dunca\AppData\Local\Programs\Python\Python38\lib\site-packages\tfx\dsl\io\fileio.py", line 83, in makedirs
    _get_filesystem(path).makedirs(path)
  File "C:\Users\dunca\AppData\Local\Programs\Python\Python38\lib\site-packages\tfx\dsl\io\plugins\tensorflow_gfile.py", line 76, in makedirs
    tf.io.gfile.makedirs(path)
  File "C:\Users\dunca\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 483, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Failed to create a directory: .\tfx_pipeline_output\asx_pipeline\CsvExampleGen.system\stateful_working_dir/2021-07-01T16:10:04.035708; Invalid argument

The same error is raised if you step into this code:

local_runner.py:

def run():
  """Define a pipeline."""

  local_dag_runner.LocalDagRunner().run(
      pipeline.create_pipeline(
          pipeline_name=configs.PIPELINE_NAME,
          pipeline_root=PIPELINE_ROOT,
          data_path=DATA_PATH,
          # NOTE: Use query instead of data_path to use BigQueryExampleGen.
          # query=configs.BIG_QUERY_QUERY,
          preprocessing_fn=configs.PREPROCESSING_FN,
          run_fn=configs.RUN_FN,
          train_args=trainer_pb2.TrainArgs(num_steps=configs.TRAIN_NUM_STEPS),
          eval_args=trainer_pb2.EvalArgs(num_steps=configs.EVAL_NUM_STEPS),
          eval_accuracy_threshold=configs.EVAL_ACCURACY_THRESHOLD,
          serving_model_dir=SERVING_MODEL_DIR,
          # NOTE: Provide GCP configs to use BigQuery with Beam DirectRunner.
          # beam_pipeline_args=configs.
          # BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS,
          metadata_connection_config=metadata.sqlite_metadata_connection_config(
              METADATA_PATH)))

It looks like the directory it's attempting to create has a mixture of forward slashes and backslashes. This feels like a Unix versus Windows path issue.
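The observation above can be checked without TFX or TensorFlow installed. The following is a minimal sketch, not a fix: it uses only Python's standard `ntpath` module (which applies Windows path rules on any host OS) to inspect the path string copied from the error message. The helper names `separators_used` and `reserved_chars_in_last_component` are hypothetical and exist only for this illustration. The reserved-character set reflects Windows file naming rules; whether the mixed separators, the colons in the timestamp, or something else actually triggers the `InvalidArgumentError` is not established in this thread.

```python
import ntpath

# Path copied from the error message in this report (note the single "/").
FAILING_PATH = (
    r".\tfx_pipeline_output\asx_pipeline\CsvExampleGen.system"
    r"\stateful_working_dir/2021-07-01T16:10:04.035708"
)

# Characters that Windows does not allow inside a file or directory name.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*')


def separators_used(path):
    """Return the set of path separators that appear in `path`."""
    return {ch for ch in path if ch in "/\\"}


def reserved_chars_in_last_component(path):
    """Return the Windows-reserved characters in the final path component.

    The path is normalized with ntpath first, so both "/" and "\\" are
    treated as separators, as they would be on Windows.
    """
    last = ntpath.basename(ntpath.normpath(path))
    return {ch for ch in last if ch in WINDOWS_RESERVED_CHARS}


if __name__ == "__main__":
    print("separators found:", sorted(separators_used(FAILING_PATH)))
    print("normalized path: ", ntpath.normpath(FAILING_PATH))
    print("reserved chars:  ", sorted(reserved_chars_in_last_component(FAILING_PATH)))
```

Running this shows that the path mixes both separators and that the timestamp directory name contains a character Windows reserves, either of which may be worth ruling out before assuming the separators alone are at fault.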

Other info / logs

Sorry if this is a substandard bug report ... it's my first.

arghyaganguly commented 3 years ago

Hi @DuncBegg, this is due to the difference in glob/regex patterns between Linux and Windows, as mentioned in the link. TFX currently supports Linux and macOS by default.

DuncBegg commented 3 years ago

Thanks @arghyaganguly. That's unfortunate about Windows support. I didn't see that explicitly mentioned in the documentation. I suppose I'll look into setting up a Linux Docker container and see if I can get that to work.

arghyaganguly commented 3 years ago

@DuncBegg, please confirm if this can be closed. Thanks.

DuncBegg commented 3 years ago

Sure, please close.
