A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
When reading TFRecords from S3 whose names contain a space, I encounter the error shown in the log output below. Replacing the space with %20 or +, or escaping it with a backslash (\), produces the same error.
Traceback (most recent call last):
File "/home/ubuntu/test.py", line 348, in <module>
train_loader = DALIGenericIterator(train_pipe, ["a", "b"],
File "/home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/dali/plugin/pytorch/__init__.py", line 224, in __init__
self._first_batch = DALIGenericIterator.__next__(self)
File "/home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/dali/plugin/pytorch/__init__.py", line 239, in __next__
outputs = self._get_outputs()
File "/home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/dali/plugin/base_iterator.py", line 385, in _get_outputs
outputs.append(p.share_outputs())
File "/home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/dali/pipeline.py", line 1160, in share_outputs
return self._pipe.ShareOutputs()
RuntimeError: Critical error in pipeline:
Error in CPU operator `nvidia.dali.fn.readers.tfrecord`,
which was used in the pipeline definition with the following traceback:
File "/home/ubuntu/test.py", line 42, in create_dali_pipeline
inputs = fn.readers.tfrecord(path=tfrecord_paths,
encountered:
s3://bucket/path/my data.tfrecord is not a valid URI: Invalid character found ( ) in path
Current pipeline object is no longer valid.
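For anyone trying to reproduce this, the four spellings mentioned in the description can be generated with the standard library. This is only an illustrative sketch, not the reporter's code: the helper name `s3_uri_variants` is hypothetical, and the bucket and key are the placeholder values from the error message above.

```python
from urllib.parse import quote, quote_plus


def s3_uri_variants(bucket: str, key: str) -> dict[str, str]:
    """Build the S3 URI spellings described in the report (hypothetical helper)."""
    # Backslash-escape each space, as one would in a shell.
    backslash_key = key.replace(" ", "\\ ")
    return {
        # Unmodified key, space left in place.
        "raw": f"s3://{bucket}/{key}",
        # Percent-encoded: space becomes %20, slashes are preserved.
        "percent": f"s3://{bucket}/{quote(key, safe='/')}",
        # Form-style encoding: space becomes +, slashes are preserved.
        "plus": f"s3://{bucket}/{quote_plus(key, safe='/')}",
        "backslash": f"s3://{bucket}/{backslash_key}",
    }


if __name__ == "__main__":
    # Placeholder bucket and key taken from the error message in the log.
    for name, uri in s3_uri_variants("bucket", "path/my data.tfrecord").items():
        print(f"{name:>9}: {uri}")
```

According to the report, passing any of these to `fn.readers.tfrecord(path=...)` results in the same error.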
Other/Misc.
No response
Check for duplicates
[X] I have searched the open bugs/issues and have found no duplicates for this bug report
Version
1.38
Describe the bug.
When reading TFRecords from S3 whose names contain a space, I encounter the error shown in the log output below. Replacing the space with %20 or +, or escaping it with a backslash (\), produces the same error.
Minimum reproducible example
Relevant log output