Closed awerchniak closed 6 months ago
@awerchniak There is a new connector for S3: Amazon S3 Connector for PyTorch. Can you check it out and confirm whether it satisfies your use case?
I have the same requirement as the OP.
I'm using code that relies on a library, which in turn uses S3FileLoader (from torchdata.datapipes.iter import S3FileLoader).
I'm having the same issue. I checked out the Amazon S3 Connector, but I think its API is not the same as that of S3FileLoader, which means we would have to spend more time understanding the library code and making changes everywhere.
Can you please make the API compatible so that we can replace calls to S3FileLoader, or build torchdata with S3 support in the base containers?
I'm using the following container: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.0.1-gpu-py310
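To make the API gap concrete: S3FileLoader yields (url, stream) tuples, while the datasets in Amazon S3 Connector for PyTorch yield reader objects exposing bucket, key, and read(). A thin adapter along these lines might let downstream code that expects tuples keep working. This is only a sketch under those assumptions; the helper names as_url_stream_pairs and connector_pairs are hypothetical and not part of either library, and the adapter reads each object fully into memory rather than streaming it.

```python
import io
from typing import Any, Iterable, Iterator, Tuple


def as_url_stream_pairs(readers: Iterable[Any]) -> Iterator[Tuple[str, io.BytesIO]]:
    """Convert reader-like objects into (url, stream) tuples.

    Each item is assumed to expose `.bucket`, `.key` and `.read()`, as the
    connector's reader objects do. The whole object is buffered in memory,
    which is a simplification and may not suit very large files.
    """
    for reader in readers:
        url = f"s3://{reader.bucket}/{reader.key}"
        yield url, io.BytesIO(reader.read())


def connector_pairs(prefix_uri: str, region: str) -> Iterator[Tuple[str, io.BytesIO]]:
    """Hypothetical convenience wrapper around the connector's iterable dataset.

    Imported lazily so this module can be loaded without s3torchconnector.
    """
    from s3torchconnector import S3IterableDataset

    dataset = S3IterableDataset.from_prefix(prefix_uri, region=region)
    return as_url_stream_pairs(dataset)
```

With s3torchconnector installed and credentials configured, something like `for url, stream in connector_pairs("s3://my-bucket/my-prefix/", "us-east-1"): ...` (bucket and prefix are placeholders) would then stand in for iterating over an S3FileLoader datapipe. It does not reproduce datapipe features such as chaining or sharding.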
@rohit901 PyTorch 2.0.1 is now out of support. We recommend upgrading to a later version of the PyTorch containers. See available_images.md for more information.
As of Nov 2023, torchdata has paused development, and the latest PyTorch version it is compatible with is 2.1.1. We strongly recommend Amazon S3 Connector for PyTorch instead.
Concise Description: When attempting to launch a sagemaker.pytorch.estimator.PyTorch.fit job on the below-listed container that makes use of S3 IO datapipes, it fails immediately with:
The specific function we want to use is torchdata.datapipes.iter.S3FileLoader. The error occurs because the distribution of torchdata included in the image was not compiled with BUILD_S3=1. See full instructions here.
DLC image/dockerfile: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker
Is your feature request related to a problem? Please describe. Cannot use S3FileLoader with the PyTorch SageMaker base image.
Describe the solution you'd like: Could you please set BUILD_S3=1 when compiling torchdata, so that users can use this feature? Given that it's an AWS product offering, it's a good idea to encourage use of it for the SageMaker use case.
Describe alternatives you've considered: Users can manually uninstall and re-install torchdata, but this requires the user to understand how to optimize the install for the specific platform.
Additional context: N/A
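For anyone who wants to confirm whether a given image is affected before launching a job, a small probe like the one below could be run inside the container. It is a sketch, not an official check: the function name s3_datapipes_available is hypothetical, and it assumes that a torchdata build without BUILD_S3=1 raises an exception when S3FileLoader is imported or constructed, which matches the behavior described in this issue.

```python
from typing import Tuple


def s3_datapipes_available() -> Tuple[bool, str]:
    """Report whether S3FileLoader can be imported and constructed.

    Returns (ok, detail). No S3 request is made; an empty source datapipe
    is used only to exercise the constructor.
    """
    try:
        from torchdata.datapipes.iter import IterableWrapper, S3FileLoader
    except Exception as exc:  # torchdata missing or import-time failure
        return False, f"import failed: {exc!r}"
    try:
        S3FileLoader(IterableWrapper([]))
    except Exception as exc:  # e.g. torchdata built without S3 support
        return False, f"construction failed: {exc!r}"
    return True, "S3FileLoader constructed"


if __name__ == "__main__":
    ok, detail = s3_datapipes_available()
    print("S3 datapipes available:", ok)
    print(detail)
```

A False result with a construction failure would suggest that the installed torchdata lacks S3 support, in which case the options are a rebuild with BUILD_S3=1 or a move to a different S3 integration.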