Closed · MatthewCaseres closed this 1 year ago
@ydaiming please take a look
This SSL CA cert error
ValueError: curlCode: 77, Problem with the SSL CA cert (path? access rights?)
is very likely to be resolvable with the following command to provide the correct certificate at the desired directory:
mkdir -p /etc/pki/tls/certs && cp /etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundle.crt
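For notebook environments where running shell commands with the right permissions is awkward, the same fix can be sketched in Python. This is only a translation of the shell one-liner above; the source and destination paths come from that command, and writing to /etc usually requires root:

```python
import os
import shutil

def install_ca_bundle(src="/etc/ssl/certs/ca-certificates.crt",
                      dst="/etc/pki/tls/certs/ca-bundle.crt"):
    """Copy the system CA bundle to the path the AWS SDK expects.

    Defaults mirror the shell one-liner above (Debian/Ubuntu source,
    RedHat-style target); pass other paths if your layout differs.
    """
    os.makedirs(os.path.dirname(dst), exist_ok=True)  # mkdir -p
    shutil.copy(src, dst)                             # cp src dst
    return dst
```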
That resolves the SSL issue, thanks!
@ydaiming
Thank you for providing the solution for users.
I'm curious why boto3 would work without a certificate. It would be good in terms of UX if we could also provide some workaround for users, without requiring them to explicitly create the certificate at the desired directory.
My experience so far is that the PyTorch Deep Learning AMI has friction when using torchdata; it may be worth making sure this EC2 setup works out of the box. I use EC2 instances as a development environment.
It's really a deficiency of torchdata. It seems /etc/pki/tls/certs/ca-bundle.crt is the default location on RedHat, but on Debian/Ubuntu it's /etc/ssl/certs/ca-certificates.crt. As I understand it, S3Handler.cpp should really configure the AWS client properly depending on the system it's running on; see the related discussion at https://github.com/aws/aws-sdk-cpp/issues/1863.
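Until that lands in the SDK integration, a small probe over the distro-specific default locations mentioned above can pick whichever bundle exists on the current system. A sketch only; the candidate list is taken from this thread and is not exhaustive:

```python
import os

# CA bundle locations from this thread (not an exhaustive list):
DEFAULT_CANDIDATES = [
    "/etc/pki/tls/certs/ca-bundle.crt",    # RedHat/CentOS default
    "/etc/ssl/certs/ca-certificates.crt",  # Debian/Ubuntu default
]

def find_ca_bundle(candidates=DEFAULT_CANDIDATES):
    """Return the first CA bundle path that exists, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```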
Shouldn't this be reopened, @ydaiming?
@MatthewCaseres @diggerk
Would this tutorial help you, as we do have fsspec as an alternative path to load from S3?
https://pytorch.org/data/beta/tutorial.html#working-with-cloud-storage-providers
Providing the certificate in my notebook solved the 'ValueError: curlCode: 77' error I was having, but now I am having this Access Denied error seen below...
ValueError Traceback (most recent call last)
/tmp/ipykernel_28/311665444.py in <module>
----> 1 list(testdata_pipes)
/opt/conda/lib/python3.7/site-packages/torch/utils/data/datapipes/_hook_iterator.py in wrap_generator(*args, **kwargs)
171 response = gen.send(None)
172 else:
--> 173 response = gen.send(None)
174
175 while True:
/opt/conda/lib/python3.7/site-packages/torchdata/datapipes/iter/load/s3io.py in __iter__(self)
61 for prefix in self.source_datapipe:
62 while True:
---> 63 urls = self.handler.list_files(prefix)
64 yield from urls
65 if not urls:
ValueError: Access Denied
This exception is thrown by __iter__ of S3FileListerIterDataPipe()
I am using a Kaggle notebook and trying to obtain images from an S3 bucket (s3://drivendata-competition-biomassters-public-us).
I get the Kaggle notebook running by importing my packages and running the following commands:
>> !unzip awscliv2.zip
>> !./aws/install
Then configure my access keys by doing the following:
>> !aws configure set aws_access_key_id 'XXXXXXXXXXXXXXXXXXXX'
>> !aws configure set aws_secret_access_key 'XXXXXXXXXXXXXXXXXX'
>> !aws s3 ls s3://drivendata-competition-biomassters-public-us --no-sign-request
By doing all this I am able to see the folders and files in the S3 bucket; however, I cannot use the S3FileLister when I run the code below.
>> s3_prefixes = IterableWrapper(['s3://drivendata-competition-biomassters-public-us/train_agbm/0003d2eb_agbm.tif',
's3://drivendata-competition-biomassters-public-us/train_agbm/000aa810_agbm.tif',
's3://drivendata-competition-biomassters-public-us/train_agbm/000d7e33_agbm.tif'])
>> dp_s3_urls = S3FileLister(s3_prefixes)
>> list(dp_s3_urls)
Any help would be appreciated, thanks!
@Alyeko You might need to provide a region to S3FileLister. You can find your region under .aws/credentials
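For reference, the AWS CLI stores its settings in INI format, and `aws configure set region …` typically writes the region to `~/.aws/config` (while the keys live in `~/.aws/credentials`). A minimal sketch for reading the region back in Python, assuming the default profile:

```python
import configparser
import os

def read_aws_region(config_path=os.path.expanduser("~/.aws/config"),
                    profile="default"):
    """Read `region` from an AWS CLI config file (INI format).

    Returns None if the file, section, or key is missing. Sketch only:
    non-default profiles in ~/.aws/config use a "[profile NAME]" header,
    which this does not handle.
    """
    parser = configparser.ConfigParser()
    parser.read(config_path)  # silently yields no sections if missing
    if parser.has_section(profile) and parser.has_option(profile, "region"):
        return parser.get(profile, "region")
    return None
```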
@ejguan, thanks, I have provided my region but it does not work.
Also, I realize that when I run !aws configure list in my notebook, my profile value is <not set>. I try to configure my profile by running !aws configure set profile MY-IAM-PROFILE-NAME, but it does not work.
I also obtained my credentials.csv, but it does not have the User Name column, so I manually added my IAM-USERNAME and used !aws configure import --csv '/path-to-credentials.csv', but that also does not work and my profile value still shows <not set>.
Any solution? Thanks!
@Alyeko In that case, we may have to consult the AWS team. cc: @ydaiming
BTW, what would the result be if you used FSSpecFileLister? You might need to install s3fs and fsspec to access the S3 bucket.
I need to narrow down whether this is a problem with the AWS SDK or with the configuration.
@Alyeko
I'm sorry to hear about the difficulty.
Just to clarify, you can e.g. !export AWS_REGION=us-west-2 to set the region, or provide the same value through the Python function, as described here.
"Access denied" sounds severe, but it means the AWS S3 service was reached. The issue is most likely due to credential configuration, as you're tracking, and most likely due to a wrong region configuration. I see that you've used the --no-sign-request argument, which implies a public bucket? In that sense, any proper credential configuration should work without issues. I personally haven't encountered your case, and I'm sorry for not being more helpful here.
Edit: Could you let me know which region this dataset is in? I may try it on my own.
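One notebook gotcha worth noting about the `!export AWS_REGION=...` suggestion: each `!` command in Jupyter/Kaggle runs in its own throwaway subshell, so the export does not change the kernel's environment. Setting the variable in-process is more reliable (the region value here is just an example):

```python
import os

# `!export AWS_REGION=us-west-2` in a notebook only affects a throwaway
# subshell; set the variable in the kernel process instead, so that
# libraries (and any child processes they spawn) can see it:
os.environ["AWS_REGION"] = "us-west-2"  # example region

print(os.environ["AWS_REGION"])  # → us-west-2
```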
FYI, I had the same issue within SageMaker Studio. To solve it, I had to run the same command within the Studio notebook itself.
Thanks! Yes, it is a public bucket, in the US East AWS region. If I do not add --no-sign-request, I get an error (An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied).
Extra information:
In my AWS management console, I have created one user group which has one user. I created an access key and secret access key for the user, and it is these keys that I use to configure my notebook to access the data. My group has the AmazonS3FullAccess policy attached.
Thanks, but FSSpecFileLister did not work for me :( as I got the error:
Unable to locate credentials. This exception is thrown by __iter__ of FSSpecFileListerIterDataPipe()
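A rough diagnostic for "Unable to locate credentials" is to check the standard places most AWS tooling (boto3, s3fs, the AWS CLI) looks first. This is a sketch that covers only environment variables and the shared credentials file, not the full lookup chain (instance profiles, SSO, etc.):

```python
import os

def credential_sources():
    """Report which common AWS credential sources are present.

    Checks the two most common sources: the AWS_ACCESS_KEY_ID /
    AWS_SECRET_ACCESS_KEY environment variables, and the shared
    credentials file (~/.aws/credentials unless overridden by
    AWS_SHARED_CREDENTIALS_FILE). Diagnostic sketch only.
    """
    found = {}
    found["env"] = bool(os.environ.get("AWS_ACCESS_KEY_ID")
                        and os.environ.get("AWS_SECRET_ACCESS_KEY"))
    cred_file = os.environ.get("AWS_SHARED_CREDENTIALS_FILE",
                               os.path.expanduser("~/.aws/credentials"))
    found["credentials_file"] = os.path.isfile(cred_file)
    return found
```

Separately, for a public bucket, s3fs supports anonymous access (`s3fs.S3FileSystem(anon=True)`), which may sidestep credential lookup entirely.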
🐛 Describe the bug
The code that I am running is -
The full readout that I am seeing is here -
I can successfully run the following code -
Versions
Unsure if relevant, but I am on an EC2 instance with the Deep Learning AMI.