Closed jjacobelli closed 1 year ago
To provide additional context...
data.rapids.ai serves the contents of the rapidsai-data S3 bucket via an AWS CloudFront distribution.
The benefits of using a CloudFront distribution are reduced latency and reduced costs.
Therefore, it's in everyone's best interest to start using the new data.rapids.ai URLs for downloading datasets.
At some point in the future, the S3 URLs will be disabled and datasets will only be retrievable from data.rapids.ai.
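For anyone migrating other references, here is a minimal sketch of rewriting an S3 URL to its data.rapids.ai equivalent. It assumes that object keys in the rapidsai-data bucket map one-to-one onto paths under https://data.rapids.ai, which this thread implies but does not state outright. The helper name `to_cdn_url` and the example key are hypothetical.

```python
from urllib.parse import urlsplit

CDN_BASE = "https://data.rapids.ai"  # CloudFront front end named in this PR
BUCKET = "rapidsai-data"             # S3 bucket named in this PR


def to_cdn_url(s3_url):
    """Rewrite an S3 URL for the rapidsai-data bucket to a data.rapids.ai URL.

    Assumption: the object key is preserved unchanged as the URL path.
    Handles s3://rapidsai-data/<key> and virtual-hosted-style
    https://rapidsai-data.s3[.<region>].amazonaws.com/<key> URLs.
    """
    parts = urlsplit(s3_url)
    host = parts.netloc
    key = parts.path.lstrip("/")

    is_s3_scheme = parts.scheme == "s3" and host == BUCKET
    is_virtual_hosted = (
        parts.scheme in ("http", "https")
        and host.startswith(BUCKET + ".s3")
        and host.endswith(".amazonaws.com")
    )
    if not (is_s3_scheme or is_virtual_hosted):
        raise ValueError(f"not a {BUCKET} S3 URL: {s3_url!r}")

    return f"{CDN_BASE}/{key}"


if __name__ == "__main__":
    # "example/dataset.csv" is a placeholder key, not a real dataset path.
    print(to_cdn_url("s3://rapidsai-data/example/dataset.csv"))
```

URLs that do not point at the rapidsai-data bucket raise `ValueError` rather than being passed through, so a stray third-party S3 URL is not silently rewritten.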
/merge
Since clx is scheduled to be deprecated soon, I will admin merge this PR despite the CI failures (which are unrelated to these changes).
I don't want this repository to be archived with the S3 URLs.
Update the RAPIDS datasets download URL to reduce latency and costs. This PR also replaces the usage of s3fs with requests to get RAPIDS datasets, as we are not using an S3 URL anymore.
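As an illustration of the s3fs-to-requests swap described above, here is a minimal sketch of fetching a dataset over plain HTTPS. It is not the code from this PR: the function name `download_dataset`, its parameters, and the example URL path are assumptions made for the example.

```python
from pathlib import Path

import requests


def download_dataset(url, dest, session=None, chunk_size=1 << 20, timeout=30):
    """Stream `url` to `dest` over HTTP(S) and return the destination Path.

    `session` may be a requests.Session (or any object with a compatible
    `get` method); when omitted, the module-level requests.get is used.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)

    http = session if session is not None else requests
    # stream=True avoids loading the whole file into memory at once.
    with http.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        with open(dest, "wb") as fh:
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    fh.write(chunk)
    return dest


if __name__ == "__main__":
    # Placeholder path: substitute a real dataset path under data.rapids.ai.
    download_dataset(
        "https://data.rapids.ai/path/to/dataset.csv",
        "data/dataset.csv",
    )
```

Because the data is served over HTTPS through CloudFront, no S3 filesystem layer is needed here, which is why a plain HTTP client is sufficient.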