Closed lebinh closed 8 years ago
@lebinh will this break other parts, like Redshift and S3?
So far, no. All tests passed in the PyClient test run (2 cases failed because of a missing data file on HDFS for that instance). The unit tests for DDF passed on my machine but failed on Jenkins; not sure why yet.
Hm... this looks familiar. I think @Huandao0812 may know something about this; he encountered it once.
Description and related tickets, documents
Replace the `s3n` file system with `s3a`, as it is faster and has stabilized enough in Hadoop 2.7 (https://wiki.apache.org/hadoop/AmazonS3).
Performance comparison
Create DDF from S3 with PyClient
![image](https://cloud.githubusercontent.com/assets/234997/15957421/5c5ddd6a-2f19-11e6-8871-f35bafe3ccb3.png)
Create dataset from S3 in BigApps
![image](https://cloud.githubusercontent.com/assets/234997/15957419/54d22ad8-2f19-11e6-924f-17c13142af36.png)
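For reviewers less familiar with the two connectors, here is a minimal sketch of what moving a caller from `s3n` to `s3a` can involve: the URI scheme changes, and the Hadoop credential properties have different names. The helper names `to_s3a_uri` and `to_s3a_conf` are hypothetical and are not part of this PR; only the Hadoop property names are taken from the connectors themselves, and the bucket and key values are placeholders.

```python
# Hypothetical helpers for illustration only; not the code in this PR.

# Credential property names used by each Hadoop connector.
S3N_TO_S3A_KEYS = {
    "fs.s3n.awsAccessKeyId": "fs.s3a.access.key",
    "fs.s3n.awsSecretAccessKey": "fs.s3a.secret.key",
}


def to_s3a_uri(uri: str) -> str:
    """Rewrite an s3n:// URI to s3a://; leave any other URI unchanged."""
    prefix = "s3n://"
    if uri.startswith(prefix):
        return "s3a://" + uri[len(prefix):]
    return uri


def to_s3a_conf(conf: dict) -> dict:
    """Rename s3n credential properties to their s3a equivalents."""
    return {S3N_TO_S3A_KEYS.get(key, key): value for key, value in conf.items()}


if __name__ == "__main__":
    print(to_s3a_uri("s3n://my-bucket/data/file.csv"))
    print(to_s3a_conf({"fs.s3n.awsAccessKeyId": "PLACEHOLDER_ID"}))
```

Anything that does not use the `s3n` scheme or property names passes through untouched, so the helpers can be applied to mixed inputs such as HDFS paths.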
Related:
Reviewers:
Breaking changes & backward compatible issues
N/A, as `s3a` should be a drop-in replacement for `s3n`.
PR Progress
Make sure all checkboxes below are checked before merging.