Accessing Azure, Google Cloud Storage, Minio, and S3 Object Stores with PXF -> Reading CSV and Parquet Data from S3 Using S3 Select
1.
The first line in the document says "The PXF S3 connector supports reading certain CSV- and Parquet-format data from S3 using the Amazon S3 Select service".
In the above line, the word "CSV-" appears truncated (e.g., it should be "CSV-format").
2.
You can use the PXF S3 Connector with S3 Select to read:
gzip- or bzip2-compressed CSV files
Parquet files with gzip- or snappy-compressed columns
In the above lines, the word "gzip-" appears truncated (e.g., it should be "gzip-compressed").
3.
Specifying the CSV File Compression Type:
If the CSV file is gzip- or bzip2-compressed, use the COMPRESSION_CODEC custom option in the LOCATION URI to identify the compression codec alias.
In the above line, the word "gzip-" appears truncated (e.g., it should be "gzip-compressed").
4.
PXF supports column projection as well as predicate pushdown for AND, OR, and NOT operators when using S3 Select.
In the above line, there is an extra comma (,) after "OR"; it can be removed.
Ref: https://gpdb.docs.pivotal.io/pxf/6-2/using/read_s3_s3select.html