denizsurmeli closed this 1 year ago
The aws-sdk-go version that we use has a bug: it logs the response headers regardless of the log level set in s5cmd. Here is the related issue; I will bump to the version that has the fix and check if anything breaks.
Minio panics during SELECT queries. Open to suggestions about testing.
panic: runtime error: index out of range [0] with length 0
goroutine 779 [running]:
github.com/minio/minio/internal/s3select/csv.NewReader.func1({0x5929480?, 0xc0093743f0?})
github.com/minio/minio/internal/s3select/csv/reader.go:306 +0x425
github.com/minio/minio/internal/s3select/csv.(*Reader).startReaders.func3()
github.com/minio/minio/internal/s3select/csv/reader.go:244 +0x19c
created by github.com/minio/minio/internal/s3select/csv.(*Reader).startReaders
github.com/minio/minio/internal/s3select/csv/reader.go:233 +0x705
Minio panics only on CSV queries where the delimiter is not specified. It is odd that it does not validate the delimiter in the request. JSON queries seemed fine.
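One way to avoid sending an empty delimiter at all is to normalize it on the client before building the request. The following is only a minimal sketch: `normalizeDelimiter` is a hypothetical helper, not a function from s5cmd or Minio, and the comma fallback and single-character check are assumptions based on how S3 Select describes `FieldDelimiter`.

```go
package main

import (
	"errors"
	"fmt"
)

// normalizeDelimiter is a hypothetical helper (not part of s5cmd or Minio).
// It returns a usable field delimiter for a CSV select request:
//   - an empty value falls back to a comma (assumed default),
//   - anything longer than one character is rejected.
func normalizeDelimiter(d string) (string, error) {
	if d == "" {
		return ",", nil
	}
	if len([]rune(d)) != 1 {
		return "", errors.New("delimiter must be a single character")
	}
	return d, nil
}

func main() {
	// Try an unset delimiter, two valid ones, and an invalid one.
	for _, in := range []string{"", ",", "\t", "||"} {
		d, err := normalizeDelimiter(in)
		fmt.Printf("input=%q delimiter=%q err=%v\n", in, d, err)
	}
}
```

With a guard like this, the request never carries an empty delimiter, so the server-side panic path should not be reached from this client, although the missing validation on the server remains.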
PR also fixes #357 @denizsurmeli
Refactored/fixed the comments.
I'd like to share my expectations while playing with this feature:
- Expected CSV output if I query CSV objects (same thing for TSV, or other formats).
./s5cmd select csv --delimiter "," -e 'select * from s3object' 's3://bucket/ibrahim/s5cmd-select/prices.csv'
{"_1":"id","_2":"name","_3":"price"}
{"_1":"1","_2":"avocado","_3":"3.99"}
{"_1":"2","_2":"banana","_3":"1.99"}
{"_1":"3","_2":"cabbage","_3":"0.99"}
- How do I query TSV files?
./s5cmd select csv --output-format csv --delimiter "\t" -e 'select * from s3object' 's3://bucket/ibrahim/s5cmd-select/prices.tsv'
ERROR "select csv --delimiter=\\t --query=select * from s3object --output-format=csv s3://bucket/ibrahim/s5cmd-select/prices.tsv": InvalidRequestParameter: The value of parameter FieldDelimiter is invalid. Please check the service documentation and try again. status code: 400, request id: ...
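The error above shows `--delimiter=\\t`, which suggests the two literal characters `\` and `t` reached the service instead of a tab. A possible way to handle this on the client is to interpret escape sequences in the flag value before using it as `FieldDelimiter`. This is a sketch under that assumption; `unescapeDelimiter` is a hypothetical helper and not necessarily how s5cmd handles it.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescapeDelimiter is a hypothetical helper (not taken from s5cmd).
// It turns an escaped flag value such as `\t` (backslash + 't') into the
// character it stands for. Single-character values are returned unchanged.
func unescapeDelimiter(raw string) (string, error) {
	if len([]rune(raw)) <= 1 {
		return raw, nil
	}
	// strconv.Unquote expects a quoted Go string literal, so wrap the raw
	// value in double quotes and let it interpret the escape sequence.
	s, err := strconv.Unquote(`"` + raw + `"`)
	if err != nil {
		return "", fmt.Errorf("invalid delimiter %q: %w", raw, err)
	}
	return s, nil
}

func main() {
	// `\t` here is a raw string: a backslash followed by the letter t,
	// which is what the shell passes for --delimiter "\t".
	for _, raw := range []string{`\t`, ",", `\\`, `\q`} {
		d, err := unescapeDelimiter(raw)
		fmt.Printf("raw=%q delimiter=%q err=%v\n", raw, d, err)
	}
}
```

Without any change to the tool, a shell that supports ANSI-C quoting (bash, for example) can pass a real tab with `--delimiter $'\t'`.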
For the first issue, you are right; I have implemented the feature. For the second question, I have added an example query to the help command. I have also extended the flag descriptions.
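The first expectation (output format follows the input format unless stated otherwise) could be sketched roughly as below. `resolveOutputFormat` is a hypothetical name used only for illustration; the actual implementation in the PR may differ.

```go
package main

import "fmt"

// resolveOutputFormat is a hypothetical helper illustrating the expectation
// discussed above: when --output-format is not given, the output format
// defaults to the input format of the select subcommand.
func resolveOutputFormat(inputFormat, outputFlag string) string {
	if outputFlag == "" {
		return inputFormat
	}
	return outputFlag
}

func main() {
	// No --output-format: CSV in, CSV out.
	fmt.Println(resolveOutputFormat("csv", ""))
	// Explicit --output-format json overrides the default.
	fmt.Println(resolveOutputFormat("csv", "json"))
}
```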
Applied the suggestions and refactored the code. @igungor
This PR extends the support for AWS S3's Select API.
Resolves #494, resolves #357.