Closed whitehawk closed 2 days ago
Allure report https://allure.adsw.io/launch/83744
Allure report https://allure.adsw.io/launch/83840
according to the CURL documentation, if data to POST is provided using the CURLOPT_READFUNCTION and CURLOPT_READDATA options (which is our case), "you must set the size of the data with the CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE options".
The documentation doesn't quite have the same wording:
You can set the total size of the data you are sending by using CURLOPT_INFILESIZE_LARGE or CURLOPT_POSTFIELDSIZE_LARGE, depending on the type of transfer. For some transfer types it may be required and it allows for better error checking.
according to the CURL documentation, if data to POST is provided using the CURLOPT_READFUNCTION and CURLOPT_READDATA options (which is our case), "you must set the size of the data with the CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE options".
The documentation doesn't quite have the same wording:
You can set the total size of the data you are sending by using CURLOPT_INFILESIZE_LARGE or CURLOPT_POSTFIELDSIZE_LARGE, depending on the type of transfer. For some transfer types it may be required and it allows for better error checking.
Refer to CURLOPT_POST:
"When providing data with a callback, you must transmit it using chunked transfer-encoding or you must set the size of the data with the CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE options."
That is our case, in the S3RESTfulService::post()
, CURLOPT_POST is set to 1.
Attempt to insert data into a writable external table with data file located on S3 storage ended with warning "InvalidRequest : You must specify at least one part". Though GPDB reported that data was inserted, no file had been created in the S3 bucket. So, actually the insert operation failed.
Why not also make it so that in such cases an error is issued instead of a warning?
Attempt to insert data into a writable external table with data file located on S3 storage ended with warning "InvalidRequest : You must specify at least one part". Though GPDB reported that data was inserted, no file had been created in the S3 bucket. So, actually the insert operation failed.
Why not also make it so that in such cases an error is issued instead of a warning?
It would be a good update, but I think it is better to do it separately. Let's open a new ticket for it and fix with independent commit.
Attempt to insert data into a writable external table with data file located on S3 storage ended with warning "InvalidRequest : You must specify at least one part". Though GPDB reported that data was inserted, no file had been created in the S3 bucket. So, actually the insert operation failed.
Why not also make it so that in such cases an error is issued instead of a warning?
It would be a good update, but I think it is better to do it separately. Let's open a new ticket for it and fix with independent commit.
Could you open new ticket for this?
Attempt to insert data into a writable external table with data file located on S3 storage ended with warning "InvalidRequest : You must specify at least one part". Though GPDB reported that data was inserted, no file had been created in the S3 bucket. So, actually the insert operation failed.
Why not also make it so that in such cases an error is issued instead of a warning?
It would be a good update, but I think it is better to do it separately. Let's open a new ticket for it and fix with independent commit.
Could you open new ticket for this?
ADBDEV-6562
Was this always bugged and no one ever bothered to check or curl stopped accepting
..INFILESIZE
for POST at some point? It's weird that this line stayed there for 8 years.
Hard to say now. I see that this behavior was documented in the CURL at least from 2014 (since the commit). I may guess that the S3 storage, used for testing at the time this code was presented, was tolerable for the missing XML body (as the header aleady contains the upload ID, thus the S3 storage theoretically has enough information to complete the upload even without the payload, though it violates the protocol spec).
Allure report https://allure.adsw.io/launch/84109
Fix missing XML body in S3 CompleteMultipartUpload command
Problem: Attempt to insert data into a writable external table with data file located on S3 storage ended with warning "InvalidRequest : You must specify at least one part". Though GPDB reported that data was inserted, no file had been created in the S3 bucket. So, actually the insert operation failed.
Cause: Data insertion into S3 external table is done via S3 Multipart upload. This approach requires the following commands:
CompleteMultipartUpload is a POST method request, that should have XML payload with a list of all parts. But the http analysis showed that actual POST request for CompleteMultipartUpload didn't contain the payload. As a consequence, the S3 storage replied with an error.
The reason for missing payload is the following: according to the CURL documentation, if data to POST is provided using the CURLOPT_READFUNCTION and CURLOPT_READDATA options (which is our case), "you must set the size of the data with the CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE options". While in the 'S3RESTfulService::post()' the size of the data was set via CURLOPT_INFILESIZE_LARGE CURL option.
Fix: CURL option CURLOPT_INFILESIZE_LARGE is replaced with CURLOPT_POSTFIELDSIZE_LARGE.
Tests are not presented, as there is no existing test infrastructure to make an auto test with S3 storage.