Open kbgg opened 2 weeks ago
It looks like the data proxy does not handle range requests correctly when the end of the range is omitted.
boto3 sends a request with the header Range: bytes=41943040-, which should return only the remaining bytes of the file, from offset 41943040 to the end. Instead, the data proxy returns the entire file.
This accounts for the size of the download: 41943040 + 43912147 = 85855187 bytes, which is exactly the size of the file downloaded through boto3.
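For reference, here is a minimal sketch of how a single byte range with an omitted end could be resolved against the object size, following the HTTP range semantics in RFC 9110. The name `resolve_byte_range` is hypothetical and is not taken from the data proxy's code; multi-range headers are deliberately not handled.

```python
import re
from typing import Optional, Tuple

# Matches "bytes=<first>-<last>" where either side may be empty.
_RANGE_RE = re.compile(r"bytes=(\d*)-(\d*)", re.ASCII)


def resolve_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets for a single-range header.

    Returns None when the range cannot be satisfied (HTTP 416).
    Raises ValueError when the header is malformed.
    """
    match = _RANGE_RE.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"unsupported Range header: {header!r}")
    first, last = match.groups()
    if first == "" and last == "":
        raise ValueError(f"unsupported Range header: {header!r}")

    if first == "":
        # Suffix range "bytes=-N": the last N bytes of the object.
        length = int(last)
        if length == 0 or size == 0:
            return None
        return max(size - length, 0), size - 1

    start = int(first)
    if start >= size:
        return None

    if last == "":
        # Open-ended range "bytes=N-": from N through the final byte.
        end = size - 1
    else:
        if int(last) < start:
            raise ValueError(f"invalid Range header: {header!r}")
        end = min(int(last), size - 1)
    return start, end
```

With the sizes from this report, `resolve_byte_range("bytes=41943040-", 43912147)` gives `(41943040, 43912146)`, so the expected response is a 206 carrying 1969107 bytes rather than the full 43912147.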
Description of Bug:
When downloading a large file through boto3, the completed file is unexpectedly large and corrupt.
Steps to Reproduce:
curl -O https://data.source.coop/kerner-lab/fields-of-the-world-austria/boundaries_austria_2021.parquet
import boto3

# Point the S3 client at the data proxy instead of AWS.
s3_client = boto3.client("s3", endpoint_url="https://data.source.coop")
with open("boundaries_austria_2021_boto3.parquet", "wb") as f:
    s3_client.download_fileobj("kerner-lab", "fields-of-the-world-austria/boundaries_austria_2021.parquet", f)
MD5 (boundaries_austria_2021_boto3.parquet) = 7249b300347b14f13d6652c98b266350
MD5 (boundaries_austria_2021.parquet) = e8f3dc1683acd316a0668d42802fa6a4
-rw-r--r-- 1 kevin staff 43912147 Nov 6 07:41 boundaries_austria_2021.parquet
-rw-r--r-- 1 kevin staff 85855187 Nov 6 07:43 boundaries_austria_2021_boto3.parquet
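The two file sizes are consistent with the whole object having been written at offset 41943040. A small sketch to check that hypothesis against the two downloaded files follows; the helper name `full_body_appended` is hypothetical, and it assumes the first 41943040 bytes of the boto3 download are correct.

```python
def full_body_appended(corrupt: bytes, original: bytes, offset: int) -> bool:
    """Check whether `corrupt` equals the first `offset` bytes of
    `original` followed by a complete copy of `original`."""
    return corrupt[:offset] == original[:offset] and corrupt[offset:] == original
```

To run it on the files from this report, read each one with `pathlib.Path(...).read_bytes()` and pass `offset=41943040`. Both files fit comfortably in memory at these sizes.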