microsoft / PlanetaryComputerDataCatalog

Data catalog for the Microsoft Planetary Computer
https://planetarycomputer.microsoft.com
MIT License

Would it be possible to change the Azure Blob Storage API version? #479

Open rachtsingh opened 2 hours ago

rachtsingh commented 2 hours ago

Hello,

I'm accessing the ECMWF real-time open data, hosted for example at this URL: https://ai4edataeuwest.blob.core.windows.net/ecmwf/20240201/06z/0p25/scda/20240201060000-6h-scda-fc.grib2. However, the response to a HEAD request for that URL suggests that the server doesn't accept HTTP range requests:

{
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Expose-Headers": "x-ms-request-id,Server,x-ms-version,Content-Type,Last-Modified,ETag,Content-MD5,x-ms-lease-status,x-ms-blob-type,Content-Length,Date,Transfer-Encoding",
    "Content-Length": "63162417",
    "Content-MD5": "qqfa7u8ezDERRAnR1WD1Zw==",
    "Content-Type": "application/octet-stream",
    "Date": "Tue, 24 Sep 2024 21:19:36 GMT",
    "ETag": "0x8DC2323E4CAFF28",
    "Last-Modified": "Thu, 01 Feb 2024 12:47:04 GMT",
    "Server": "Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0",
    "x-ms-blob-type": "BlockBlob",
    "x-ms-lease-status": "unlocked",
    "x-ms-request-id": "0cb1b60d-801e-0017-6bc7-0e5979000000",
    "x-ms-version": "2009-09-19"
}

(I would expect it to include Accept-Ranges: bytes.) As mentioned on the Planetary Computer page, the index files are published so that users can read a subset of the data, and HTTP range requests are the easiest way to do that.
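For illustration, here is a minimal sketch of how an index file could be turned into byte ranges. It assumes that each line of the index is a JSON object with `_offset` and `_length` fields plus a `param` key; that layout is an assumption here, not something stated in this issue, and `ranges_from_index` and the sample content are hypothetical.

```python
import json


def ranges_from_index(index_text, param):
    """Return HTTP Range header values for index entries matching `param`.

    Assumes one JSON object per line, each carrying `_offset` and `_length`
    (an assumed index layout, used here only for illustration).
    """
    ranges = []
    for line in index_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if entry.get("param") != param:
            continue
        start = entry["_offset"]
        end = start + entry["_length"] - 1  # the end of a byte range is inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


# Made-up index content for illustration only; not real ECMWF data.
SAMPLE_INDEX = """
{"param": "2t", "step": "6", "_offset": 0, "_length": 1000}
{"param": "msl", "step": "6", "_offset": 1000, "_length": 500}
"""
```

Each returned string can be sent as the value of a `Range` request header, so only the selected messages are downloaded rather than the whole file.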

I think the issue is "x-ms-version", which indicates that this is using a pretty old Blob Storage API version. Based on this GitHub issue, I think fixing it should be pretty quick, though I don't know what the costs of changing it would be (my uninformed guess is that they would be low, since Azure recommends using the latest version).

In the meantime I'll download the full .grib files and then subset them locally, but I wanted to flag this since working range requests would save everyone a lot of bandwidth.

rachtsingh commented 2 hours ago

Actually, the server gave me the correct range when I made a GET request later, so it looks like range requests are supported. Feel free to close if Accept-Ranges: bytes isn't meant to be returned.
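For anyone who wants to check this themselves, here is a minimal sketch of testing range support with a GET request instead of relying on the Accept-Ranges header in the HEAD response. It uses only the Python standard library; `range_header` and `fetch_range` are hypothetical helper names, and treating any status other than 206 as "range ignored" is a simplifying assumption.

```python
import urllib.request


def range_header(offset, length):
    """Build a Range header covering `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # The end position of a byte range is inclusive.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_range(url, offset, length, timeout=30):
    """GET part of a resource; raise if the server does not return 206."""
    request = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 Partial Content means the range was honored. A plain 200 means
        # the server ignored the Range header and is sending the whole body.
        if response.status != 206:
            raise RuntimeError(f"range not honored, got HTTP {response.status}")
        return response.read()
```

Calling `fetch_range(url, 0, 4)` with the blob URL from the first comment should return just the first four bytes if the range is honored, which is a cheap way to confirm support before downloading a 63 MB file.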