podaac / data-subscriber

Subscribe and bulk download collections of data at PO.DAAC
Apache License 2.0

Harmony subsetting results in "413 request entity too large" in cases where many granule ids sent #164

Open skorper opened 6 months ago

skorper commented 6 months ago

When we send a subsetting request to Harmony with too many granule IDs, the request fails with "413 Request entity too large". The Harmony team suggested switching from the current GET request to a POST request, which moves the granule IDs into the form body and should allow us to send many of them. However, harmony-py does not currently support this, so we would need to construct the Harmony request manually in this case. A ticket, HARMONY-1721, was created to add support for this use case to harmony-py.
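For illustration, here is a rough sketch of what building the POST request by hand might look like. It is not the subscriber's actual code. The endpoint layout, the `granuleId` and `subset` field names, the placeholder collection ID, and the helper name `build_harmony_post` are all assumptions and should be checked against the Harmony API documentation. The sketch only builds the URL and form fields; it makes no network call.

```python
from urllib.parse import urlencode

# Assumed Harmony endpoint layout; verify against the Harmony API docs.
HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"
COVERAGES_PATH = "ogc-api-coverages/1.0.0/collections/all/coverage/rangeset"


def build_harmony_post(collection_id, granule_ids, bbox):
    """Return (url, form_fields) for a POST-based subsetting request.

    Hypothetical helper. bbox is (west, south, east, north) in degrees,
    the same order as the subscriber's -b option.
    """
    west, south, east, north = bbox
    url = f"{HARMONY_ROOT}/{collection_id}/{COVERAGES_PATH}"
    # Spatial subset expressed as two "subset" fields (assumed parameter names).
    fields = [
        ("subset", f"lat({south}:{north})"),
        ("subset", f"lon({west}:{east})"),
    ]
    # One form field per granule, so the IDs travel in the body, not the URL.
    fields += [("granuleId", gid) for gid in granule_ids]
    return url, fields


# Placeholder collection and granule concept IDs, for illustration only.
granules = [f"G{i:010d}-POCLOUD" for i in range(5000)]
url, fields = build_harmony_post("C0000000000-POCLOUD", granules, (120, -30, 160, 20))

# Length the same parameters would add to a GET URL as a query string.
query_length = len(url) + 1 + len(urlencode(fields))
```

With the `requests` library, the fields could then be sent with something like `requests.post(url, data=fields)`, with Earthdata Login authentication added. Whether Harmony expects a multipart or URL-encoded form body is something to confirm with the Harmony team.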

The issue can be reproduced with:

podaac-data-subscriber -c SWOT_L2_LR_SSH_BASIC_2.0 -d ./data --start-date 2023-05-01T00:00:00Z --end-date 2024-12-21T00:00:00Z --subset -b="120,-30,160,20"

Note: this only impacts the data subscriber and not the data downloader. When submitting Harmony requests with the data downloader, we don't submit granule IDs but rather forward the spatiotemporal bounds the user provided.