Closed: jreadey closed this 1 month ago
One idea is to have the DN make two async requests on a rangeget: one directly to S3 for the exact number of bytes needed, and one to the rg proxy, which will read pagesize bytes from S3 (if the data is not already present). If the DN->S3 request returns first, use that and cancel the rg request. If the rg request returns first (likely because the data was found in the cache), cancel the DN->S3 request. The idea is to avoid having the DN wait while the rg proxy reads more bytes from S3 than are needed.
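A minimal sketch of the race described above, assuming the DN is built on asyncio. The names `read_s3_direct`, `read_via_rg_proxy`, and `race_rangeget` are hypothetical, and the two reads are simulated with sleeps rather than real S3 or proxy calls; this only illustrates the "first response wins, cancel the other" pattern, not actual DN code.

```python
import asyncio


async def read_s3_direct(offset: int, length: int, delay: float):
    # Hypothetical stand-in for a DN->S3 read of exactly `length` bytes.
    await asyncio.sleep(delay)
    return ("s3", bytes(length))


async def read_via_rg_proxy(offset: int, length: int, delay: float):
    # Hypothetical stand-in for a read through the rg proxy, which may
    # fetch a whole page from S3 or answer quickly from its cache.
    await asyncio.sleep(delay)
    return ("rg", bytes(length))


async def race_rangeget(offset: int, length: int, s3_delay: float, rg_delay: float):
    # Start both requests concurrently.
    tasks = {
        asyncio.create_task(read_s3_direct(offset, length, s3_delay)),
        asyncio.create_task(read_via_rg_proxy(offset, length, rg_delay)),
    }
    # Wait until whichever request finishes first.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Cancel the slower request and wait for the cancellation to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # Note: if the first finisher failed, result() re-raises its exception;
    # a real implementation would probably fall back to the other request.
    return done.pop().result()


if __name__ == "__main__":
    source, data = asyncio.run(race_rangeget(0, 100, s3_delay=0.2, rg_delay=0.01))
    print(source, len(data))
```

The delays are passed in only so that either side can be made to win in a test; in practice the winner would depend on whether the proxy already has the page cached.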
The Rangeget Proxy has been removed, and performance improvements for rangegets have been directed towards Intelligent Range Gets (https://github.com/HDFGroup/nasa_cloud/milestone/1) and hyper chunking.
Rangeget proxy seems not to make much difference in the benchmark even though it was designed with datasets such as IceSat-2 in mind. Investigate ways to improve performance.