Open scottyhq opened 3 years ago
@scottyhq I haven't run your notebook, but a couple of weeks ago we had a discussion in the cogeotiff Slack about the difference between s3 and https URLs. We found that using the S3 URL saves one request (GetFileSize):
$ tilebench profile s3://rio-tiler-dev/data/eu_webAligned_256pxWEBP.tif --tile 5-10-9 | jq
{
  "LIST": {
    "count": 0
  },
  "HEAD": {
    "count": 0
  },
  "GET": {
    "count": 2,
    "bytes": 32768,
    "ranges": [
      "0-16383",
      "229376-245759"
    ]
  },
  "Timing": 0.4968528747558594
}
$ tilebench profile https://rio-tiler-dev.s3.amazonaws.com/data/eu_webAligned_256pxWEBP.tif --tile 5-10-9 | jq
{
  "LIST": {
    "count": 0
  },
  "HEAD": {
    "count": 1
  },
  "GET": {
    "count": 2,
    "bytes": 32768,
    "ranges": [
      "0-16383",
      "229376-245759"
    ]
  },
  "Timing": 0.4858889579772949
}
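To make the difference between the two runs explicit, here is a small sketch that compares the two profiles. The dictionaries are copied from the output above; the helper names `request_count_diff` and `bytes_from_ranges` are made up for this illustration and are not part of tilebench.

```python
# Profiles copied from the two tilebench runs above.
s3_profile = {
    "LIST": {"count": 0},
    "HEAD": {"count": 0},
    "GET": {"count": 2, "bytes": 32768, "ranges": ["0-16383", "229376-245759"]},
    "Timing": 0.4968528747558594,
}
https_profile = {
    "LIST": {"count": 0},
    "HEAD": {"count": 1},
    "GET": {"count": 2, "bytes": 32768, "ranges": ["0-16383", "229376-245759"]},
    "Timing": 0.4858889579772949,
}


def request_count_diff(a, b):
    """Return {method: (count_a, count_b)} for methods whose counts differ."""
    diff = {}
    for method in ("LIST", "HEAD", "GET"):
        count_a, count_b = a[method]["count"], b[method]["count"]
        if count_a != count_b:
            diff[method] = (count_a, count_b)
    return diff


def bytes_from_ranges(ranges):
    """Sum the sizes of inclusive 'start-end' byte ranges."""
    total = 0
    for byte_range in ranges:
        start, end = (int(part) for part in byte_range.split("-"))
        total += end - start + 1  # both ends are inclusive
    return total


print(request_count_diff(s3_profile, https_profile))   # {'HEAD': (0, 1)}
print(bytes_from_ranges(s3_profile["GET"]["ranges"]))  # 32768
```

The only difference is the extra HEAD request on the https URL; both runs issue the same two 16 KiB range reads.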
Ahhhhh, so GetFileSize is used on HTTP sources; if you use s3, GDAL gets the file size from the response of the first GET request.
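For context on how a file size can come out of a GET response at all: a server answering a ranged GET with 206 Partial Content includes a `Content-Range` header of the form `bytes start-end/total`, so the total size is available without a separate HEAD. The sketch below only parses that header. It is an illustration of the HTTP mechanism, not GDAL's actual code, and the function name and the total size in the example are hypothetical.

```python
import re

# Content-Range for a byte range looks like "bytes 0-16383/1234567";
# the total may be "*" when the server does not know the full length.
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def size_from_content_range(header):
    """Return the total size from a Content-Range header, or None if unknown."""
    match = _CONTENT_RANGE.match(header.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range header: {header!r}")
    total = match.group(3)
    return None if total == "*" else int(total)


# Hypothetical header for the first range read (0-16383); the total is made up.
print(size_from_content_range("bytes 0-16383/1234567"))
```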
ref: https://cogeotiff.slack.com/archives/C01DE57GLHE/p1613141057016300
@rmg55 put together a nice notebook here looking at s3 versus http access: https://github.com/rmg55/CloudDAAC_Binders. It would be great to include more benchmarking information and suggestions in this repository.
Also wanted to link over to https://github.com/pangeo-data/pangeo-integration-tests/issues/1