Open knaaptime opened 4 years ago
I've been doing s3 stuff with rasterio and it handles an s3 path without the need for a context mananger, as long as it's a free bucket. rasterstats doesn't change the path at all so it seems to work fine without any adjustment
import rasterstats
from shapely.geometry import Polygon
s3_path = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
p=Polygon([(465764.5, 2428342.0), (465104.1, 2404237.9), (492840.3, 2402586.9)])
rasterstats.zonal_stats(p, s3_path)
Out[4]: [{'min': 10150.0, 'max': 13163.0, 'mean': 10490.125291650986, 'count': 372020}]
It can be wrapped in a special rasterio session for a requester pays bucket too. Like with this naip imagery.
from rasterio.session import AWSSession
import rasterio
import rasterstats
from shapely.geometry import Polygon
p=Polygon([(525794.7, 4876393.3), (525711.3, 4875371.7),(526910.1, 4875413.4)])
s3_path = 's3://naip-visualization/wy/2019/60cm/rgb/44104/m_4410459_se_13_060_20190828.tif'
session = AWSSession(requester_pays=True)
with rasterio.Env(session):
stats = rasterstats.zonal_stats(p, s3_path)
stats
[{'min': 59.0, 'max': 220.0, 'mean': 118.68374314241665, 'count': 1696115}]
rasterstats only reads with the rasterio window method, so as long as the s3 files are cloud optimized geotiffs then it will not download the full raster.
For anyone looking to do this in a mock S3 tool like minio, I'll save you a google search:
https://github.com/rasterio/rasterio/issues/1293#issuecomment-502651617
Just add the following variables to your rasterio.Env:
import rasterstats
import rasterio
from shapely.geometry import Polygon
s3_path = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
p = Polygon([(465764.5, 2428342.0), (465104.1, 2404237.9), (492840.3, 2402586.9)])
with rasterio.Env(AWS_HTTPS='NO', GDAL_DISABLE_READDIR_ON_OPEN='YES', AWS_VIRTUAL_HOSTING=False, AWS_S3_ENDPOINT='localhost:9000'):
stats = rasterstats.zonal_stats(p, s3_path)
Hi, thanks for this great package. I'm wondering if it's possible to pass an open raster (as opposed to simply the path) to zonalstats (without converting to
ndarray
first).My use case is that I'm reading a raster file from a public s3 bucket, which requires the following rasterio context:
which works great by itself. But now I need to calculate zonal stats and I cannot make
rs.zonal_stats
use the rasterio environment. Would it be possible to modify thers.zonal_stats
so that it accepts an openrasterio.DatasetReader
? (or allow passing the context)e.g.