ruithnadsteud / yt_georaster

A package for handling geotiff files and georeferenced datasets within yt.
1 stars 2 forks source link

Polygon and dataset CRS objects kwarg addition #50

Closed deastwoo closed 2 years ago

deastwoo commented 3 years ago

I've made changes to data structures and the polygon selection to enable the ability to provide a CRS object for polygon data selection and also a working CRS for a dataset. I think it would be good to have more of a discussion about these, particularly the latter.

For the polygon data container the CRS key word argument gives the user the ability to provide the CRS of the input polygon (or list of polygons), like this: ds.poly(filename, crs=src_crs) where src_crs can be a rasterio CRS object, a string (e.g. 'epsg:3857'), the EPSG id (e.g. 3857) or a "proj4-style" dict (basically any form that rasterio would normally accept). The polygon is the reprojected to match the CRS of the dataset object (see #34). I also added the ability for the user to provide a polygon or multipolygon shapely object as this seemed so closely linked to what I was trying to do. I couldn't get my head round doing it without this option anyway. Having tested this quickly I believe this is now working properly but I need to add testing routines for it.

The data structures changes are all to do with adding the CRS key word argument to the dataset load (see #48). Now a user can provide a CRS to work in by doing ds = yt.load(filename, crs=my_crs) where my_crs can again basically be any form that rasterio would normally accept for a CRS. The data will then be reprojected at the usual stage (at selection when a container is constructed). I haven't fully tested this but I believe there will be outstanding problems. The problem is that to get it to work I have forced the reprojected data to have the same number of pixels in the x and y direction which introduces rectangular pixels. I think it would be better to maintain nice square pixels but I couldn't get this working. Let's discuss this further when we can.

I've also added the step so that yt now gets the units (when available) from the rasters CRS (#31). I realised the capability I use for this in rasterio only works for projected datasets, meaning this won't work for angular units. I suppose we need to think about what we can do with this.

deastwoo commented 3 years ago

Here's a short script I've used to test the changes:

from pathlib import Path
import fiona
from rasterio.crs import CRS
from shapely.geometry import Polygon

import yt
import yt.extensions.georaster

if __name__ == '__main__':
    # get path to images (directory contains landsat + sen2)
    path_to_images = Path("/Users/dan.eastwood/Downloads/Landsat-8_sample_L2")

    # select one tif and one jp2
    fns = sorted(path_to_images.glob("*.TIF"))  #+ \
        # sorted(path_to_images.glob("*.jp2"))

    # convert Path objects to strings
    fns = [str(f) for f in fns]
    # load dataset of the two files
    ds = yt.load(*fns, crs='epsg:3857')
    print(ds.parameters)

    filename = "/Users/dan.eastwood/Downloads/Resources_1/mabira_forest.shp"

    print("Polygon and crs as input:")
    with fiona.open(filename, "r") as shapefile:
        shapes_from_file = [feature["geometry"] for feature in shapefile]
        src_crs = CRS.from_dict(**shapefile.crs)  # shapefile crs

    polygons = [
        Polygon(shapes_from_file[i]["coordinates"][0])
        for i in range(len(shapes_from_file))
    ]
    poly = ds.polygon(polygons, crs=src_crs)

    print (poly["LC08_L2SP_171060_20210227_20210304_02_T1", "red"])
    print (poly["index", "area"].sum())

    print("shapefile input:")
    poly = ds.polygon(filename)

    print (poly["LC08_L2SP_171060_20210227_20210304_02_T1", "red"])
    print (poly["index", "area"].sum())

And the output:

yt : [INFO     ] 2021-07-12 14:36:25,588 Dataset CRS EPSG:3857 units are 'metre'. 
yt : [WARNING  ] 2021-07-12 14:36:25,662 x and y pixels have different sizes.
yt : [INFO     ] 2021-07-12 14:36:25,668 Parameters: domain_dimensions         = [7581 7741    1]
yt : [INFO     ] 2021-07-12 14:36:25,669 Parameters: domain_left_edge          = [3534960.49124483 -117256.38592324       0.        ] m
yt : [INFO     ] 2021-07-12 14:36:25,676 Parameters: domain_right_edge         = [3.76251113e+06 1.16648563e+05 1.00000000e+00] m
{'driver': 'GTiff', 'dtype': 'uint16', 'nodata': None, 'width': 7581, 'height': 7741, 'count': 1, 'crs': CRS.from_epsg(3857), 'transform': Affine(30.015913050540732, 0.0, 3534960.491244827,
       0.0, -30.216373752746005, 116648.56329676745), 'res': (30.015913050540732, 30.216373752746005), 'units': 'metre'}
Polygon and crs as input:
yt : [INFO     ] 2021-07-12 14:36:25,932 Number of features in poly object: 1
yt : [INFO     ] 2021-07-12 14:36:26,007 Resampling ('LC08_L2SP_171060_20210227_20210304_02_T1', 'L8_B4_30m'): 30.0 metre to 30.015913050540732 metre.
[ 8558.  8423.  8318. ... 10591. 10448. 10473.] dimensionless
333.72852753960626 km**2
shapefile input:
yt : [INFO     ] 2021-07-12 14:36:26,057 Number of features in poly object: 1
yt : [INFO     ] 2021-07-12 14:36:26,100 Resampling ('LC08_L2SP_171060_20210227_20210304_02_T1', 'L8_B4_30m'): 30.0 metre to 30.015913050540732 metre.
[ 8558.  8423.  8318. ... 10591. 10448. 10473.] dimensionless
333.72852753960626 km**2

The polygon selection seems to be working but note the warning from yt "x and y pixels have different sizes." and also the transform and res properties that are printed.

deastwoo commented 2 years ago

I am going to merge this branch into master despite that installation error for circleci as this is also failling on the master branch. This will be resolved separately.