weecology / DeepForest

Python Package for Airborne RGB machine learning
https://deepforest.readthedocs.io/
MIT License
490 stars 172 forks source link

CRS not matching in utilities.shapefile_to_annotations #654

Closed yby026 closed 2 months ago

yby026 commented 6 months ago

Hi, when I try use shapefile_to_annotations to convert my labels from a shapefile, an error raises as the crs does not match between the image and shapefile. It seems the crs of the shapefile containing an unknown vertical crs but not in the image, and the vertical crs cannot be removed from shapefile.

Codes:

ann = utilities.shapefile_to_annotations(shapefile = annotation + "bounding_box.shp",
    rgb = img)

ann.shape

The error message is here:

ValueError Traceback (most recent call last)

in () ----> 1 ann = utilities.shapefile_to_annotations(shapefile = annotation +"bounding_box.shp", 2 rgb = img) 3 4 ann.shape /usr/local/lib/python3.10/dist-packages/deepforest/utilities.py in shapefile_to_annotations(shapefile, rgb, buffer_size, geometry_type, savedir) 298 # Check matching the crs 299 if not gdf.crs.to_string() == raster_crs.to_string(): --> 300 raise ValueError("The shapefile crs {} does not match the image crs {}".format( 301 gdf.crs, src.crs)) 302 ValueError: The shapefile crs COMPD_CS["NZGD2000 / New Zealand Transverse Mercator 2000 + unknown",PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]],VERT_CS["unknown",VERT_DATUM["unknown",2005],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP]]] does not match the image crs PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4167"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","2193"]]

The shapefile was created in arcgis pro 3.1 based on the projection of image, and both shapefile and image has been reprojected use each other's coordination system before training.

Codes are running using google colab.

Not sure what would be the problem here.

ethanwhite commented 5 months ago

This error suggests that DeepForest thinks the CRS's don't match, so let's get a better look at how they are stored. Can you run the following code replacing shapefile with the path to your shapefile and rgb with the path to your raster and then post the output.

import geopandas as gpd

gdf = gpd.read_file(shapefile)
print(f"Shapefile CRS is:\n{gdf.crs}")

with rasterio.open(rgb) as src:
    raster_crs = src.crs
print(f"Raster CRS is:\n{raster_crs}
yby026 commented 5 months ago

Here's the output:

Shapefile CRS is: COMPD_CS["NZGD2000 / New Zealand Transverse Mercator 2000 + unknown",PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]],VERT_CS["unknown",VERT_DATUM["unknown",2005],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP]]] Raster CRS is: PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4167"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","2193"]]

ethanwhite commented 5 months ago

OK, so ArcGIS is storing two things with the same project using different text representations, which is a little weird. Can you run this code to see if we can get a match this way:

import geopandas as gpd

gdf = gpd.read_file(shapefile)
print(f"Shapefile CRS is:\n{gdf.crs.to_epsg()}")

with rasterio.open(rgb) as src:
    raster_crs = src.crs.to_epsg()
print(f"Raster CRS is:\n{raster_crs}
yby026 commented 5 months ago

Hi there,

After running the code it returned as: image

ethanwhite commented 5 months ago

OK, thanks. It looks like something about the way the projections were generated in ArcGIS is non-standard causing Python's spatial tools to have trouble working with them. For example, with a normal shapefile this is what happens with the first 3 lines of that code:

>>> import geopandas as gpd
>>> shapefile = "/path/to/file.shp"
>>> gdf = gpd.read_file(shapefile)
>>> gdf.crs.to_epsg()
32617

The number produced will differ depending on the projection.

I'd recommend trying to reproject both spatial layers into a specific EPSG code, which looks like it's probably 2193 for you (New Zealand Transverse Mercator 2000). I'd probably try to do this in ArcGIS to get out files with standard projection information since I'm not sure whether other tools will be able to work with this projection metadata.

You could also try changing the projection (without reprojecting) once you're in Python (e.g., gdf.set_crs(epsg=2193, inplace=True). As long as this is just a metadata problem this would probably fix your issue (if you also make the equivalent change for the raster), but personally I'd be more comfortable getting the files our of the tool I created them in with a clean chunk of projection metadata.

ethanwhite commented 2 months ago

Closing this since we haven't heard back in a couple of months. @yby026 - definitely feel free to reopen if you come back to this.