Closed rbavery closed 4 years ago
Hi @rbavery,
Thanks for working on all of this stuff.
Just as a note, the CosmiQ team is currently working with counsel to ensure that solaris
doesn't conflict with this new US export control guideline. I won't be able to merge PRs until we get the all-clear (hopefully in the next week or so). I'll post another comment here once that's clarified on our end and we can resume open source work.
Another thought: since geopandas
will be switching to pyproj.CRS
objects for tracking CRSs in the 0.7.0 release (see here), I'm thinking these may end up being a better object to track CRS with through solaris
rather than the rasterio ones. This is a pain given all the hard work you just put into implementing the rasterio ones though. Thoughts?
Oof (wrt to that interim rule). No problem though.
I saw that update from geopandas, it's good news overall. I think it makes sense to for vectors to internally use pyproj.CRS
. But rasterio uses it's own rasterio.crs.CRS
class. I'm not sure if we can consistently use the same CRS object for both rasters and vectors at all points in the code. And it's unclear to me if reprojecting geopandas gdf's using a rasterio.crs.CRS
will still work, so I've put out a question on twitter. But in any case where it makes sense to revamp the CRS handling again I can probably get to it.
Yeah, that makes sense. As far as I can tell (from trying myself), geopandas gdfs can't handle rasterio CRS
objects.
I'm about to start some work that will require me to get a bunch of stuff resolved (including CRSs in vector objects), and word on the street is that we'll be able to start pushing code up here again very soon. I'll create an issue for vector CRSs and assign myself to it if I start working on it.
Sounds good. Joris, one of the geopandas maintainers, says that geopandas gdfs should be able to handle the rasterio crs class since it has a to_wkt
method. https://twitter.com/jorisvdbossche/status/1227492555811504129
Hey @rbavery. I spent a bunch of time digging into what works well and what doesn't for interconverting between CRSs, and I think I've decided that passing around pyproj CRS objects is going to work better. Check out this gist for an example. I'm going to start working towards passing everything around as pyproj objects.
@nrweir Got it sounds good. I read the rasterio groups thread, it make sense to standardize on the pyproj class. I'll have time next week to work on wrapping up this PR with some tests. Would you prefer I wait and rebase off of your pyproj CRS updates? In any case, I'll remove this change I made in this PR since I think the missing CRS issue I was having might be fixed by your update.
... I wasn't able to save a geopandas geodataframe with a CRS that didn't have an EPSG code, which caused it to default to 4326. I added an optional argument to override the CRS in the geojson_to_px function with the CRS of the src image.
@rbavery that'd be great (rebasing off the new version). I'm going to do a 0.2.2 release today.
Merging #331 into dev will decrease coverage by
2.61%
. The diff coverage is80.88%
.
@@ Coverage Diff @@
## dev #331 +/- ##
==========================================
- Coverage 67.17% 64.56% -2.62%
==========================================
Files 72 72
Lines 5204 5328 +124
==========================================
- Hits 3496 3440 -56
- Misses 1708 1888 +180
Impacted Files | Coverage Δ | |
---|---|---|
setup.py | 0.00% <0.00%> (ø) |
|
solaris/raster/image.py | 53.00% <ø> (+0.05%) |
:arrow_up: |
tests/test_nets/test_datagen.py | 100.00% <ø> (ø) |
|
solaris/vector/polygon.py | 79.35% <45.45%> (-3.76%) |
:arrow_down: |
solaris/eval/challenges.py | 84.74% <50.00%> (-2.98%) |
:arrow_down: |
solaris/data/coco.py | 73.70% <62.50%> (-0.69%) |
:arrow_down: |
solaris/tile/raster_tile.py | 67.96% <65.62%> (-0.83%) |
:arrow_down: |
solaris/tile/vector_tile.py | 77.66% <75.00%> (-0.55%) |
:arrow_down: |
solaris/utils/core.py | 76.62% <88.88%> (+3.06%) |
:arrow_up: |
solaris/utils/geo.py | 70.19% <89.47%> (+3.53%) |
:arrow_up: |
... and 15 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 94ac29c...b64a097. Read the comment docs.
Hey @nrweir, thanks for the patience, I think this PR is finally ready for review! Instead of converting pyproj.CRs to rasterio.crs.CRS using epsg codes I followed @snowman2's advice and used this code snippet he suggested:
import rasterio
if LooseVersion(rasterio.__gdal_version__) >= LooseVersion("3.0.0"):
rio_crs = rasterio.crs.CRS.from_wkt(pyproj_crs.to_wkt())
else:
rio_crs = rasterio.crs.CRS.from_wkt(pyproj_crs.to_wkt("WKT1_GDAL"))
When I ran your gist with this if/else, it worked to properly convert where it failed before.
@rbavery thanks, this looks awesome - I'll review today.
Can you hold off on merging @nrweir ? I found that their is still an issue when using geojson_to_px_gdf
, where if the geojsons were saved with a CRS with no EPSG code, the function fails to load any projection information. I need to add the override_crs
option back in. I thought this would be solved by the other CRS updates we made, but it looks like geojsons may only be able to store EPSG code CRS information?
The vector_tiler maintains this CRS info right up until the gdfs are saved, so I think it is an issue with the geojson format.
Ok, I think this should be good to go, if the way I addressed the reviews makes sense.
Description
This PR addresses the following issues:
https://github.com/CosmiQ/solaris/issues/327 by allowing an option to set areas outside an aoi_boundary to the src image's nodata value when tiling.
https://github.com/CosmiQ/solaris/issues/328 by adding a method to the
RasterTiler
class that fills tiles with the mean of the src image. If this is run after restrict_by_aoi, all nodata values outside of the aoi are filled, along with other nodata values in the image.geojson2coco
would fail if the geodataframe input had any MultiPolygon or GeometryCollection types. I added an optional argument to ignore these when converting the geodataframe to COCO formatThere was a bug in the raster_tiler's use of the split_geom function, where the aoi boundary was not intersected with the src_image boundary prior to tiling, causing many nodata areas outside the src_img boundary to be tiled.
EDIT: Here's a list of all changes, from the CHANGELOG:
Changed
fill_all_nodata
to fill nodata with"mean"
or custom value (#328)geojson2coco
since these cannot be converted to COCO.solaris.vector.mask.geojsons_to_masks_and_fill_nodata
, which rasterized vector labels according to raster tile extents. Fills nodata areas in raster tile and corresponding rasterized label raster.Fixed
_check_crs
to convertpyproj.CRS
or pass throughrasterio.crs.CRS
class when rasterio crs object is required (for example, reprojecting in the tilers)Please delete options that are not relevant.
Checklist:
If your PR does not fulfill all of the requirements in the checklist above, that's OK! Just prepend [WIP] to the PR title until they are all satisfied. If you need help, @-mention a maintainer and/or add the Status: Help Needed label.