digitalearthpacific / dep-coastlines

GNU General Public License v3.0

Projections #34

Open jessjaco opened 6 months ago

jessjaco commented 6 months ago

The approach used in DEA & DE Africa coastlines is to load source Landsat data in its "native" projection wherever possible and, where this is not possible (because the same cell contains source data in different projections), to resample using a smoother (e.g. cubic convolution).

By default, odc.stac.load uses the most common projection in a list of projections (I learned this by asking the authors in a public issue; it's not in the documentation), so it does what we want. The issue is that the most common projection can vary depending on what data are available in a given year, so each annual mosaic may have a different projection. Then, when the mosaics are stacked for coastline generation, things go awry (since xr.concat is not sensitive to projections).
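To make the failure mode concrete, here is a minimal sketch of the "most common projection" rule as described above. It is not odc.stac's actual code; the function name, the EPSG lists and the cell they describe are made up for illustration.

```python
from collections import Counter


def most_common_epsg(item_epsgs):
    """Return the EPSG code that occurs most often in a list of per-item codes."""
    return Counter(item_epsgs).most_common(1)[0][0]


# Made-up per-year item projections for one cell that straddles two UTM zones
items_by_year = {
    2019: [32660, 32660, 32601],
    2020: [32601, 32601, 32660],
}

# The "winning" projection depends on which scenes happen to exist that year
crs_by_year = {
    year: most_common_epsg(epsgs) for year, epsgs in items_by_year.items()
}

# A guard like this before stacking would flag the problem early
mixed = len(set(crs_by_year.values())) > 1
print(crs_by_year, "mixed projections:", mixed)
```

A check along these lines before calling xr.concat would at least turn silent misalignment into an explicit error.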

Here are some difficulties in addressing this:

  1. Using the crs="utm" arg of odc.stac.load doesn't work because the most sensible projection (as determined by odc.stac.load) may not be what Landsat uses. This is because Landsat always uses UTM north projections, never south. This shouldn't cause an issue (the reprojection should work), but it does and I'm not sure why. Basically we end up with empty mosaics. Not worth looking into further at this point, since odc.stac.load would need to be changed to fix it.
  2. There's no way to pass a "live" CRS to odc.stac.load; it can only be defined when the loader is defined. This is an issue I am responsible for. I don't have a simple solution, short of writing a custom loader. When I previously used stackstac, I accomplished this by tracking the EPSG codes using _current_epsg, but I don't think even that solved this issue entirely. (It's more complicated now since we changed the grid, among other things.) In theory, if we went this route, I'd probably save the best CRS for each cell as a row in the grid, and then it could be used in the loader. So a minor rewrite, but a rewrite.
  3. Non-matching source data could be aligned in the MosaicLoader, but we would need to record the best CRS for each cell. I tried using the most common CRS in the list, but it changed between composite sets (e.g. 1-year vs 3-year).
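For the first point, the mismatch can be illustrated with plain EPSG arithmetic. This is only a sketch: I am assuming that a hemisphere-aware choice picks the southern UTM zone below the equator, and the helper names and coordinates are hypothetical. It does not reproduce what odc.stac.load does internally.

```python
def utm_zone(lon):
    """UTM zone number (1-60) for a longitude in degrees.

    Ignores the special-case zones around Norway and Svalbard.
    """
    return int((lon + 180) // 6) % 60 + 1


def hemisphere_aware_utm_epsg(lon, lat):
    """WGS 84 / UTM EPSG code, using the southern variant below the equator."""
    base = 32600 if lat >= 0 else 32700
    return base + utm_zone(lon)


def north_only_utm_epsg(lon):
    """WGS 84 / UTM *north* EPSG code regardless of latitude."""
    return 32600 + utm_zone(lon)


# Approximate, illustrative coordinates for a cell in Fiji
lon, lat = 178.4, -18.1
print(hemisphere_aware_utm_epsg(lon, lat))  # 32760
print(north_only_utm_epsg(lon))             # 32660
```

The two codes describe the same zone with different false northings, so a load that targets one while the source data use the other always involves a reprojection.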

For now I'm reprojecting in the mosaic processor using the CRS of the tide data (since the tide data cover the whole time series and should therefore have the overall most common CRS).

jessjaco commented 6 months ago

That solution (reprojecting to the tide data) didn't work because we can get the CRS but there's no way to get the other bounds, so we end up with mixed / shifted coordinates. Falling back to option 3 above, specifically:

  1. Recording the best CRS in the grid when creating it (for now defined as the zone with the max area in the cell) and
  2. Subclassing OdcLoader and adding the CRS as a kwarg before loading
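Roughly, the two steps could look like the sketch below. BaseLoader is a hypothetical stand-in for OdcLoader (whose real interface is not reproduced here), and the cell dictionary, the area fractions and the resolution kwarg are made up for illustration.

```python
class BaseLoader:
    """Hypothetical stand-in for the real loader class.

    Instead of calling odc.stac.load, it returns the kwargs it would pass on.
    """

    def __init__(self, **load_kwargs):
        self.load_kwargs = load_kwargs

    def load(self, items, area, **overrides):
        # Per-call overrides win over the kwargs fixed at construction time
        return {**self.load_kwargs, **overrides}


class GridCrsLoader(BaseLoader):
    """Adds the CRS recorded for the grid cell just before loading."""

    def load(self, items, area, **overrides):
        return super().load(items, area, crs=area["crs"], **overrides)


def best_crs(zone_areas):
    """Pick the EPSG code whose UTM zone covers the largest share of the cell."""
    return max(zone_areas, key=zone_areas.get)


# Step 1: record the best CRS when the grid is created (made-up area fractions)
cell = {"id": (63, 20), "crs": best_crs({32660: 0.7, 32601: 0.3})}

# Step 2: the loader picks the CRS up from the cell at load time
loader = GridCrsLoader(resolution=30)
kwargs = loader.load(items=[], area=cell)
print(kwargs)
```

Because the CRS comes from the grid rather than from the items, every year and every composite set for a cell ends up in the same projection.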

jessjaco commented 6 months ago

My only concern with this approach is that I'm seeing some jagged lines where I know things are being reprojected (see PNG). It's either the data itself or the fact that I'm using nearest resampling when reprojecting. I'm going to try cubic and see if it makes a difference.
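As a toy illustration of why nearest could produce jagged output, here is a 1-D sketch that resamples a smooth ramp onto a grid with a different spacing. It is not the reprojection code used in the mosaics; the kernel is the standard cubic convolution kernel with a = -0.5, and the ramp and sample positions are made up.

```python
import math


def keys_kernel(x, a=-0.5):
    """Cubic convolution kernel with the common choice a = -0.5."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def sample_nearest(values, x):
    """Value of the sample closest to position x (indices clamped to the array)."""
    idx = min(max(int(math.floor(x + 0.5)), 0), len(values) - 1)
    return values[idx]


def sample_cubic(values, x):
    """Cubic convolution over the four samples around x (indices clamped)."""
    base = math.floor(x)
    total = 0.0
    for i in range(base - 1, base + 3):
        clamped = min(max(i, 0), len(values) - 1)
        total += values[clamped] * keys_kernel(x - i)
    return total


# A smooth ramp sampled on a grid whose spacing is 0.7 of the source spacing
ramp = [float(i) for i in range(10)]
positions = [2 + 0.7 * k for k in range(6)]

nearest = [sample_nearest(ramp, p) for p in positions]
cubic = [sample_cubic(ramp, p) for p in positions]

# Nearest repeats or skips source values, so the steps between outputs are uneven
nearest_steps = [hi - lo for lo, hi in zip(nearest, nearest[1:])]
cubic_steps = [round(hi - lo, 6) for lo, hi in zip(cubic, cubic[1:])]
print(nearest_steps)
print(cubic_steps)
```

In two dimensions the same uneven stepping shows up as stair-stepped edges, which is one plausible source of the jagged lines; the other possibility, that the data themselves are jagged, is not ruled out by this.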

jessjaco commented 5 months ago

Note that this may leave the tide data in a different CRS than the mosaics. For now I reproject tides_lr when I load it, but we should fix this if we ever rerun tides.

jessjaco commented 5 months ago

Aaaand we need to rerun tides, but it's due to the intersecting pathrow fix I think