gjoseph92 / stackstac

Turn a STAC catalog into a dask-based xarray
https://stackstac.readthedocs.io
MIT License
245 stars 49 forks source link

`xy_coords` introduces unexpected (?) pixel shift #68

Closed maawoo closed 2 years ago

maawoo commented 3 years ago

Hi all! Either the shift in pixels that I have observed is actually unexpected and needs to somehow be adjusted in the to_coords function or I'm simply missing something or misunderstanding the concept behind geotransforms and it's only unexpected for me at the moment. Hope someone can help me out either way 🙂

In the following screenshots, the colored image is always the original raster, while the greyscale image is always the one created with these lines of code:

scene_stack = stackstac.stack(items=stac_obj, xy_coords=xy_coords, dtype='float32', rescale=False)
scene_stack.rio.to_raster('./scene_stack.tif')

Transform and projection of the raster: 'proj:transform': [10, 0, 399960, 0, -10, 52000020] & 'proj:epsg': 32632 (UTM 32N)

  1. xy_coords=False xycoords_false

  2. xy_coords='center' | pixel shift: 0, 1 (expected: 0.5, -0.5) xycoords_center

  3. xy_coords='topleft' | pixel shift: -0.5, 0.5 (expected: 0, 0) xycoords_topleft

As expected, no shift is observed in (1). Now with (2) and (3), I'm confused. My original raster is a COG following the usual raster conventions, e.g. being top left aligned. Therefore I'd assume that using the default as in (3) results in no pixel shift at all. I'd also assume that the pixels are shifted by 0.5 and -0.5 in x and y respectively when using 'center' as in (2).

Looking forward to hearing some other thoughts on this!

Cheers, Marco

gjoseph92 commented 3 years ago

Thanks for the great issue @maawoo. I don't think you're doing anything wrong here; this is probably a bug. Which version are you using, btw? 1) I noticed https://github.com/gjoseph92/stackstac/pull/60 hasn't been released yet and 2) https://github.com/gjoseph92/stackstac/pull/35 from >= 0.2.0 is slightly related (though I highly doubt relevant here). Would it be possible for you to try with pip install git+https://github.com/gjoseph92/stackstac.git@main to see if https://github.com/gjoseph92/stackstac/pull/60 fixes it?

One other thing to be aware of is that I believe rioxarray follows xarray conventions (rather than typical raster conventions) and expects coordinates to refer to pixel centers. So I'd think (but haven't tried) that xy_coords='center' would actually give you (0, 0) offset in your output GeoTIFF, and xy_coords='topleft' would give you (0.5, 0.5).

maawoo commented 3 years ago

Hi Gabe, I was using 0.2.1 installed via pip install stackstac. Forgot to check for any relevant changes that might not have been released yet. Thanks for the reminder to do that in the future!

Unfortunately the problem persists after installing from git and #60 doesn't fix it, which also means that xy_coords='center' still results in an unexpected offset.

I did the following quick test regarding rioxarray with in.tif being the same raster file as described above:

test = rioxarray.open_rasterio('./in.tif')
test.rio.to_raster('./out.tif')

The output is still perfectly aligned afterwards, so I assume rioxarray/rasterio somehow handle the conversion between raster and xarray conventions internally. I would've been very suprised if not tbh.

I think that the error lies somewhere here because there's no offset problem with xy_coords=False, which skips the code block that I linked. Unfortunately I don't have much time to spare at the moment to look into this myself.

gjoseph92 commented 3 years ago

Yup, something is definitely wrong with the xy_coords logic. I'm really sorry about that; that's not the sort of thing you should have to question or debug as a user!

I also don't have a ton of time to look into this, but hopefully I can next week. If you could provide a full copy-pasteable reproducer, that would help a ton. What's your stac_obj?

maawoo commented 3 years ago

No problem and as a user I'm always happy to contribute to projects that I have a personal interest in!

You can find a stripped-down example scene here: https://upload.uni-jena.de/data/612dde1b411da9.40050697/xycoords.7z

I tested it with the following code, except that the asset href in the STAC Item needs to be changed to absolute to work with stackstac of course.

from pystac import Collection
import stackstac
import rioxarray

collection_path = '/path/to/collection.json'
stac_obj = Collection.from_file(collection_path)

scene_stack = stackstac.stack(items=stac_obj, chunksize=5120, xy_coords='center', dtype='float32', rescale=False)
vh = scene_stack.sel(band='vh')

vh.rio.to_raster('/.../out.tif')

Hope that helps!

Kirill888 commented 2 years ago

I think I'm observing something that is related: xy_coords='center' mode of operation computes Y coordinates incorrectly when Y resolution is negative. What I'm observing is that as I switch from default to `'center`` both X and Y are shifted by half a pixel away from 0, but Y needs to move towards 0 as Y axis is inverted.

Kirill888 commented 2 years ago

@maawoo I'm pretty sure .rio expects coordinates to be in center mode. But see my comment about offset going wrong way for Y axis when using center. You can check for inconsistency by comparing output of .rio.transform() with attribute reported by stackstac.

This is what I'm observing:

xx.attrs['transform'], xx.rio.transform()
(Affine(10.0, 0.0, 499980.0,
        0.0, -10.0, 9200020.0),
 Affine(10.0, 0.0, 499980.0,
        0.0, -10.0, 9200030.0))

which is consistent with your reports.

But if I apply a fix:

xx.y.data[:] = xx.y.data[:] - float(xx.resolution)

then I get consistent output from .rio.transform(). If you can try applying that fix and check if you get data round-trip without an offset then?

  1. Load with center into xx
  2. xx.y.data[:] = xx.y.data[:] - float(xx.resolution)
  3. Save with .rio, load back compare.