Open MartinSchobben opened 9 months ago
Tagging the VITO team @jdries @soxofaan @JeroenVerstraelen @VincentVerelst here, since they have it implemented and using it frequently.
As you noticed, I guess that the case with a kernel of size=2 and stride=1 is not covered by this definition, but I might be wrong.
For the scenario where your process takes as input a 3x3 patch, and computes one value out of it, you should set size=1 and overlap=1.
Anyway, I wait for some clarification from the others, since I also have some doubts on how the window/neighborhood "moves", so how is the stride
being set.
Official docs are here: https://processes.openeo.org/#apply_neighborhood
For the scenario where your process takes as input a 3x3 patch, and computes one value out of it, you should set size=1 and overlap=1.
As I understand it now this would yield a stride of 2.
I think the source of the confusion is that you should not aim for a "stride" in the apply_neighborhood
process itself.
apply_neighborhood
is just process in the apply_*
family of openeo processes that allows processing on spatial chunks. For practical reasons there is no spatial equivalent for apply_dimension
, it just wouldn't scale to let a user do something like apply_spatial
and let them e.g. run a UDF on a complete spatial extent.
Instead, apply_neighborhood
allows you to run on reasonable spatial chunks, with sizes in the order of (but not limited to) 128x128 or 512x512.
Within such a chunk you as a user are still allowed to implement some kind of operation that involves strides, but that should not be coupled to the apply_neighborhood
size/overlap parameters as you try to do above. At most, the overlap should probably be large enough to cover your target stride, to ensure completeness of your results.
I hope this clarifies some things already
Then I might have to rephrase a little. Given the xarray example above, which is a common spatial/image analysis operation, what would be the openEO equivalent for a rolling kernel?
Given the xarray example above, which is a common spatial/image analysis operation, what would be the openEO equivalent for a rolling kernel?
For that particular example: taking sum in 3x3 window and stepping 1px at a time, I would use apply_kernel
.
But if you deviate from the simplicity of that convolution example (e.g. taking larger strides, or computing a non-linear aggregation instead of sum), then that doesn't work anymore.
What further adds to the confusion is that the openEO cookbook says: a) that the function is a moving window, and b) that it allows supplying pre-defined processes like median
, max
, and sd
to spatial, temporal, and spatiotemporal "neighborhoods" (which can be interpreted as window in the context). This reads like apply_neighborhood
behaves similar as e.g. xarray.DataArray.rolling()
.
a) that the function is a moving window, and b) that it allows supplying pre-defined processes
that b part comes from the following I guess, but I'm not sure where that a part comes from
This cookbook snippet is indeed misleading.
Practically speaking, I think only run_udf
is the only useful pre-defined process you can directly pass to apply_neighborhood
. Theoretically you could also pass processes like apply_kernel
, atmospheric_correction
, ard_...
, ndvi
, mask
, ... but that makes no sense as it will be more efficient to apply them directly instead of going through apply_neighborhood
I just stumbled across this example in the introduction of openEO datacubes.
https://openeo.org/documentation/1.0/datacubes.html
Also here it is suggested that apply_neighborhood
represents a rolling window. Furthermore, the example also shows a temporal smoothing on a datacube (as opposed to the example here). This would contradict that the process is intended for spatial chunks only, as stated above:
apply_neighborhood
is just process in theapply_*
family of openeo processes that allows processing on spatial chunks.
You can indeed use apply_neighborhood
for doing a rolling window, indeed also along temporal dimension. But I think that the explanation at https://openeo.org/documentation/1.0/datacubes.html is misleading, e.g. it uses "window size=3" while the apply_neighborhood arguments corresponding with the example would be size=1 and overlap=1.
This would contradict that the process is intended for spatial chunks only, as stated above:
applyneighborhood is just process in the apply* family of openeo processes that allows processing on spatial chunks.
I didn't intend to say that it is for spatial chunking only. Historically, spatial chunking was the main goal, but temporal chunking was added as a logical generalization.
Going back to the beginning of this thread. If you look at apply_neighborhood
with a "rolling window" mindset, I think that the size
parameter corresponds to the stride of the rolling window. The overlap
parameters adds additional padding data to that.
I hope this clarifies some things already
This is an initial attempt to implement
apply_neighborhood
while adhering to the openeo specifications. We have some difficulty understanding the proper implementation of the parameterssize
andoverlap
. More specifically we would like to understand better how those parameters translate to the stride of the rolling window/kernel. In our current understanding the overlap argument modifies the size argument, to obtain a modified kernel/window size.In pseudo code:
$$ kernelsize = size + overlap * 2 $$
This definition implies that in order to obtain a rolling window/kernel with e.g., stride 1 and size 2 requires a 0-size and 1-overlap input for the openEO process. Or, for an instance with stride 1 and size 3, one needs negative size values, specifically -1 and 2 for the size and overlap parameters, respectively. This definition seems therefore impractical for stride 1 rolling windows/kernels. The unit tests of our PR include examples of these two instances in openEO with an input cube of size 4 by 4 filled with values 1 using the process
sum
.By contrast, in xarray a stride 1-rolling kernel can be defined as follows:
It feels like we are missing some important aspect but we can't tell what.
See also this animation of an 3 by 3 kernel with stride 1 from an article on medium.com authored by Paul-Louis Pröve.
Co-authored-by: @SwamyDev