jdries opened this issue 11 months ago (status: Open)
Interesting use case. In a similar scenario, I first retrieved the temporal labels and then computed the date differences client-side, since I didn't know how to do everything using openEO processes.
You could compute the differences only once and pass them via the context into reduce_dimension, right? (Probably not valid Python client code, but it should visualize the idea well enough.)
```python
def weighting(x, index, label, context):
    # note: index + 1 is out of range for the last label
    return date_difference(x, context[index + 1])

dates = dmp_cube.dimension_labels('t')
weights = array_apply(dates, weighting, context=dates)

def reducer(data, context):
    return sum(array_combine(data, context, 'multiply'))

weighted_dmp.reduce_dimension(dimension='t', reducer=reducer, context=weights)
```
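To sanity-check the shape of this idea, here is a plain-Python simulation with list-based stand-ins for the openEO array processes (all dates and values are made up; the stand-in implementations are assumptions, not actual client or backend code):

```python
from datetime import date

def date_difference(d1, d2):
    # stand-in for the openEO date_difference process, in days
    return (d2 - d1).days

def array_apply(data, fn, context=None):
    # stand-in: apply fn(x, index, label, context) per element
    return [fn(x, i, None, context) for i, x in enumerate(data)]

dates = [date(2021, 1, 1), date(2021, 1, 5), date(2021, 1, 6)]

def weighting(x, index, label, context):
    # the last label has no successor; use weight 0 for it here
    if index + 1 >= len(context):
        return 0
    return date_difference(x, context[index + 1])

weights = array_apply(dates, weighting, context=dates)  # [4, 1, 0]

values = [1.0, 2.0, 3.0]
weighted = sum(v * w for v, w in zip(values, weights))  # 6.0
```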
Would this be faster/better?
Additional questions:
array_combine is a new process that I thought could be useful, but it can also be emulated via array_apply. It takes two arrays and merges them using a reducer that accepts two values, such as multiply or add.
array_combine is basically:
```python
def combine(x, index, label, context):
    return multiply(x, context[index])

combined = array_apply(array1, combine, context=array2)
```
It could alternatively also accept an array of arrays and then work with array reducers, i.e. sum instead of multiply.
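In plain Python, the array-of-arrays variant could look like this (a hypothetical sketch of the proposed semantics, not an existing openEO process):

```python
import math

# Hypothetical sketch of the proposed array_combine semantics:
# element-wise reduction across multiple equal-length arrays.
def array_combine(arrays, reducer):
    return [reducer(values) for values in zip(*arrays)]

# Array-of-arrays input with an array reducer such as sum:
array_combine([[1, 2, 3], [4, 5, 6]], reducer=sum)  # [5, 7, 9]

# The two-array multiply case, expressed the same way:
array_combine([[1, 2, 3], [4, 5, 6]], reducer=math.prod)  # [4, 10, 18]
```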
A use case requires us to sum a band over an irregular time dimension. To do this correctly, the number of days between observations needs to be taken into account.
The question here is whether we need a new process for convenience, or whether we can define a process graph that solves this.
This is somewhat similar to: https://docs.xarray.dev/en/stable/generated/xarray.DataArray.integrate.html
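To make the day-weighting concrete, here is a small NumPy sketch (outside openEO, purely illustrative; dates and values are made up) of a sum where each observation is weighted by the number of days until the next observation:

```python
import numpy as np

dates = np.array(["2021-01-01", "2021-01-05", "2021-01-06", "2021-01-20"],
                 dtype="datetime64[D]")
values = np.array([1.0, 2.0, 3.0, 4.0])

# Day differences between consecutive time labels.
weights = np.diff(dates).astype(float)  # [4., 1., 14.]

# The last observation has no successor, so it is dropped here
# (one possible convention; trapezoidal integration is another).
weighted_sum = float(np.sum(values[:-1] * weights))  # 48.0
```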
I made an attempt to solve this with existing processes, but couldn't verify it yet because our backend doesn't support all the details yet. It also has the downside that it is hard for the backend to optimize: we apply a function over the whole time dimension, while we only need information about the next label: