Open m-mohr opened 4 months ago
Complete example that I ended up with:
from openeo.processes import sqrt
data = connection.load_collection(
"ERA5",
spatial_extent = None,
temporal_extent = ["2000-01-01", "2000-01-02"],
bands = ["u10", "v10"]
)
u10 = data.band('u10') ** 2.01
v10 = data.band('v10') ** 2.01
data = u10 * v10
data = data.apply(process = lambda x: sqrt(x))
result = data.save_result("PNG")
What I'd like it to look like (or similar):
from openeo.processes import sqrt
data = connection.load_collection(
"AGERA5",
spatial_extent = None,
temporal_extent = ["2000-01-01", "2000-01-02"],
bands = ["u10", "v10"]
)
data = data ** 2.01
u10 = data.band('u10')
v10 = data.band('v10')
data = sqrt(u10 * v10)
result = data.save_result("PNG")
Also, why is there no sqrt function on the datacube class? :-)
data ** 2.01
indeed, **
is not supported in non-band-math mode.
Other operators like data * 4
or 3 / data
do work (i.e. translate to apply
call) , but **
is indeed a todo here:
**
in DataCube "apply" modefrom openeo.processes import sqrt ... data = data.apply(process = lambda x: sqrt(x))
This can be simplified to
# no import necessary
...
data = data.apply(process = lambda x: x.sqrt())
or even
# no import necessary
...
data = data.apply(process="sqrt")
Also, why is there no sqrt function on the datacube class? :-)
General answer: because the sqrt
process is defined on a number value not a data cube value, so it doesn't make sense to define sqrt as data cube method.
There are indeed some existing DataCube methods like logarithm
and log2
that violate that rule, but these are mostly legacy things.
I'm in doubt if it would be a good idea to provide a DataCube method for each openeo process that works on floats like sqrt, abs, ceil, floor, int, sgn, absolute, ... That would mean a lot of new methods with short names. While it would make the end user code more compact, I'm not sure it will make it more understandable.
E.g. we currently already have a bit of mixed behaviour:
# "Band-math" mode
cube.band("B02") + 5 # -> translates to reduce_dimension(dimension=bands, ...)
# merge cube mode
cube1 + cube2 # -> translates to merge_cubes(cube1, cube2, overlap_resolver=add)
# "Apply" mode
cube + 5 # -> translates to apply( x -> x+5)
The distinction is not obvious for the untrained eye (and it also not very easy to explain without getting too technical), so I'd prefer to promote a code style that keeps things more explicit.
For example for the apply mode:
cube.apply(lambda x: x+5)
The decoupling between the data cube and the apply
"callback" is more obvious here, keeping things tidier if the callback gets more complex.
So should the datacube processes such as log2 etc be deprecated? Otherwise it's just confusing why something exists on the datacube and something doesn't.
Indeed, it would be better to deprecated DataCube.log2()
and friends to reduce the confusion
but in any case, the main thing to fix here is to make sure it's possible to "apply" the power
process, both with
cube ** p
cube.power(p)
I'm trying to run
I'd expect that it computes the power for all vales in the data cube (i.e.
apply(process = power)
). Unfortunately I get: