Closed: LukeWeidenwalker closed this issue 2 years ago.
The workflow for reduce dimension is now updated so that `predict_random_forest` is used inside of `reduce_dimension`. `reduce_dimension` has a `context` parameter. To use `reduce_dimension` properly, we need the `context` (a dict) that specifies the model used in the prediction and the `predictors_vars`.

As described above, I now tried to use the `reduce_dimension` process like this:
```python
rf_regr_model = PGNode("load_ml_model", {"model": "jb-22d7ad56-30bd-416a-9221-bb191c1e8a7c"})

boa_sentinel_2_cube = conn.load_collection(
    collection_id="boa_sentinel_2",
    spatial_extent={"west": 10.454955, "east": 10.537297, "south": 46.102185, "north": 46.123657},
    temporal_extent=["2018-05-01", "2018-05-10"],
    bands=["B02", "B03", "B04", "B08"],
)

reduced_prediction = boa_sentinel_2_cube.reduce_dimension(
    reducer="predict_random_forest",
    dimension="bands",
    context={"model": rf_regr_model, "predictors_vars": ["B02", "B03", "B04", "B08"]},
)

prediction_netcdf = reduced_prediction.save_result(format="netCDF")
```
When I try to send this with

```python
job = prediction_netcdf.create_job(title="UC8_predict_rf_regr")
```

I get the following error message:
```
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_10364/1345881284.py in <module>

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/openeo/rest/datacube.py in create_job(self, out_format, title, description, plan, budget, job_options, **format_options)
   1641         # add save_result node
   1642         img = img.save_result(format=out_format, options=format_options)
-> 1643         return self._connection.create_job(
   1644             process_graph=img.flat_graph(),
   1645             title=title, description=description, plan=plan, budget=budget, additional=job_options

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/openeo/rest/connection.py in create_job(self, process_graph, title, description, plan, budget, additional)
   1069             req["job_options"] = additional
   1070
-> 1071         response = self.post("/jobs", json=req, expected_status=201)
   1072
   1073         if "openeo-identifier" in response.headers:

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/openeo/rest/connection.py in post(self, path, json, **kwargs)
    171         :return: response: Response
    172         """
--> 173         return self.request("post", path=path, json=json, allow_redirects=False, **kwargs)
    174
    175     def delete(self, path, **kwargs) -> Response:

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/openeo/rest/connection.py in request(self, method, path, headers, auth, check_error, expected_status, **kwargs)
    100         )
    101         with ContextTimer() as timer:
--> 102             resp = self.session.request(
    103                 method=method,
    104                 url=url,

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    526             hooks=hooks,
    527         )
--> 528         prep = self.prepare_request(req)
    529
    530         proxies = proxies or {}

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/requests/sessions.py in prepare_request(self, request)
    454
    455         p = PreparedRequest()
--> 456         p.prepare(
    457             method=request.method.upper(),
    458             url=request.url,

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/requests/models.py in prepare(self, method, url, headers, files, data, params, auth, cookies, hooks, json)
    317         self.prepare_headers(headers)
    318         self.prepare_cookies(cookies)
--> 319         self.prepare_body(data, files, json)
    320         self.prepare_auth(auth, url)
    321

~/python/SRR1_notebooks/vrtlnvrnmnt/lib/python3.8/site-packages/requests/models.py in prepare_body(self, data, files, json)
    469
    470         try:
--> 471             body = complexjson.dumps(json, allow_nan=False)
    472         except ValueError as ve:
    473             raise InvalidJSONError(ve, request=self)

/usr/lib/python3.8/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232     if cls is None:
    233         cls = JSONEncoder
--> 234     return cls(
    235         skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,

/usr/lib/python3.8/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/lib/python3.8/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258
    259     def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/lib/python3.8/json/encoder.py in default(self, o)
    177
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181

TypeError: Object of type PGNode is not JSON serializable
```
Am I allowed to use a reference like `rf_regr_model` inside the `context` of `reduce_dimension`?
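For background on the final `TypeError`: the standard library's `json.dumps` can only serialize plain Python types (dict, list, str, numbers, bool, None); any custom object left inside the request payload, such as a `PGNode`, triggers exactly this error. A minimal self-contained reproduction, using a hypothetical stand-in class rather than the real `PGNode`:

```python
import json


class FakeProcessGraphNode:
    """Hypothetical stand-in for a non-JSON-serializable object like PGNode."""

    def __init__(self, process_id, arguments):
        self.process_id = process_id
        self.arguments = arguments


# Mimic a request payload with a raw object buried in the "context".
payload = {"context": {"model": FakeProcessGraphNode("load_ml_model", {})}}

err_message = None
try:
    json.dumps(payload)
except TypeError as exc:
    # Same failure mode as the traceback above.
    err_message = str(exc)

print(err_message)
```

This prints `Object of type FakeProcessGraphNode is not JSON serializable`, which is the same failure as in the traceback: the client never converted the nested `PGNode` into its process-graph dict representation before posting the job.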
@soxofaan have you maybe seen this before? If not, maybe you could give me a hint on whom to ask about this :)
Why not use the simpler example that I sent:
```python
predicted = boa_sentinel_2_cube.predict_random_forest(
    model="jb-22d7ad56-30bd-416a-9221-bb191c1e8a7c",
    dimension="bands",
)
```
If that doesn't work, check out the implementation at openeo-python-client/openeo/rest/datacube.py:1814. In any case, avoid using `PGNode` directly unless you really know what you're doing. (It's not the kind of code you want to end up in a demonstration.)
Indeed, a regular user should not have to create `PGNode` objects themselves; if they do, that's a bug or missing feature in the Python client. In version 0.10.0, `Connection.load_ml_model()` was added for your use case (https://open-eo.github.io/openeo-python-client/api.html?highlight=load_ml_model#openeo.rest.connection.Connection.load_ml_model).
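Putting those pieces together, the workflow could be rewritten without any `PGNode`. This is an untested sketch (it needs a live backend and authentication; the backend URL below is a placeholder), assuming a client >= 0.10.0 with `Connection.load_ml_model` and `DataCube.predict_random_forest` available:

```python
import openeo

# Placeholder backend URL; substitute the actual openEO backend.
connection = openeo.connect("https://openeo.example.com").authenticate_oidc()

# Load the previously trained model by its id (no raw PGNode needed).
model = connection.load_ml_model("jb-22d7ad56-30bd-416a-9221-bb191c1e8a7c")

cube = connection.load_collection(
    "boa_sentinel_2",
    spatial_extent={"west": 10.454955, "east": 10.537297, "south": 46.102185, "north": 46.123657},
    temporal_extent=["2018-05-01", "2018-05-10"],
    bands=["B02", "B03", "B04", "B08"],
)

# Cube-level helper; internally this becomes a reduce_dimension
# with a predict_random_forest reducer.
predicted = cube.predict_random_forest(model=model, dimension="bands")
result = predicted.save_result(format="netCDF")
job = result.create_job(title="UC8_predict_rf_regr")
```

Everything here is standard client API as far as I can tell, but the exact behavior depends on what the backend supports.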
A couple of notes about

```python
cube.reduce_dimension(reducer="predict_random_forest", dimension="bands", context={"model": rf_regr_model, "predictors_vars": ["B02", "B03", "B04", "B08"]})
```

- There is `DataCube.predict_random_forest`, which is indeed defined on the level of a data cube, but internally it translates that to a `reduce_dimension` with a `predict_random_forest` reducer.
- The `context` argument technically can't work yet, as discussed at https://github.com/openEOPlatform/architecture-docs/issues/213: there is no openEO process (yet) to "get" the model back out of the `context` parameter. In the VITO implementation we do `context=model` (which means it's impossible to inject additional context data, like the "predictors_vars" in your use case). Another alternative is putting `model` and `predictor_vars` in an array, passing that to `context`, and extracting them again with `array_element`.
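The array-based workaround could be illustrated in plain Python (a toy stand-in, not real openeo client code): pack the model and the predictor names into a list, pass that list as the context, and index the pieces back out the way `array_element` would in a process graph.

```python
def array_element(data, index):
    """Toy stand-in for the openEO array_element process."""
    return data[index]


def reducer(band_values, context):
    # Unpack the heterogeneous context the way array_element would.
    model = array_element(context, 0)
    predictor_vars = array_element(context, 1)
    # A fake "prediction": weighted sum over the selected predictors.
    return sum(band_values[var] * model[var] for var in predictor_vars)


# Fake model: one weight per band (illustration only).
fake_model = {"B02": 0.1, "B03": 0.2, "B04": 0.3, "B08": 0.4}
context = [fake_model, ["B02", "B03", "B04", "B08"]]

pixel = {"B02": 1.0, "B03": 1.0, "B04": 1.0, "B08": 1.0}
result = reducer(pixel, context)
```

The point is only the plumbing: a single `context` value can carry several logically distinct inputs if both sides agree on the array positions.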
Thank you very much for the quick replies! I will see how I can update our implementation so that I use `context=model` properly and avoid using `PGNode` and the `predictors_vars`.
Currently, `predict_random_forest` takes the dimension over which to apply random forest inference. As discussed in e.g. https://github.com/Open-EO/openeo-processes/issues/295#issuecomment-993393971, this isn't the intention of the spec; rather, `predict_random_forest` should be implemented as a reducer that can be called across any dimension.
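The "reducer over any dimension" idea can be sketched in plain Python, independent of the openEO spec (all names here are hypothetical): a generic `reduce_dimension` that collapses one named dimension of a tiny labelled cube by handing the values along that dimension to a callback, which is where a predict-style function would plug in.

```python
def reduce_dimension(cube, dimension, reducer):
    """Collapse `dimension` of a {frozenset of (dim, index): value} cube."""
    groups = {}
    for coords, value in cube.items():
        # Key by the remaining coordinates; values sharing them get reduced.
        key = tuple(sorted((d, i) for d, i in coords if d != dimension))
        groups.setdefault(key, []).append(value)
    return {key: reducer(values) for key, values in groups.items()}


# Tiny cube with a "bands" and a "t" dimension.
cube = {
    frozenset({("bands", "B02"), ("t", 0)}): 1.0,
    frozenset({("bands", "B03"), ("t", 0)}): 2.0,
    frozenset({("bands", "B02"), ("t", 1)}): 3.0,
    frozenset({("bands", "B03"), ("t", 1)}): 4.0,
}


def predict(values):
    # Stand-in "prediction" reducer; here simply the mean.
    return sum(values) / len(values)


# Reducing over "bands" leaves one value per timestep...
per_time = reduce_dimension(cube, "bands", predict)
# ...and the very same machinery works over "t" instead, which is the point.
per_band = reduce_dimension(cube, "t", predict)
```

A reducer written this way is dimension-agnostic: nothing in `predict` knows whether it is collapsing bands, time, or anything else.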