Closed m-mohr closed 2 years ago
I think, I had implemented something like that. Just use create_user_process
and set submit=FALSE
. The resulting object is a ProcessInfo
class which has the same representation as an openEO process.
Thank you, I'll look into it tomorrow, I just saw again that someone using the R client exported it "incorrectly" again. Wasn't there a graph or toJson function or something like that that may still export the old version? Also, this is nothing I expect you to fix soon (I can't expect anything anyway ;-) ), but is more a reminder for later...
Okay, the issue is the following code used in many examples (e.g. GEE + EURAC):
...
final = p$save_result(data = apply_linear_transform,format = formats$output$PNG)
graph = as(final,"Graph")
graph
The graph outputted is just the "graph" (example 2), not a complete process (example 1).
Such behavior should either be changed or not be promoted in examples.
Yes, it is as I imagined, you wan't to have an openEO process (JSON) from this graph. But as the name difference suggests you have to create a process by adding metadata to the graph using create_user_process
first. When you set the parameter submit=FALSE
you won't do the HTTP call, but an object (a process) is returned instead. This object is list based, but the class will state that it is something process related (don't know the exact class name, right now). If you print this object, it will get a nice visualization like every other openEO process or user-defined process. To get it as JSON you would need to call jsonlite::toJSON
and set force=TRUE
to treat the object as a simple list which it is. Additional parameter like auto_unbox
and pretty
might be set as well.
@m-mohr Can you point me to the examples where it is promoted, that you can 1:1 copy the graph into the Webeditor? Then I can change this.
I found it here, I'm not sure whether I found them all:
Would it make sense to implement as(..., "Process") instead of as(..., "Graph") or is create_user_process more reasonable?
Actually, as(final,"Graph")
was used again in the review meeting although it should be discouraged. I'd actually propose to completely remove as(final,"Graph")
and replace it with something to just print the full process or let as(final,"Graph") print the full process. Just the process graph without metadata will not be useful in the openEO world and will confuse users. @MilutinMM
I see the point, and as(graph_test1,"Graph")
is now removed from the script. Instead, I add the following code:
# create a bach job from the graph:
job_id = create_job(con = eurac, graph = graph_test1, title = "job_final_review_demo", description = "job_final_review_demo") # batch call, works on udf!
# get the JSON text of the process graph:
graph_info = create_user_process( graph_test1,id=job_id$id, submit = FALSE, con = eurac )
graph_json = jsonlite::toJSON(graph_info$process_graph)
print(graph_json)
and this is the result:
{"load_collection_ZHZZM4626F":{"process_id":["load_collection"],"arguments":{"id":["openEO_WUR_Usecase"],"spatial_extent":{"west":[-54.815],"south":[-3.515],"east":[-54.81],"north":[-3.51]},"temporal_extent":[["2017-01-01T00:00:00Z"],["2019-12-29T00:00:00Z"]],"bands":[["VH"]]}},"run_udf_ZIAXO9027Y":{"process_id":["run_udf"],"arguments":{"data":{"from_node":["load_collection_ZHZZM4626F"]},"udf":["# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult"],"runtime":["R"],"context":{}}},"save_result_VWHJC2286A":{"process_id":["save_result"],"arguments":{"data":{"from_node":["run_udf_ZIAXO9027Y"]},"format":["GTiff"],"options":{}},"result":[true]}}
then I used additional step to visualize the json in a web editor to get the following:
{
"load_collection_ZHZZM4626F": {
"process_id": [
"load_collection"
],
"arguments": {
"id": [
"openEO_WUR_Usecase"
],
"spatial_extent": {
"west": [
-54.815
],
"south": [
-3.515
],
"east": [
-54.81
],
"north": [
-3.51
]
},
"temporal_extent": [
[
"2017-01-01T00:00:00Z"
],
[
"2019-12-29T00:00:00Z"
]
],
"bands": [
[
"VH"
]
]
}
},
"run_udf_ZIAXO9027Y": {
"process_id": [
"run_udf"
],
"arguments": {
"data": {
"from_node": [
"load_collection_ZHZZM4626F"
]
},
"udf": [
"# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult"
],
"runtime": [
"R"
],
"context": {
}
}
},
"save_result_VWHJC2286A": {
"process_id": [
"save_result"
],
"arguments": {
"data": {
"from_node": [
"run_udf_ZIAXO9027Y"
]
},
"format": [
"GTiff"
],
"options": {
}
},
"result": [
true
]
}
}
As my main purpose was to present how a graph could look like, I find print(job_id)
even better and with more relevant detail. The question is if the graph in the print(job_id)
result below is ready to be copy-pasted to the web editor and QGIS:
Job ID: 949b1db4-5fad-4e8b-8ba1-aecb7b716bd2
Title: job_final_review_demo
Description: job_final_review_demo
Status: created
Created: 2020-12-04T13:27:38.161Z
Updated: 2020-12-04T13:27:38.161Z
Progress: 0
Plan: free
Costs:
Budget: ---
User defined process:
Process:
Summary:
Description:
Returns:
Stored process graph:
{
"run_udf_ZIAXO9027Y": {
"process_id": "run_udf",
"arguments": {
"data": {
"from_node": "load_collection_ZHZZM4626F"
},
"udf": "# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult",
"context": {},
"runtime": "R"
}
},
"save_result_VWHJC2286A": {
"result": true,
"process_id": "save_result",
"arguments": {
"data": {
"from_node": "run_udf_ZIAXO9027Y"
},
"format": "GTiff",
"options": {}
}
},
"load_collection_ZHZZM4626F": {
"process_id": "load_collection",
"arguments": {
"temporal_extent": [
"2017-01-01T00:00:00Z",
"2019-12-29T00:00:00Z"
],
"spatial_extent": {
"east": -54.81,
"south": -3.515,
"north": -3.51,
"west": -54.815
},
"id": "openEO_WUR_Usecase",
"bands": [
"VH"
]
}
}
}
No, print() is also giving the imcomplete process (can't be visualized in the Web Editor for example), so yet another place that should be fixed.
@m-mohr, @flahn:
I was just talking to MeterGroup and @tim-salabim. They ran into the same problem.
They used as(xxx, "Graph")
and tried to visualize in the webeditor.
I'm working towards having the process visualizations standalone as a component that it can be integrated like the Collections component for example, but it's not an easy task as it has quite a lot of dependencies so it will take a bit more time...
I'm coming a bit late to the discussion, sorry. "Graph" and "Process" (user defined or predefined) are different concepts. The graph as such is just the part of the process that tells the back-end what it has to do. Process is more, it contains also meta data about the graph like name, description, etc.
To add the meta data to the graph you have to create a process, hence create_user_process
. Usually that process is not really of interest for the user, because it should go straight to the back-end. But it seems that there is a bigger use case for that and I see what I can do. The best option would be indeed the webeditor components embedded in the RStudio viewer.
How is the python client handling this at the moment? @m-mohr @clausmichele
I've opened an issue a couple of days ago exactly about this for the python client https://github.com/Open-EO/openeo-python-client/issues/209
Both R client and Python client still work a bit like in API v0.4 and it seems both never fully adopted the new process schema internally. In 0.4 there was only process graphs without metadata, but a process graph without any metadata and without the wrapper is pretty much useless in API 1.0, so the clients should adopt that structure.
Just an update on this topic to pick up when we continue. On the create_user_process
version there are square brackets added where there shouldn't be any.
Process graph definition:
data = p$load_collection(id = collection,
spatial_extent = bbox,
temporal_extent = time_range,
bands = bands)
result = p$save_result(data = data, format = "NetCDF")
as()
ouptut:
process_graph = as(result, "Graph")
{
"load_collection_GXORC7003Z": {
"process_id": "load_collection",
"arguments": {
"id": "SENTINEL2_L1C_SENTINELHUB",
"spatial_extent": {
"west": 13.0974,
"east": 13.1003,
"south": 47.8242,
"north": 47.8257
},
"temporal_extent": [
"2018-01-01T00:00:00.000Z",
"2018-12-31T00:00:00.000Z"
],
"bands": [
"B04"
]
}
},
"save_result_SPVBN1400V": {
"process_id": "save_result",
"arguments": {
"data": {
"from_node": "load_collection_GXORC7003Z"
},
"format": "NetCDF",
"options": {}
},
"result": true
}
}
create_user_process()
output:
graph_info = create_user_process(result,
id = result$id,
submit = FALSE)
print(jsonlite::toJSON(graph_info, pretty = TRUE))
{
"id": {},
"process_graph": {
"load_collection_GXORC7003Z": {
"process_id": ["load_collection"],
"arguments": {
"id": ["SENTINEL2_L1C_SENTINELHUB"],
"spatial_extent": {
"west": [13.0974],
"east": [13.1003],
"south": [47.8242],
"north": [47.8257]
},
"temporal_extent": [
["2018-01-01T00:00:00.000Z"],
["2018-12-31T00:00:00.000Z"]
],
"bands": [
["B04"]
]
}
},
"save_result_SPVBN1400V": {
"process_id": ["save_result"],
"arguments": {
"data": {
"from_node": ["load_collection_GXORC7003Z"]
},
"format": ["NetCDF"],
"options": {}
},
"result": [true]
}
},
"parameters": [],
"returns": {
"schema": {
"type": ["boolean"],
"subtype": [],
"pattern": [],
"parameters": [],
"items": {
"type": {}
},
"minItems": [],
"maxItems": []
}
}
}
@przell, you can use the parameter auto_unbox=TRUE
to get rid of the brackets, when using toJSON
As for now, I have implemented the as(x,"Process")
function to create a simple user defined process representation. The returned object is of type "Process", but the underlying structure of predefined and user defined process are the same. You can print that object now to print the JSON object.
As a deprecation note: now that we can obtained a simple UDP on the client by coercion, we don't necessarily need the parameter submit
in create_user_process
any longer. It will still work, but I have added a note, when using the function to obtain a Process object.
Great! Could we also deprecate (and print a warning?) for as(x,"Graph")
?
I advise against it, because i use this internally to coerce into the process graph part of a UDP. I cannot differentiate if the user called the function explicitly or if it was called from an internal function, which would be annoying to read. However, I removed examples in the code documentation which used as(x,"Graph") and pointed out that this might be revoked from the function export of the package (087c9f3).
Okay, then a warning is not a good idea, but you could still deprecate it in the docs, right?
Kind of. I haven't found any option to denote deprecation in the documentation. That is, what I did https://github.com/Open-EO/openeo-r-client/blob/087c9f3a907c4d8714ee3594d219e52c72c1f94f/man/as.Graph.Rd#L29
I removed printing the process graph in UDPs and services. Instead I advertise / encourage the user to coerce it into a Process. Printing the process will then of course show a valid process graph. I hope that I have eliminated now all potential misuses. If not feel free to open this issue again or create a new one ;)
In general, but especially also in the QGIS plugin and the Web Editor, openEO processes in JSON need to be fully defined including their potential metadata (see example 1), i.e. it is not enough to just list the graph nodes (see example 2). Unfortunately, the R client and Python client export the graphs without their metadata. This leads reportedly to some confusion when users try to copy/paste the "incomplete" processes into the Web Editor, QGIS Plugin or the Hub for example, which then can't handle them properly. Therefore, I'd really like to see that the R and Python clients print out the full process (see example 1).
Example 1:
Example 2: