Communicate log info - Githubissues

m-mohr commented 4 years ago

What seems to be missing from the communication protocol between the API instance and the UDF server is a way to send log information so that it can be shown in the API logging endpoints. I think we need to add this.

cc @flahn

flahn commented 4 years ago

First, in my opinion the feature would be really helpful, no question. But I have some doubts and remarks regarding prior design choices and performance. I will try to list my thoughts on that:

The UDF API is designed to be synchronous. This means that the UDF encapsulates an atomic transaction, like performing a single custom function on a data chunk. With that we can scale the UDF services at the back-end and potentially run many instances in parallel. Logging introduces quite some overhead here: you need to know how to store the logging information, where to store it, how to access it and how to relate it to a particular process (and data chunk).
We would need to introduce asynchronous processing where a UDF job is created and can be referenced by ID. The back-end needs to poll for results and/or for logging information.
We would need some configuration parameter at the UDF service to toggle the logging. Consider high-performance, low-cost computing vs. developing UDFs. This can be done with the current API via user or server context.
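To make the asynchronous variant above more concrete, here is a minimal in-memory sketch in Python. It is not part of the UDF API: `UdfJobStore`, `submit`, `poll`, `wait_for` and the `logging_enabled` flag are all hypothetical names, and a real service would need persistent storage and an HTTP layer on top.

```python
import threading
import time
import uuid


class UdfJobStore:
    """Hypothetical in-memory registry of UDF jobs, keyed by job ID."""

    def __init__(self, logging_enabled=True):
        # Toggle for the logging overhead (assumed to come from server context).
        self.logging_enabled = logging_enabled
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, udf, chunk):
        """Register a job, start it in a background thread, return its ID."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {"status": "running", "result": None, "logs": []}
        threading.Thread(target=self._run, args=(job_id, udf, chunk)).start()
        return job_id

    def _log(self, job_id, message):
        if self.logging_enabled:
            with self._lock:
                self._jobs[job_id]["logs"].append(message)

    def _run(self, job_id, udf, chunk):
        self._log(job_id, "started")
        try:
            result, status = udf(chunk), "finished"
        except Exception as exc:
            result, status = None, "error"
            self._log(job_id, f"{type(exc).__name__}: {exc}")
        with self._lock:
            self._jobs[job_id].update(result=result, status=status)

    def poll(self, job_id):
        """Return a snapshot of status, result and the logs relating to this job."""
        with self._lock:
            job = self._jobs[job_id]
            return {"status": job["status"], "result": job["result"],
                    "logs": list(job["logs"])}


def wait_for(store, job_id, timeout=5.0, interval=0.01):
    """Back-end side: poll until the job leaves the 'running' state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        snapshot = store.poll(job_id)
        if snapshot["status"] != "running":
            return snapshot
        time.sleep(interval)
    raise TimeoutError(f"UDF job {job_id} did not finish in {timeout}s")
```

Keying the logs by job ID is what relates them to a particular process and data chunk; with `logging_enabled=False` the store still tracks status and results but records no log messages.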

For the R UDF service this is close to impossible, because the plumber web service only supports a single thread, as does R in general unless certain packages are used. Data processing can be parallelized, but the web service's request handling cannot. This means logging information is only available after the computation is done and the service is not blocked by the next computation request. Even then, relating a log to its computation remains a problem.

m-mohr commented 4 years ago

We discussed this a bit yesterday, and the conclusion basically was that without more error handling info developing a UDF is nearly impossible or at least very difficult for a user and thus UDFs would not get adopted widely. The issue mentioned is that at the moment for example Wageningen always has to contact the provider to ask what went wrong with their UDF because they don't get any info except that it did not work. That doesn't scale (in terms of number of users). So I think that needs to be addressed in the design of UDFs and might require some redesigns long-term (not saying it must be done now).

I'm not deep into R, but would think it might be possible to catch errors/exceptions for the UDF code and then return it to the back-end instead of the data response. That can be synchronous. Something like:

Back-end sends request to UDF server
UDF server accepts it and runs it like this (JS inspired): try { server.send(runUdfCode()); } catch (e) { server.send(e); }
Back-end receives either valid data or an error in JSON format.
If it's an error, use it for the logs.

That would at least capture errors. At some point it would be good to also transmit more log information with the data response, but one step at a time... ;-)
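The synchronous flow listed above could look roughly like this in Python. This is only a sketch: the function names and the `status`/`error` envelope are made up for illustration and are not defined by the UDF API.

```python
import json
import traceback


def handle_udf_request(udf, data):
    """UDF-server side: always answer, with either the data or an error."""
    try:
        return json.dumps({"status": "ok", "data": udf(data)})
    except Exception as exc:
        # Hypothetical error envelope; the field names are assumptions.
        return json.dumps({
            "status": "error",
            "error": {
                "type": type(exc).__name__,
                "message": str(exc),
                "traceback": traceback.format_exc(),
            },
        })


def process_udf_response(body, job_log):
    """Back-end side: return the data, or record the error in the job log."""
    response = json.loads(body)
    if response["status"] == "error":
        err = response["error"]
        job_log.append({
            "level": "error",
            "message": f"{err['type']}: {err['message']}",
        })
        return None
    return response["data"]
```

Because the error travels in the same response as the data would have, the exchange stays synchronous and no job ID or polling is needed.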

m-mohr commented 4 years ago

This is openeo-udf, thus my intention was to clarify (and improve) this for the UDF API, not necessarily speaking about any of the UDF API implementations.

jdries commented 3 years ago

Some ideas:

we started working on client side udf debugging. Ensuring that a UDF is at least valid python and can run is an important step.
In case of an error, the udf will throw an exception, and the backends can propagate that exception in the (batch job) logs, so there is a way for users to know what went wrong.
what we don't really support yet is print statements, we can however also return these through /logs, which would give even more information
Perhaps the UDF server can also get support for returning logs printed to stdout? Other option would be to actually pass in a python logger object, that the user can use to send messages. This would be an extension to the core UDF API. (PS: on holiday, so will probably not reply in the short term)
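As an illustration of the stdout idea, the sketch below captures whatever a UDF prints while it runs. `run_udf_capturing_stdout` is a hypothetical helper, not something an existing UDF server provides; it only assumes that the UDF is a plain Python callable.

```python
import contextlib
import io


def run_udf_capturing_stdout(udf, data):
    """Run the UDF and return its result plus the lines it printed."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the duration of the block.
    with contextlib.redirect_stdout(buffer):
        result = udf(data)
    return result, buffer.getvalue().splitlines()
```

The returned lines could then be forwarded to the back-end and exposed through /logs. Note that `contextlib.redirect_stdout` replaces `sys.stdout` for the whole process, so this approach only works cleanly when one UDF runs per process at a time.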

m-mohr commented 3 years ago

> we started working on client side udf debugging. Ensuring that a UDF is at least valid python and can run is an important step.

Sounds good! @flahn Are there similar plans for R?

> In case of an error, the udf will throw an exception, and the backends can propagate that exception in the (batch job) logs, so there is a way for users to know what went wrong.

Is that somehow standardized in the API? Or is it just a 4xx/5xx error returned by the server that is simply written to the log?

> what we don't really support yet is print statements, we can however also return these through /logs, which would give even more information

> Perhaps the UDF server can also get support for returning logs printed to stdout? Other option would be to actually pass in a python logger object, that the user can use to send messages. This would be an extension to the core UDF API.

Yes, both sound good. It would be best if the logger could expose the same object structure as in the API; then it could simply be "passed" through. stdout is a simple string and shouldn't be that hard to implement, I hope?
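A logger that already produces API-shaped entries might be sketched as follows, using Python's standard `logging` module. `LogEntryHandler` and `make_udf_logger` are hypothetical names, and the `id`/`level`/`message` fields are only modelled loosely on the log entries of the API; the exact schema would have to be checked against the API specification.

```python
import logging
import uuid


class LogEntryHandler(logging.Handler):
    """Collects log records as plain dicts that can be passed through as-is."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def emit(self, record):
        # Field names are assumptions, not a confirmed API schema.
        self.entries.append({
            "id": str(len(self.entries) + 1),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        })


def make_udf_logger():
    """Create a fresh logger for one UDF run, plus the handler holding its entries."""
    # A unique name avoids sharing handlers between UDF runs.
    logger = logging.getLogger(f"udf.{uuid.uuid4()}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep UDF messages out of the server's own log
    handler = LogEntryHandler()
    logger.addHandler(handler)
    return logger, handler
```

The UDF would receive `logger` and call `logger.info(...)` or `logger.warning(...)` as usual; afterwards the server could return `handler.entries` alongside the data response.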

Open-EO / openeo-udf

Communicate log info #22