Open jdries opened 3 months ago
The problem is that there's a mismatch between a synchronous request (in which you want to return the OpenEO-Costs header) and something asynchronous like a SparkListener.
Perhaps we can skip the listener and just do a REST API request before returning the result. (By the time the result is ready, the job has finished anyway, so we don't really need a listener for that, and I guess we also know the request id at that point.)
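To make the "REST request before returning" idea concrete, here is a minimal sketch. It assumes the Spark monitoring API exposes `jobGroup` and `stageIds` on each job and `executorCpuTime` (in nanoseconds, as far as I know) on each stage attempt. `costs_header`, `fetch_json` and `credits_per_cpu_second` are hypothetical names, and the conversion rate is a placeholder, not an actual billing rule.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch and decode one JSON document from the Spark REST API."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def costs_header(api_base, app_id, request_id, get=fetch_json,
                 credits_per_cpu_second=1.0):
    """Build an OpenEO-Costs header for a finished sync request.

    Meant to be called right before the response is returned, when all
    Spark jobs tagged with this request id should already be done.
    """
    jobs = get(f"{api_base}/applications/{app_id}/jobs")
    # Collect the stages of every job whose jobGroup is this request id.
    stage_ids = {
        stage_id
        for job in jobs
        if request_id and job.get("jobGroup") == request_id
        for stage_id in job.get("stageIds", [])
    }
    cpu_ns = 0
    for stage_id in sorted(stage_ids):
        attempts = get(
            f"{api_base}/applications/{app_id}/stages/{stage_id}?details=false"
        )
        # Assumption: the endpoint returns a list of stage attempts;
        # tolerate a single object as well.
        if isinstance(attempts, dict):
            attempts = [attempts]
        cpu_ns += sum(a.get("executorCpuTime", 0) for a in attempts)
    # Hypothetical conversion from CPU seconds to credits.
    credits = cpu_ns / 1e9 * credits_per_cpu_second
    return {"OpenEO-Costs": f"{credits:.4f}"}
```

The `get` parameter is only there so the lookup can be tried against canned JSON without a live Spark UI.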
Makes me wonder if there's a Scala API we can use and do away with the REST call (it only makes sense that the latter is implemented in terms of the former). :thinking:
Yeah, I also searched for that Scala API quite a bit. If it exists, it's probably going to be internal Spark API...
(putting some keywords here so I can more easily find this ticket next time :smile: ETL API credit credits billing)
We currently cannot accurately measure usage of sync requests. Via a detour, however, we might get there: Spark has a JSON API; example requests are below. I already configured jobGroup to point to the request id (not always filled in, can also be empty). From there we get stage ids, and the stage JSON contains CPU time!
So we can either use a Spark listener for finished jobs, or somehow set up an external scraper for the jobs endpoint?
https://spark-ui-0.stag.warsaw.openeo.dataspace.copernicus.eu/api/v1/applications/spark-1be174b9cf8248438bf18097063c7f54/jobs/0
https://spark-ui-0.stag.warsaw.openeo.dataspace.copernicus.eu/api/v1/applications/spark-1be174b9cf8248438bf18097063c7f54/stages/4?details=false
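
A rough sketch of the jobGroup → stage ids → CPU time lookup on already-downloaded JSON, for whichever of the two options ends up being used. It assumes job objects carry `jobGroup` and `stageIds`, and stage objects carry `stageId` and `executorCpuTime` (nanoseconds, as far as I know). The function name is made up for illustration.

```python
def cpu_seconds_for_request(jobs, stages, request_id):
    """Sum executor CPU time over the stages of all jobs tagged with request_id.

    jobs:   decoded JSON list from the jobs endpoint
    stages: decoded JSON list of stage attempts
    """
    if not request_id:
        # jobGroup is not always filled in, so an empty id matches nothing.
        return 0.0
    stage_ids = {
        stage_id
        for job in jobs
        if job.get("jobGroup") == request_id
        for stage_id in job.get("stageIds", [])
    }
    # Retried stages show up as several attempts with the same stageId;
    # all of them are counted, since they all consumed CPU.
    cpu_ns = sum(
        stage.get("executorCpuTime", 0)
        for stage in stages
        if stage.get("stageId") in stage_ids
    )
    return cpu_ns / 1e9
```

Jobs without a `jobGroup` are simply ignored here, so usage for those requests would still go unmeasured.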