Closed m1ci closed 8 years ago
Yes, this might be useful. Then it is just question how the response should look like. Now it just returns what the last service returns. So options are:
Or maybe some other. Anyway, good idea!
IMO the first option is better from practical point of view. Wrap the response in metadata
part and content
of the "real" response.
This is implemented. If you add the request parameter stats=true
, the response is wrapped.
Example:
curl -vX POST --header "Content-Type: application/json" --header "Accept: application/json" -d "@pipeline.json" "http://api-dev.freme-project.eu/current/pipelining/chain?stats=true"
Where @pipeline.json
is a pipeline description (see the chain method for an example).
The result looks like:
{
"metadata": {
"serviceToDuration": {
"http://api-dev.freme-project.eu/current/e-entity/dbpedia-spotlight/documents":1461,
"http://api-dev.freme-project.eu/current/e-link/documents/":4299,
"http://api-dev.freme-project.eu/current/e-translation/tilde":1963
},
"totalDuration":7723
},
"content": {
"body": ...
...
}
}
Here the metadata
contains a mapping serviceToDuration
which contains how long each separate service takes to process its request. totalDuration
is the time to process the complete pipeline. Times are in milliseconds.
content
contains the original pipeline response.
If this is OK, I will post an issue to add this in the API documentation.
This also works on templates, E.g.:
curl -vX POST --header "Content-Type: text/plain" --header "Accept: application/json" -d "Brussels is the capital of Belgium." "http://api-dev.freme-project.eu/current/pipelining/chain/37?stats=true"
I like that this is displayed only via parameter "stats=true" so it is optional. How is it implemented? I think it could be done as a general filter so it is automatically enabled on every endpoint.
I like that this is displayed only via parameter "stats=true" so it is optional. How is it implemented?
Right now it is pretty hard-coded. The pipelines code returns a result with statistics anyway, and in the Broker the statistics are filtered out or kept depending on the value of the parameter. See the method pipeline() (it's not many lines of code ;) ).
I think it could be done as a general filter so it is automatically enabled on every endpoint.
Indeed. Then we also need a generic way to gather statistics...
Well actually when I think about it the general filter does not make so much sense. The execution time of a request is only of interest in the pipelines. When a user is interested in the execution time of a single request he can measure it himself and does not need the stats parameter.
"serviceToDuration" in the response is now called "serviceDurations".
@m1ci If you are OK with how it looks, can you close the issue, or comment if not?
nice work. Small remark on the naming - instead of "serviceDurations" I would call it "executionTime"
Done. Example:
{"metadata": {
"executionTime": {
"http://api.freme-project.eu/current/e-entity/dbpedia-spotlight/documents":1701,
"http://api.freme-project.eu/current/e-translation/tilde":5284
},
"totalExecutionTime":6985 },
"content":{ ... }
}
great, fine by me. If there is nothing more to discuss just go ahead and close the issue.
Just an idea for further development: It can be useful if the client get information about the time required for execution of the whole pipeline and processing time of each service. This will help to find the bottleneck in the processing.