Closed by evaline-ju 1 month ago
Hello, should results be sorted by the order detectors are listed?
Consider the example below.
Sample request:
{
  "model_id": "my_awesome_llm",
  "prompt": "What is the capital of Brazil?",
  "detectors": {
    "answer_relevance": {},
    "answer_correctness": {}
  }
}
Response:
{
  "generated_text": "The capital of Brazil is Rio de Janeiro.",
  "detections": [
    {
      "detection_type": "correctness",
      "detection": "answer_correctness",
      "score": 0.00027292477898299694
    },
    {
      "detection_type": "relevance",
      "detection": "answer_relevance",
      "score": 0.9994032433605752
    }
  ],
  "input_token_count": 7
}
The order in which detectors are listed in the request is answer_relevance, then answer_correctness, but the response returns detections in the reverse order.
Is this acceptable?
Some context (unnecessary for the question): answer_relevance checks whether the model answers what was asked, while answer_correctness checks whether the answer is factually correct. In this case, the model did answer the question asked (relevance), but the answer itself was wrong (correctness), since the capital of Brazil is Brasilia.
I think this is fine for now, considering that for other existing endpoints we do not specifically order detection results by the detector order in the map, since users can tell which score corresponds to which requested detector via detection/detection_type. If users find this a limitation, it could be addressed in the future. Other maintainers @gkumbhat, @declark1 can feel free to chime in.
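To illustrate the point above: a client can recover the requested order on its own side by indexing detections on the detection field. This is a minimal sketch using the response from the example above; the response shape beyond those fields is an assumption.

```python
# Hypothetical client-side sketch: match returned detections back to the
# detectors that were requested, regardless of response order. Field names
# ("detection", "score") follow the JSON example above.

requested_detectors = ["answer_relevance", "answer_correctness"]

response = {
    "generated_text": "The capital of Brazil is Rio de Janeiro.",
    "detections": [
        {"detection_type": "correctness", "detection": "answer_correctness",
         "score": 0.00027292477898299694},
        {"detection_type": "relevance", "detection": "answer_relevance",
         "score": 0.9994032433605752},
    ],
}

# Index detections by their "detection" name, then read them out in the
# order the detectors were requested.
by_name = {d["detection"]: d for d in response["detections"]}
ordered = [by_name[name] for name in requested_detectors]
```

With this, ordered[0] is always the answer_relevance result and ordered[1] the answer_correctness result, whatever order the server returned them in.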
Description
As an orchestrator user, I want to be able to use orchestrator to call detectors that require both user prompts and generated text from LLMs, so that I can use detectors in my pipeline that may require both as input.
This /api/v1/text/task/generation-detection endpoint is expected to work with the Detector API /text/generation endpoint, which primarily needs the user prompt and the generated text from the LLM (model_id) provided in the Orchestrator API.
Discussion
Since the generated text needs to be present for the detector, we will assume only unary calls to the LLM for now (no streaming; otherwise aggregating and awaiting the generation stream would have to happen anyway before the detector could run).
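The unary flow described above can be sketched as follows. generate() and detect() are hypothetical stand-ins for the LLM and Detector API calls, not real orchestrator functions; the point is only the ordering constraint.

```python
# Hypothetical sketch of the unary generation-then-detection flow.

def generate(model_id: str, prompt: str) -> str:
    # Stand-in for a unary (non-streaming) LLM generation call.
    return "The capital of Brazil is Rio de Janeiro."

def detect(detector: str, prompt: str, generated_text: str) -> dict:
    # Stand-in for a Detector API call that needs both the prompt
    # and the generated text as input.
    return {"detection": detector, "score": 0.5}

def generation_detection(model_id: str, prompt: str, detectors: list) -> dict:
    # Generation must fully complete first, because every detector needs
    # the whole generated text (hence unary calls only, no streaming).
    generated = generate(model_id, prompt)
    detections = [detect(d, prompt, generated) for d in detectors]
    return {"generated_text": generated, "detections": detections}

result = generation_detection(
    "my_awesome_llm",
    "What is the capital of Brazil?",
    ["answer_relevance", "answer_correctness"],
)
```

Streaming support would require buffering the full generation stream before invoking any detector, which is why it is deferred here.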
Acceptance Criteria