clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

[framework] remove special handling of programmatic Players #60

Open phisad opened 4 months ago

phisad commented 4 months ago

Now that we have a proper Model representation, a programmatic player should simply be given a CustomResponseModel that implements the generate method, instead of switching on the model type inside the Player's call to decide which method to invoke. Then we can get rid of this bit:

    def __call__(self, messages: List[Dict], turn_idx) -> Tuple[Any, Any, str]:
        call_start = datetime.now()
        if isinstance(self.model, CustomResponseModel):
            prompt, response, response_text = messages, {"response": "programmatic"}, \
                self._custom_response(messages, turn_idx)
        elif isinstance(self.model, HumanModel):
            prompt, response, response_text = messages, {"response": "human"}, \
                self._terminal_response(messages, turn_idx)
        else:
            prompt, response, response_text = self.model.generate_response(messages)
        call_duration = datetime.now() - call_start
        response["duration"] = str(call_duration)
        return prompt, response, response_text

becomes

    def __call__(self, messages: List[Dict], turn_idx) -> Tuple[Any, Any, str]:
        call_start = datetime.now()
        prompt, response, response_text = self.model.generate_response(messages)
        call_duration = datetime.now() - call_start
        response["duration"] = str(call_duration)
        return prompt, response, response_text
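To keep the old special-case payload (`{"response": "programmatic"}`) after the switch is removed, the CustomResponseModel itself could produce the full triple. A minimal sketch, assuming a hypothetical `custom_response` hook name (not existing framework API):

```python
from typing import Any, Dict, List, Tuple


class Model:
    """Minimal stand-in for the framework's Model base class."""

    def generate_response(self, messages: List[Dict]) -> Tuple[Any, Any, str]:
        raise NotImplementedError


class CustomResponseModel(Model):
    """Produces the same (prompt, response, response_text) triple as
    API-backed models, preserving the old programmatic metadata."""

    def generate_response(self, messages: List[Dict]) -> Tuple[Any, Any, str]:
        response_text = self.custom_response(messages)
        # mirror the old special case: prompt = messages, fixed metadata dict
        return messages, {"response": "programmatic"}, response_text

    def custom_response(self, messages: List[Dict]) -> str:
        """Override point for programmatic players (hypothetical hook)."""
        raise NotImplementedError
```

Subclasses then only override `custom_response`, and callers see a uniform `generate_response` interface.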

This change is breaking: implementors need to move their _custom_response logic into the generate_response method of a custom response model, e.g.

class Questioner(CustomResponseModel):
    """Programmatic realisation of the Questioner player."""
    def __init__(self, exp_name: str, max_turns: int, question_order: List[str], requests: Dict[str, int]):
        super().__init__()
        request_strings = load_json(REQUESTS_PATH.format(exp_name), GAME_NAME)
        self.max_turns = max_turns
        self.question_order = question_order
        self.requests = requests
        self.request_strings = request_strings

    def generate_response(self, messages: Any) -> str:
        """Return the request utterance for a given turn."""
        # NOTE: turn_idx is not yet available here (blocked by #39, see below)
        if turn_idx >= self.max_turns:
            raise IndexError("Maximum turns already reached!")
        question_type = self.question_order[turn_idx]
        request_idx = self.requests[question_type]
        return self.request_strings[question_type][request_idx]

and Player would be a simple wrapper around the Model

self.questioner = Player.of(Questioner())
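A sketch of what that wrapper could look like; `Player.of` is a hypothetical factory, not existing framework API:

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


class Player:
    """Thin wrapper that delegates every call to the underlying Model."""

    def __init__(self, model: Any):
        self.model = model

    @classmethod
    def of(cls, model: Any) -> "Player":
        """Hypothetical convenience factory wrapping a Model in a Player."""
        return cls(model)

    def __call__(self, messages: List[Dict]) -> Tuple[Any, Any, str]:
        call_start = datetime.now()
        prompt, response, response_text = self.model.generate_response(messages)
        # timing stays in the wrapper, identical for all model kinds
        response["duration"] = str(datetime.now() - call_start)
        return prompt, response, response_text
```

With this, the Player no longer needs to know whether the model is API-backed, programmatic, or human.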

This requires that turn_idx be available to generate_response; thus this is blocked by #39.
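Until #39 lands, one possible workaround would be to infer the turn index from the message history itself. This assumes one assistant message per completed turn, which is an assumption and not the framework's definition of a turn:

```python
from typing import Dict, List


def turn_from_messages(messages: List[Dict]) -> int:
    """Infer the current turn index by counting assistant messages
    already in the history (assumption: one per completed turn)."""
    return sum(1 for m in messages if m.get("role") == "assistant")
```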