UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://inspect.ai-safety-institute.org.uk/
MIT License

should solver use original input or modified input? #17

Closed Lovkush-A closed 4 months ago

Lovkush-A commented 4 months ago

When trying the hello world example, I see that the solver uses the input modified by chain-of-thought:

[Screenshot from the metadata tab; image not recovered]

Question: should the solver be using the original raw input or the modified input?

Lovkush-A commented 4 months ago

My instinct is the solver should only use the original raw input.

However, based on the code, it is not possible to access the original raw input, because the score function only has access to state, not the original inputs. Hence, I assume this was an intentional design choice. Regardless, would be interested in other people's thoughts!

    async def score(state: TaskState, target: Target) -> Score:
        # format the scoring template
        score_prompt = grading_template.format(
            question=state.input_text,
            answer=state.output.completion,
            criterion=target.text,
            instructions=instructions,
        )

sdtblckgov commented 4 months ago

Hi @Lovkush-A !

If you ever need it, you can recover the original input from the task state at any point by calling state.input_text. You can see that we are accessing it to get the original question in the example code above.

In the hello world example, we are actually using a list of solvers which we call a Plan, see here.

A plan is essentially a list of Solvers which run in sequence and each modify the task state. So, the very first solver in the plan (chain_of_thought) will indeed take in the original raw input, and modify the state in some way (in this case, it just adds a suffix to the original prompt encouraging step-by-step thinking). The state will then be passed to the second solver, which will use and modify it and pass it to the third, etc.
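As a rough illustration of the sequencing described above, here is a minimal, self-contained sketch. It deliberately does not use inspect_ai: `ToyState`, `toy_chain_of_thought`, `toy_generate`, and `run_plan` are hypothetical stand-ins, and the real solvers are async and take more arguments than this.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyState:
    """Hypothetical stand-in for a task state (not inspect_ai's TaskState)."""
    prompt: str
    output: str = ""
    history: List[str] = field(default_factory=list)


# A toy solver takes the state, modifies it, and hands it on.
ToySolver = Callable[[ToyState], ToyState]


def toy_chain_of_thought(state: ToyState) -> ToyState:
    # Adds a suffix to the prompt encouraging step-by-step thinking.
    state.prompt = state.prompt + "\nThink step by step."
    state.history.append("chain_of_thought")
    return state


def toy_generate(state: ToyState) -> ToyState:
    # Pretend model call: it only ever sees the prompt as left by earlier solvers.
    state.output = f"(model answer to: {state.prompt!r})"
    state.history.append("generate")
    return state


def run_plan(plan: List[ToySolver], state: ToyState) -> ToyState:
    # Run each solver in order, passing the (possibly modified) state along.
    for solver in plan:
        state = solver(state)
    return state


final_state = run_plan(
    [toy_chain_of_thought, toy_generate],
    ToyState(prompt="What is 2 + 2?"),
)
```

After the plan runs, `final_state.prompt` carries the suffix added by the first solver, and anything that reads the prompt later sees that modified text.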

We are not prescriptive about how exactly you use or modify the task state, and so your solvers can either use the original or modified inputs at any point in the computation.

Hope this is helpful!

Lovkush-A commented 4 months ago

Thanks for the detailed response!

> If you ever need it, you can recover the original input from the task state at any point by calling state.input_text.

When you say 'original input', do you mean the unmodified input from the dataset or the input modified by the solver?

> Hope this is helpful!

Yes it was, thanks! Good to make clear what the solver was doing in the hello world example.

However, I do not think my main question was answered: do we want the scorer to have access to the unmodified input directly from the dataset or the scorer to access the input modified by the solver? My gut feeling is that the scorer should use the unmodified input, but it looks like scorers use the modified inputs instead.

If my question or anything else I am saying is unclear, please let me know! Happy to try to rephrase or use specific examples to make the question clearer.

sdtblckgov commented 4 months ago

state.input should contain the unmodified input from the dataset as chat messages. state.input_text should contain the unmodified input cast to a string.

> do we want the scorer to have access to the unmodified input directly from the dataset or the scorer to access the input modified by the solver?

It already does have access to both! Wrt what we want, again, we are not prescriptive about how you use solvers and state. Scorers just take in the state and the target, and return a score. You're free to use any part of the state, whether it be the unmodified input (state.input) or something generated by a solver.

Lovkush-A commented 4 months ago

Great! Thanks so much for explaining.

In the screenshot at the top, taken from the metadata tab, the '[Question]' has the modified input (with chain of thought), not the original input.

Question: in the hello world example, is the scorer using the original prompt or the modified prompt?

If the modified prompt, then I will need to spend more time understanding the codebase, because my current understanding is that the scorer for the hello world example uses state.input_text for the question.

If the original prompt, does that mean the view is showing the incorrect text? If yes, then I will close this issue and create a fresh one isolating the actual issue!

sdtblckgov commented 4 months ago

Ah, you're right, since the chain_of_thought solver modifies the chat messages, and input_text reads from the chat messages, the scorer will be using a modified prompt in this example, as you see in the logs.

If you wanted to ensure you retain the original input in a way that is accessible at all points throughout task computation, you could write a simple solver to put before chain_of_thought in your plan that clones the original input into, for example, the metadata field.
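A minimal sketch of that workaround, again using toy stand-ins rather than inspect_ai itself. `ToyState`, `preserve_original_input`, and the `"original_input"` metadata key are all hypothetical names chosen for illustration; a real solver would be async and operate on the actual task state.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToyState:
    """Hypothetical stand-in for a task state with a metadata dict."""
    prompt: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def preserve_original_input(state: ToyState) -> ToyState:
    # Copy the prompt into metadata before any other solver touches it.
    # setdefault keeps the first recorded value if this solver runs twice.
    state.metadata.setdefault("original_input", state.prompt)
    return state


def toy_chain_of_thought(state: ToyState) -> ToyState:
    # Later solver that rewrites the prompt.
    state.prompt += "\nThink step by step."
    return state


state = ToyState(prompt="What is 2 + 2?")
# Order matters: the preserving solver must come first in the plan.
for solver in (preserve_original_input, toy_chain_of_thought):
    state = solver(state)
```

A scorer written against this toy state could then read `state.metadata["original_input"]` when it wants the dataset's text and `state.prompt` when it wants whatever the model actually saw.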

Lovkush-A commented 4 months ago

Thanks again for the help and explanations. Reassuring to know I have not fundamentally misunderstood something!

So now the question is, what do we want the default behaviour to be? Right now the default is that the scorer uses the modified inputs, and most users are likely to stick to this default. I personally think this is not the intuitive default (but I'm also very new to evals). What is your/the team's feeling on what the default ought to be? Might be something we have to wait to hear from more users on, too.

sdtblckgov commented 4 months ago

Since the chain of thought modification does actually change the form of the ideal model response, I think it's a fair default to pass the modified prompt to the scorer instead of the original one.

But I don't claim to be speaking for AISI or inspect as a whole here. I think in general, the position inspect is taking is to not be too prescriptive about how specific evals are implemented, and provide a framework where the user has a lot of flexibility to implement them as they choose.

Lovkush-A commented 4 months ago

Thanks for your thoughts on all this! Interesting discussion.

Agreed that the framework should not be opinionated and should let users decide. But in this instance I believe:

This is just my belief - happy to be proved incorrect in both cases!

sdtblckgov commented 4 months ago

I think for now, this is fine as a default. I think it is expected behaviour under normal assumptions.

You're correct that ensuring the user has access to a non-modified version of the original inputs at all times is a little fiddlier than it could be, and might be a nice feature.

Will look into adding a facility like this to the TaskState in future versions and get back to you.

aisi-inspect commented 4 months ago

I think this might actually be a bug in that input and input_text are supposed to be reliable ways to get to the original sample input. I think the else clause of input_text should actually be looking at self._input rather than self.messages:

https://github.com/UKGovernmentBEIS/inspect_ai/blob/main/src/inspect_ai/solver/_solver.py#L90-L98

Note that in these docs we actually claim that input_text is guaranteed to be the user's original input (look past the code snippet to the commentary below): https://ukgovernmentbeis.github.io/inspect_ai/scorers.html#example-model-grading
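To make the difference concrete, here is a toy reconstruction of the behaviour being described. It is not a copy of the linked source: `ToyTaskState`, `ToyMessage`, and the two property names are hypothetical, and the shape of the property (a string branch plus an else branch that picks the first user message) is an assumption inferred from the comment above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ToyMessage:
    """Hypothetical chat message; frozen so the original input cannot be mutated."""
    role: str
    text: str


def first_user_text(messages: List[ToyMessage]) -> str:
    # Text of the first user message, or "" if there is none.
    return next((m.text for m in messages if m.role == "user"), "")


class ToyTaskState:
    def __init__(self, input: Union[str, List[ToyMessage]]) -> None:
        self._input = input  # original sample input, never rewritten
        # Working copy of the conversation that solvers are free to rewrite.
        if isinstance(input, str):
            self.messages = [ToyMessage("user", input)]
        else:
            self.messages = list(input)

    @property
    def input_text_before(self) -> str:
        # Assumed pre-fix behaviour: the else branch reads the mutable messages.
        if isinstance(self._input, str):
            return self._input
        return first_user_text(self.messages)

    @property
    def input_text_after(self) -> str:
        # Proposed behaviour: the else branch reads the stored original input.
        if isinstance(self._input, str):
            return self._input
        return first_user_text(self._input)


state = ToyTaskState([ToyMessage("user", "What is 2 + 2?")])
# Simulate a chain-of-thought style solver rewriting the user message.
state.messages[0] = ToyMessage("user", state.messages[0].text + "\nThink step by step.")
```

In this toy, `input_text_before` returns the rewritten prompt while `input_text_after` still returns the original question, which is the distinction the proposed change is about.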

@sdtblckgov I will put up a PR to fix this and tag you as a reviewer.

aisi-inspect commented 4 months ago

Bug addressed here: https://github.com/UKGovernmentBEIS/inspect_ai/commit/140c3d63ad42125213bffaa799147135c81857f3