ShaojieJiang opened this issue 2 weeks ago
Thank you Shaojie! This is indeed useful to have and a good start, we'd love to merge this once it's in shape!
Something I'm not too sure about: do you really need the history injection? E.g., in your `BlackboxLLMWithHistory` class, couldn't you directly pass a list of messages (including the variable at hand and the history) in the forward pass to `_generate_from_multiple_input`, without using this new history-injection function?
This is effectively what you are doing under the hood, but I think you could just extend the forward pass of `LLMCall` to pass not only the variable but also the history as a list to `_generate_from_multiple_input`.
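The shape of that alternative could be sketched in plain Python along these lines (toy stand-ins only; this is not the actual TextGrad `LLMCall` / `_generate_from_multiple_input` API):

```python
# Hedged sketch of the suggested alternative: rather than injecting history as
# a separate mutation step, extend the forward call so the history is passed
# alongside the variable in a single list. Class and method names here are
# illustrative stand-ins for the real TextGrad internals.

class ToyLLMCall:
    def _generate_from_multiple_inputs(self, inputs):
        # Stand-in for the engine call: just concatenate the inputs.
        return " | ".join(inputs)

    def forward(self, variable, history=None):
        # Pass the prior history plus the current variable as one list.
        inputs = list(history or []) + [variable]
        return self._generate_from_multiple_inputs(inputs)

call = ToyLLMCall()
result = call.forward("current question", history=["turn 1", "turn 2"])
```

The point of the sketch is only that the history travels through the forward pass itself, so no separate injection step is needed.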
Hi @mertyg, thanks for your swift response!
Good point! My short answer is that history injection is easier to implement.
I've thought about a more proper implementation. I considered passing the history as the content of a `tg.Variable`. However, `Variable` only accepts `str` (for text) and `bytes` (for other modalities?). It might work if I just set a list as the content, but that's too hacky, and a proper implementation would require subclassing `Variable`, which in turn requires a better understanding of `Variable`, especially of how gradients are handled. And that's not the end of the story. If we go down this path, consider this not-so-edgy case: if I pass a list of `Variable`s as the input, then I need to modify `LLMCall`, which again is more complex to implement and has broader implications.
Taking these factors into account, history injection makes it much easier to meet my needs.
Hi TextGrad developers,
First of all, thanks a lot for this great work!
I wonder whether you have any plans to allow specifying a chat history before LLM calls. Below is some context explaining why this is important to me.
My task
Optimise the system prompt so that the chatbot behaves in a more controlled way as an interviewer. Therefore, the objective is not only to get a single good answer but also good responses over several turns of conversation.
Why can't I put the chat history in the prompt?
Most LLMs would understand chat history reasonably well if I pasted it into the prompt. However, to rule out any misalignment with inference-time behavior, it's better to supply the history in each LLM's idiosyncratic format, e.g., a list of dicts for OpenAI models.
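For OpenAI-style chat APIs, that idiosyncratic format is a list of role/content dicts. A minimal sketch of assembling such a list (the helper name `build_messages` is illustrative, not part of any library):

```python
def build_messages(system_prompt, history, new_user_message):
    """Assemble an OpenAI-style message list: system prompt, prior turns, new turn."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)  # prior turns as alternating user/assistant dicts
    messages.append({"role": "user", "content": new_user_message})
    return messages

history = [
    {"role": "user", "content": "Hi, I'm here for the interview."},
    {"role": "assistant", "content": "Welcome! Tell me about your background."},
]
msgs = build_messages("You are a structured interviewer.", history, "I studied NLP.")
# msgs now holds four entries: the system prompt, two history turns, and the new user turn
```

Passing such a list directly to the model, rather than flattening it into one prompt string, keeps evaluation aligned with how the model will actually be used at inference time.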
MWE of my solution
Below is my solution. Although it works, it looks a bit hacky, so I'm calling it "history injection". Looking forward to your comments on a more proper implementation.
Best regards, Shaojie Jiang