philipmit / thread

7 stars 2 forks source link

Why you did not use Tool Calling ? #1

Open ManuelRios18 opened 1 month ago

ManuelRios18 commented 1 month ago

Hello, first of all thanks for open sourcing your code!

We notice you did not use tool calling in your implementation. Is there a reason for that? Have you experimented with this?

Your response will be a huge help to guide our next steps in a project my team is currently working on.

Thanks.

philipmit commented 1 month ago

hi! did you have something specific in mind? In general, THREAD offers greater flexibility than standard tool calling since the recursive calls can involve any arbitrary sub-task specification for a child thread to complete. Our goal was to implement it in the simplest way possible, but would love to hear any thoughts you may have!

ManuelRios18 commented 2 weeks ago

Hello, Maybe I was not clear enough in my previous post, I apologize for that. What I mean is that we noticed that in your implementation when you use chat models (e.g,gpt-4) you guys use this models as completion models.

In Thread you rely on specific stop tokens (e.g,w_listen) to handle the call to new child threads. Once the generation of a parent thread stops, you call the child and append its response to the text generated by its parent (Y). Then the parent thread continues with the generation which now has the response from its child included.

In our opinion, this perfectly fits with a text completion task making models like gpt-3.5-turbo-instruct and text-davinci-002 suitable for this strategy. However, when using chat models the paradigm changes. In this case, we expect the model to receive interleaving messages and respond with an assistant message. In your implementation this is not the case for chat models, the model is always seeing a single user message with a huge prompt containing all examples and the instruction we expect the model to complete.

We consider that a more suitable way to implement this in chat models is using tool calling, teaching the model that it can call a child thread by using a special tool. In this way, when the model uses this tool, the generation stops and waits for the tool response which is perfectly compatible with Thread strategy.

Do you agree with us?