Closed dghirardo closed 2 months ago
Before:
irb(main):003> Langchain::Tool::NewsRetriever.new(api_key: ENV["NEWS_API_KEY"]).to_openai_tools
=>
[{"type"=>"function",
"function"=>
{"name"=>"news_retriever__get_everything",
"description"=>"News Retriever: Search through millions of articles from over 150,000 large and small news sources and blogs.",
"parameters"=>
{"type"=>"object",
"properties"=>
{"q"=>
{"type"=>"string",
"description"=>
"Keywords or phrases to search for in the article title and body. Surround phrases with quotes (\") for exact match. Alternatively you can use the AND / OR / NOT keywords, and optionally group these with parenthesis. Must be URL-encoded."},
"search_in"=>{"type"=>"string", "description"=>"The fields to restrict your q search to.", "enum"=>["title", "description", "content"]},
"sources"=>
{"type"=>"string",
"description"=>"A comma-seperated string of identifiers (maximum 20) for the news sources or blogs you want headlines from. Use the /sources endpoint to locate these programmatically or look at the sources index."},
"domains"=>{"type"=>"string", "description"=>"A comma-seperated string of domains (eg bbc.co.uk, techcrunch.com, engadget.com) to restrict the search to."},
"exclude_domains"=>{"type"=>"string", "description"=>"A comma-seperated string of domains (eg bbc.co.uk, techcrunch.com, engadget.com) to remove from the results."},
"from"=>{"type"=>"string", "description"=>"A date and optional time for the oldest article allowed. This should be in ISO 8601 format."},
"to"=>{"type"=>"string", "description"=>"A date and optional time for the newest article allowed. This should be in ISO 8601 format."},
"language"=>{"type"=>"string", "description"=>"The 2-letter ISO-639-1 code of the language you want to get headlines for.", "enum"=>["ar", "de", "en", "es", "fr", "he", "it", "nl", "no", "pt", "ru", "sv", "ud", "zh"]},
"sort_by"=>{"type"=>"string", "description"=>"The order to sort the articles in.", "enum"=>["relevancy", "popularity", "publishedAt"]},
"page_size"=>{"type"=>"integer", "description"=>"The number of results to return per page (request). 5 is the default, 100 is the maximum."},
"page"=>{"type"=>"integer", "description"=>"Use this to page through the results if the total results found is greater than the page size."}}}}},
After:
irb(main):013> Langchain::Tool::NewsRetriever.new(api_key: ENV["NEWS_API_KEY"]).to_openai_tools
=>
[{"type"=>"function",
"function"=>
{"name"=>"news_retriever__get_everything",
"description"=>"Retrieve all news",
"parameters"=>
{"type"=>"object",
"properties"=>
{"q"=>{"type"=>"string", "description"=>"Keywords or phrases to search for in the article title and body."},
"search_in"=>{"type"=>"string", "description"=>"The fields to restrict your q search to. The possible options are: title, description, content."},
"sources"=>
{"type"=>"string",
"description"=>"A comma-seperated string of identifiers (maximum 20) for the news sources or blogs you want headlines from. Use the /sources endpoint to locate these programmatically or look at the sources index."},
"domains"=>{"type"=>"string", "description"=>"A comma-seperated string of domains (eg bbc.co.uk, techcrunch.com, engadget.com) to restrict the search to."},
"exclude_domains"=>{"type"=>"string", "description"=>"A comma-seperated string of domains (eg bbc.co.uk, techcrunch.com, engadget.com) to remove from the results."},
"from"=>{"type"=>"string", "description"=>"A date and optional time for the oldest article allowed. This should be in ISO 8601 format."},
"to"=>{"type"=>"string", "description"=>"A date and optional time for the newest article allowed. This should be in ISO 8601 format."},
"language"=>{"type"=>"string", "description"=>"The 2-letter ISO-639-1 code of the language you want to get headlines for. Possible options: ar, de, en, es, fr, he, it, nl, no, pt, ru, se, ud, zh."},
"sort_by"=>{"type"=>"string", "description"=>"The order to sort the articles in. Possible options: relevancy, popularity, publishedAt."},
"page_size"=>{"type"=>"integer", "description"=>"The number of results to return per page. 20 is the API's default, 100 is the maximum. Our default is 5."},
"page"=>{"type"=>"integer", "description"=>"Use this to page through the results."}},
"required"=>[]}}},
Upon a quick glance -- the `enum:` param is missing. I wonder if we can annotate enum options in YARD docs 🤔
@dghirardo Btw -- @palladius told me that you guys met at the RubyDay! 🫶
@andreibondarev I am currently working on a pull request for the `tool_tailor` gem to add support for the `enum` parameter. What do you think about the proposed solution?
Hi @andreibondarev,
`tool_tailor` has merged my pull request, so the `enum` property is now supported once the gem is updated. However, I noticed another issue. When you have a parameter that is a collection (like an array or a hash), you can't describe its internal structure.
Example (Tavily tool):
# @param exclude_domains [Array<String>] A list of domains to specifically exclude from the search results. Default is None, which doesn't exclude any domains.
While `[Array]` is supported, `[Array<String>]` is not. So, it's not possible to:
- Define that the array items must be Strings.
- Describe extra properties for the String values, such as `enum`.
You can only use the description field.
Even if we find a solution, I think nested parameters with multiple levels are not easy to describe with YARD comments.
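To make the limitation concrete, here is a minimal sketch of how a generic YARD type such as `Array<String>` could be mapped to a JSON Schema fragment with an `items` entry. The helper `yard_type_to_schema` and the `YARD_TO_JSON_TYPE` table are hypothetical names made up for illustration; they are not part of `tool_tailor` or YARD.

```ruby
# Hypothetical helper (not part of tool_tailor): maps a YARD-style type string
# to a JSON Schema fragment, including the item type of a generic Array.
YARD_TO_JSON_TYPE = {
  "String" => "string",
  "Integer" => "integer",
  "Float" => "number",
  "Boolean" => "boolean",
  "Hash" => "object",
  "Array" => "array"
}.freeze

def yard_type_to_schema(yard_type)
  # Generic arrays such as "Array<String>" recurse into the inner type.
  if (match = yard_type.match(/\AArray<(.+)>\z/))
    { "type" => "array", "items" => yard_type_to_schema(match[1]) }
  else
    # Plain types are looked up directly; unknown types raise KeyError.
    { "type" => YARD_TO_JSON_TYPE.fetch(yard_type) }
  end
end

schema = yard_type_to_schema("Array<String>")
```

This covers only the item type. Extra item properties such as `enum` still have no natural place in a YARD type string, which is the harder part of the problem described above.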
Hmm... I think we may want to consider some sort of a decorator pattern, a la:
llm_callable :get_everything,
  description: "News Retriever: Search through millions of articles from over 150,000 large and small news sources and blogs",
  properties: {
    q: {
      type: :string,
      description: "Keywords or phrases to search for in the article title and body. Surround phrases with quotes (\") for exact match. Alternatively you can use the AND / OR / NOT keywords, and optionally group these with parenthesis. Must be URL-encoded."
    }
  }
def get_everything(q:, ...)
  ...
end
The really tricky thing here is figuring out a developer-friendly interface.
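One possible shape for such an interface, as a rough sketch only: a class-level macro that records each function's schema and a class method that renders the recorded schemas as OpenAI-style tool definitions. The `LLMCallable` module, its method signatures, and the stubbed `NewsRetriever` class below are all assumptions made for illustration, not the design that was eventually adopted.

```ruby
# Hypothetical mixin: `llm_callable` records a schema for a method at class level.
module LLMCallable
  def self.extended(base)
    base.instance_variable_set(:@llm_functions, {})
  end

  # Store the description and parameter properties for `method_name`.
  def llm_callable(method_name, description:, properties: {}, required: [])
    @llm_functions[method_name] = {
      description: description,
      properties: properties,
      required: required
    }
  end

  # Build OpenAI-style tool definitions from the recorded schemas.
  def to_openai_tools(tool_name)
    @llm_functions.map do |method_name, schema|
      {
        "type" => "function",
        "function" => {
          "name" => "#{tool_name}__#{method_name}",
          "description" => schema[:description],
          "parameters" => {
            "type" => "object",
            "properties" => stringify(schema[:properties]),
            "required" => schema[:required].map(&:to_s)
          }
        }
      }
    end
  end

  private

  # Recursively convert symbol keys and symbol values to strings.
  def stringify(value)
    case value
    when Hash then value.to_h { |k, v| [k.to_s, stringify(v)] }
    when Array then value.map { |v| stringify(v) }
    when Symbol then value.to_s
    else value
    end
  end
end

class NewsRetriever
  extend LLMCallable

  llm_callable :get_everything,
    description: "Retrieve all news",
    properties: {
      q: { type: :string, description: "Keywords or phrases to search for." },
      sort_by: { type: :string, enum: %w[relevancy popularity publishedAt] }
    }

  def get_everything(q:, sort_by: "publishedAt")
    # Stub: a real tool would call the News API here.
    { q: q, sort_by: sort_by }
  end
end

tools = NewsRetriever.to_openai_tools("news_retriever")
```

Because the properties are plain Ruby hashes, nested structures such as `items` or `enum` can be written directly, at the cost of repeating information that YARD comments would otherwise carry.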
Happy to add support for it in `tool_tailor`. Let me know if that would work. I think I want to support more use cases going forward as well.
Hi @andreibondarev, as agreed, since I have opened a new pull request with the new approach, I'm closing this one.
Issue #637
This pull request integrates the `tool_tailor` gem (https://github.com/kieranklaassen/tool_tailor) to automate the generation of tool annotations.
Changes:
- The `:ANNOTATIONS_PATH` constant has been replaced with `:FUNCTIONS`, an array of symbols corresponding to the tool's functions.
- [...] (`:NAME`, `:FUNCTIONS`) have been implemented to ensure consistency and correctness.
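The description does not say what the consistency checks for `NAME` and `FUNCTIONS` look like, so the following is only a guess at one plausible form: verify that a tool class defines both constants and that every symbol in `FUNCTIONS` names a public instance method. The `validate_tool_class!` helper and `ExampleTool` class are hypothetical and are not taken from the pull request.

```ruby
# Hypothetical check, not the PR's code: verify that a tool class defines
# NAME and FUNCTIONS, and that every FUNCTIONS entry is a public instance method.
def validate_tool_class!(klass)
  %i[NAME FUNCTIONS].each do |const|
    raise ArgumentError, "#{klass} must define #{const}" unless klass.const_defined?(const)
  end

  missing = klass::FUNCTIONS.reject { |fn| klass.public_method_defined?(fn) }
  raise ArgumentError, "#{klass} lacks methods: #{missing.join(", ")}" unless missing.empty?

  true
end

class ExampleTool
  NAME = "example_tool"
  FUNCTIONS = %i[search].freeze

  def search(query:)
    "results for #{query}"
  end
end

validate_tool_class!(ExampleTool)
```

A check like this fails fast when a symbol in `FUNCTIONS` is misspelled or a method is renamed, instead of producing a tool definition that points at a method that does not exist.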