huggingface / huggingface.js

Utilities to use the Hugging Face Hub API
https://hf.co/docs/huggingface.js
MIT License

Question - Difference between agents vs pipelines (re-visited).. use-case and real-world examples unclear #465

Open gidzr opened 9 months ago

gidzr commented 9 months ago

Hi @SBrandeis, @julien-c, @scruel, @Rocketknight1

This is a follow-up to the line of query of 'what's the purpose or use-case for agents/tools' here: https://github.com/huggingface/transformers/issues/26195

This is triggered by the recent press and a lot of (irrational) hype around "Rabbit" (https://www.rabbit.tech) which I'm predicting is either smoke and mirrors, or at the very most, a standard LLM with agent/tool macros or LLM stacked with zapier macros (lolz).

Before heading down my own "rabbit" hole to combine macros with an LLM, I wanted to re-visit Agents to check if Agents/Tools will have a role in my stack.

Specifically, your 'hello world' examples are awesome for a quick start, but they don't help conceptually with understanding what Agents/Tools are capable of or how these differ from pipelines.

I've read over:

- https://github.com/huggingface/huggingface.js/blob/main/packages/agents/README.md
- https://github.com/huggingface/transformers/tree/main/src/transformers/tools
- https://huggingface.co/docs/transformers/main_classes/agent
- https://huggingface.co/docs/huggingface.js/agents/modules
- https://huggingface.co/docs/transformers/custom_tools#adding-new-tools

For "Agents" It looks as though agents are primarily doing the same thing as a pipeline. You have to create a prompt and select the model. eg.

const code = await agent.generateCode("Draw a picture of a cat, wearing a top hat.")
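For comparison, here is a minimal sketch of the same task done pipeline-style with `@huggingface/inference` (the token and model name are placeholders of my own choosing, not from the docs):

```js
import { HfInference } from "@huggingface/inference";

// Placeholder token; any text-to-image model on the Hub would do here.
const hf = new HfInference("hf_...");

// Pipeline-style: the caller picks the task and the model explicitly,
// instead of letting an agent plan and generate the code.
const image = await hf.textToImage({
  model: "stabilityai/stable-diffusion-2",
  inputs: "a picture of a cat, wearing a top hat",
});
```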

For "Custom Tools section of Agents" This example looks closer to what I would term an 'agent', in that it's taking a goal, generating a macro or algorithm in code which can be executed as a callback. eg.

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=[tool]) agent.run( "Can you read out loud the name of the model that has the most downloads in the 'text-to-video' task on the Hugging > Face Hub?" )
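On the huggingface.js side, my reading of the agents README is that a custom tool is just an object with a name, description, examples, and a `call` function. A rough sketch (the `uppercase` tool is a made-up illustration, not a built-in):

```js
import { HfAgent, LLMFromHub, defaultTools } from "@huggingface/agents";

// Made-up illustrative tool: the agent can call it from the code it generates.
const uppercaseTool = {
  name: "uppercase",
  description: "uppercase the input string and return it",
  examples: [
    {
      prompt: "uppercase the string: hello world",
      code: `const output = uppercase("hello world")`,
      tools: ["uppercase"],
    },
  ],
  call: async (input) => {
    const data = await input;
    if (typeof data !== "string") throw new Error("Input must be a string");
    return data.toUpperCase();
  },
};

// Placeholder tokens; the LLM plans, and may call the custom tool.
const agent = new HfAgent("hf_...", LLMFromHub("hf_..."), [uppercaseTool, ...defaultTools]);
const updates = await agent.run("Uppercase the string: hello world");
```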

Questions

  1. Are HF Agents and Tools limited to "prompts" only, i.e. instructing LLMs to create outputs and provide these back to the main script (as callbacks), OR do Agents/Tools take the LLM output from the prompts, run separate processes over it, and provide only the very end output back to the main script?
  2. Is there a list of pre-built Agent/Tool processes, e.g. like the list of tasks for pipelines, that I can peruse?
  3. Is it feasible to create automation based on the example (real-world use-case: automated APIs) below?
  4. Would I need an agent or tool for the scenarios below? It seems they could be more easily accomplished with a main script (see the sketch after this list).
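To make question 4 concrete, this is the kind of "main script" I have in mind for the starcoder example above: a plain fetch against the public Hub API, no agent involved (the exact query parameters are my assumption of how to express "most downloads in text-to-video"):

```js
// Main-script version: ask the Hub API directly for the most-downloaded
// text-to-video model, then hand the result to whatever comes next.
const res = await fetch(
  "https://huggingface.co/api/models?filter=text-to-video&sort=downloads&direction=-1&limit=1"
);
const [topModel] = await res.json();
console.log(topModel.id); // e.g. feed this into a text-to-speech call
```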

Sample Real-World Use-Case: Automated APIs - Assumes advanced GPT-like LLM

Sample Real-World Use-Case: Automated APIs - Assumes low-powered LLMs in sequence

Apologies, but I'm still struggling to find a good use case for agents/tools that significantly reduces lines of code or does something I couldn't already do in a main function.

coyotte508 commented 9 months ago

cc @nsarrazin

julien-c commented 9 months ago

> Before heading down my own "rabbit" hole

🫣