huggingface / blog

Public repo for HF blog posts
https://hf.co/blog

Blog: License to Call: Introducing Transformers Agents 2.0 #2351

Closed KannamSridharKumar closed 3 weeks ago

KannamSridharKumar commented 1 month ago

The question is regarding - https://huggingface.co/blog/agents#self-correcting-retrieval-augmented-generation

The sources of docs in the vector index are ['blog', 'optimum', 'datasets-server', 'datasets', 'transformers', 'course', 'gradio', 'diffusers', 'evaluate', 'deep-rl-class', 'peft', 'hf-endpoints-documentation', 'pytorch-image-models', 'hub-docs']

For the query "Please show me a LORA finetuning script", the output in the blog shows that the agent first searched these 3 sources ['transformers', 'datasets-server', 'datasets'], and then fell back to all sources when it couldn't find relevant context in those 3.

How/why did the agent choose those 3 data sources? Were those sources picked by the agent itself, or were they set by a human?

Something doesn't look right here to me.

Thanks,

osanseviero commented 1 month ago

cc @aymeric-roucher

aymeric-roucher commented 3 weeks ago

@KannamSridharKumar the sources were selected by the agent! It won't reproduce the same selection every time though; it depends on what the agent generates for its inner thoughts.
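
For reference, here is a minimal sketch of how the source choice ends up in the agent's hands: the retriever tool lists the available sources in its description and exposes a `source_filter` argument, so the agent fills that argument in itself when it writes the tool call. This is not the exact code from the blog post; the `vectordb` object, its `similarity_search` signature, and the input type strings are assumptions for illustration.

```python
import json
from transformers.agents import Tool

# Sources indexed in the vector store (from the blog post's knowledge base).
all_sources = ['blog', 'optimum', 'datasets-server', 'datasets', 'transformers',
               'course', 'gradio', 'diffusers', 'evaluate', 'deep-rl-class',
               'peft', 'hf-endpoints-documentation', 'pytorch-image-models', 'hub-docs']

class RetrieverTool(Tool):
    name = "retriever"
    description = (
        "Retrieves documents from the knowledge base. "
        f"Pass `source_filter` as a JSON list of sources chosen from {all_sources}."
    )
    inputs = {
        "query": {"type": "text", "description": "The search query."},
        "source_filter": {
            "type": "text",
            "description": "JSON list of sources to restrict the search to.",
        },
    }
    output_type = "text"

    def __init__(self, vectordb, **kwargs):
        super().__init__(**kwargs)
        # `vectordb` is assumed to be a vector store exposing a
        # `similarity_search(query, k=..., filter=...)` method.
        self.vectordb = vectordb

    def forward(self, query: str, source_filter: str) -> str:
        # The agent itself decides which sources go into this list when it
        # generates the tool call arguments.
        sources = json.loads(source_filter)
        docs = self.vectordb.similarity_search(
            query, k=7, filter={"source": sources}
        )
        return "\n\n".join(doc.page_content for doc in docs)
```

Because the agent writes the `source_filter` value as part of its own generation, different runs can start from different subsets (e.g. ['transformers', 'datasets-server', 'datasets']) before widening the search.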