Closed: swtb3 closed this 1 month ago
cc @aymeric-roucher
I also have this request
@gxcuit you can view my implementation here:
https://github.com/swtb3/math_agent_demo
It's not the cleanest, but it does work for local inference with an HF pipeline. I also tried to add support for Ollama, but I found that the Ollama models were unable to properly leverage the agent's ReAct framework and often failed to perform their roles. The HF pipeline worked really well, though.
Thank you for this feature request @swtb3! In PR #33218, which I've merged above, your point should be addressed: you can now just initialize a TransformersEngine with your custom transformers pipeline! cc @gxcuit
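For readers unfamiliar with what the engine abstraction expects, here is a minimal sketch of the contract that an engine like TransformersEngine fulfills: an llm_engine is just a callable that takes a list of chat messages and returns the model's reply as a string. A stub stands in for a real transformers text-generation pipeline so the sketch is self-contained; with transformers installed you would build `pipeline("text-generation", ...)` and pass it to TransformersEngine instead. The class and function names below are illustrative assumptions, not the library's own code.

```python
# Sketch of the agent engine contract: a callable taking chat messages
# ({"role": ..., "content": ...}) and returning a completion string.
# stub_pipeline mimics the shape of a transformers text-generation
# pipeline's output; swap in a real pipeline for actual local inference.

def stub_pipeline(prompt, max_new_tokens=128):
    # A real text-generation pipeline returns [{"generated_text": prompt + completion}].
    return [{"generated_text": prompt + "\nThought: the answer is 42."}]

class LocalPipelineEngine:
    """Adapts a text-generation pipeline to the agent engine interface (illustrative)."""

    def __init__(self, pipe):
        self.pipe = pipe

    def __call__(self, messages, stop_sequences=None):
        # Flatten the chat history into a single prompt string.
        prompt = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
        output = self.pipe(prompt)[0]["generated_text"]
        # Strip the echoed prompt, keeping only the new completion.
        completion = output[len(prompt):]
        # Truncate at the first stop sequence, so the ReAct loop can
        # intercept the model before it hallucinates an Observation.
        for stop in stop_sequences or []:
            completion = completion.split(stop)[0]
        return completion.strip()

engine = LocalPipelineEngine(stub_pipeline)
reply = engine([{"role": "user", "content": "What is 6 * 7?"}],
               stop_sequences=["Observation:"])
print(reply)
```

The stop-sequence handling is the part that tripped up the Ollama models mentioned above: if the engine does not cut generation at the ReAct markers, the model keeps going and breaks the agent loop.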
Feature request
The provided HfEngine class uses the Inference API under the hood, which makes using agents simple. However, it would be good to support local inference in a similarly simple way.
If this is already supported through existing local inference pipelines (e.g. the text-generation pipeline), then the documentation should disambiguate the standard approach to local inference with agents.
Motivation
When following the tutorial, it was unclear how to approach local inference with agents, since the tutorial uses the Inference API by default.
Your contribution
Happy to make these changes; I would need some support on the approach.
Update
I have had some success replacing the client with a pipeline, though it is a bit messy and took a fair bit of trial and error.
So far I have been unable to get the working local agent to run with the Gradio chat interface.