bdambrosio / AllTheWorldAPlay

All the world is a play, we are but actors in it.

llama.cpp server support #5

Open lastrosade opened 5 months ago

lastrosade commented 5 months ago

Will you consider supporting the llama.cpp server API for inference?

bdambrosio commented 5 months ago

Sure. I'll try to get it running today; should be easy. I added Anthropic as a worldsim.main init var, and I'll do the same for llama.cpp. It will probably land in beta first. I can probably backport it to main, though it might take a bit to get to that; I'll probably merge beta into main at the same time, since it's looking pretty solid.

```python
worldsim.main(W)

worldsim.main(W, server='Claude')  # yup, Claude is supported. But RUN LOCAL OSS if you can!
```
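For concreteness, here is a minimal sketch of what backend selection might look like once llama.cpp support lands. Only the Claude form above is confirmed; the llama.cpp keyword value and the scenario import are assumptions, so check the beta branch for the real names.

```python
import worldsim
from my_scenario import W  # hypothetical import: W is whatever world object your scenario builds

# Confirmed form from the comment above: route inference through Anthropic's Claude.
worldsim.main(W, server='Claude')

# Assumed form: point the same init var at a local llama.cpp server instead.
# The exact string ('llama.cpp') is a guess, not the project's confirmed API.
worldsim.main(W, server='llama.cpp')
```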

lastrosade commented 5 months ago

Oh wow, thanks a ton!

bdambrosio commented 5 months ago

Well, the initial round trip was trivial, but the text quality isn't good. I'll keep looking into it later today; I suspect it's a problem with chat-template formatting.
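One sanity check worth starting with, sketched below: build the prompt with the Llama-3-Instruct chat template by hand and send it to the llama.cpp server's /completion endpoint, stopping on the end-of-turn token. The endpoint and JSON fields are the standard llama.cpp server API; the server URL, the sampling settings, and the llama3_prompt helper are assumptions for illustration.

```python
import requests

LLAMA_SERVER = "http://localhost:8080"  # assumed default llama.cpp server address


def llama3_prompt(system: str, user: str) -> str:
    """Wrap system/user text in the Llama-3-Instruct chat template."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


resp = requests.post(
    f"{LLAMA_SERVER}/completion",
    json={
        "prompt": llama3_prompt("You are a character in a play.", "Introduce yourself."),
        "n_predict": 256,
        "temperature": 0.7,
        "stop": ["<|eot_id|>"],  # stop at end-of-turn so the model doesn't run on
    },
    timeout=120,
)
print(resp.json()["content"])
```

If the hand-formatted prompt produces clean text, the problem is likely in how the worldsim prompt is being assembled rather than in the server itself.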

bdambrosio commented 5 months ago

Pushed a commit with some llama.cpp code. It works, but at least with Llama-3-8B the text quality is poor. Ideas welcome; I'm probably doing something stupid.
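One way to rule the chat template in or out, sketched below: let the server apply the template itself via its OpenAI-compatible /v1/chat/completions endpoint instead of hand-building the prompt. If this path produces good text while the raw /completion path doesn't, the template formatting is the culprit. The URL/port and the model string are assumptions; llama.cpp serves whatever model it was started with regardless of the model field.

```python
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed default server address
    json={
        "model": "llama-3-8b-instruct",  # largely ignored by llama.cpp, included for API shape
        "messages": [
            {"role": "system", "content": "You are a character in a play."},
            {"role": "user", "content": "Introduce yourself."},
        ],
        "max_tokens": 256,
        "temperature": 0.7,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```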

lastrosade commented 5 months ago

I still haven't gotten around to trying it. It seems I have a bit more setting up to do than just spinning up a llama.cpp server.

bdambrosio commented 5 months ago

Sorry. If/when I get this performing well enough, I'll spend more time packaging it for a simpler install.