atisharma closed this issue 12 months ago
Why not just use the ooba api, or Kobold? All the backend stuff goes away.
It does use the ooba API, and can talk to multiple instances, but it does not know whether a server is busy.
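To illustrate the missing piece, here is a minimal sketch of client-side busy tracking over a pool of backends. The endpoint URLs, the `LLMPool` class and the `generate` helper are all hypothetical names for illustration, not chasm's actual implementation:

```python
import threading
from contextlib import contextmanager

class LLMPool:
    """Client-side pool that tracks which backend servers are busy.

    A request checks out a free endpoint, and marks it busy until the
    request completes; new requests block until some endpoint is free.
    """
    def __init__(self, endpoints):
        self._cond = threading.Condition()
        self._free = list(endpoints)

    @contextmanager
    def endpoint(self):
        # Block until some server is free, then hold it for the caller.
        with self._cond:
            while not self._free:
                self._cond.wait()
            url = self._free.pop()
        try:
            yield url
        finally:
            # Return the endpoint to the pool and wake one waiter.
            with self._cond:
                self._free.append(url)
                self._cond.notify()

# Hypothetical OpenAI-compatible backends.
pool = LLMPool(["http://gpu-a:5000/v1", "http://gpu-b:5000/v1"])

def generate(prompt):
    with pool.endpoint() as url:
        # A real client would POST the prompt to url + "/completions";
        # here we just return which endpoint was chosen.
        return url
```

With this shape, two concurrent requests land on different servers, and a third waits rather than queueing behind a busy instance.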
Oh! Do you have install instructions somewhere?
https://github.com/atisharma/chasm_engine#installing-and-running Detailed instructions depend on your setup, but that discussion doesn't belong in this issue; please open a new one.
Added Replicate support; it can also use cloud OpenAI-compatible providers.
Using tgwui is OK, but it may be better to send requests to a pool of LLM servers running, for instance, https://mlc.ai/mlc-llm/, and to communicate over zmq.
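The pattern this suggests is ZeroMQ's load-balancing broker, where each worker announces itself when idle so the dispatcher only ever sends work to a server that is provably free. A sketch of that dispatch logic, using stdlib queues as stand-ins for zmq ROUTER/REQ sockets (names and worker behaviour are illustrative assumptions):

```python
import queue
import threading

def llm_worker(name, ready, inbox):
    """Each worker announces itself on the shared `ready` queue whenever it
    is idle, so the dispatcher never sends work to a busy server."""
    while True:
        ready.put((name, inbox))        # announce: this server is idle
        prompt, reply_box = inbox.get()
        if prompt is None:              # shutdown signal
            break
        # A real worker would forward the prompt to an MLC-LLM / ooba
        # instance here; we just echo which server handled it.
        reply_box.put(f"{name} handled: {prompt}")

def dispatch(ready, prompt):
    # Block until some worker is idle, then hand it the request.
    name, inbox = ready.get()
    reply_box = queue.Queue()
    inbox.put((prompt, reply_box))
    return reply_box.get()

ready = queue.Queue()
worker_inboxes = []
for name in ("gpu-a", "gpu-b"):        # hypothetical server names
    inbox = queue.Queue()
    threading.Thread(target=llm_worker, args=(name, ready, inbox),
                     daemon=True).start()
    worker_inboxes.append(inbox)
```

Swapping the in-process queues for zmq sockets over tcp gives the same busy-awareness across machines, which is what a plain HTTP round-robin against tgwui instances lacks.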