oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

OpenAI batched generation #3296

Closed Zoraman26 closed 1 year ago

Zoraman26 commented 1 year ago

Description

Batched API openai extension

(screenshot)
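For context, the upstream OpenAI completions API accepts `prompt` as either a single string or an array of strings, which is what batched generation means here. A minimal sketch of what a batched request to the extension might look like (the model name and endpoint URL in the comment are assumptions; the payload is only built here, not sent):

```python
import json

def build_batched_completion_request(prompts, model="local-model", max_tokens=64):
    """Build an OpenAI-style /v1/completions payload with a batch of prompts.

    The OpenAI completions API accepts `prompt` as a string or a list of
    strings; a list asks the server for one completion per prompt.
    """
    return {
        "model": model,           # model name is an assumption here
        "prompt": list(prompts),  # the batch: one completion per entry
        "max_tokens": max_tokens,
    }

# Build (but do not send) a two-prompt batch; with a live server you would
# POST this JSON to something like http://127.0.0.1:5001/v1/completions.
payload = build_batched_completion_request(["Say hello.", "Say goodbye."])
body = json.dumps(payload)
```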

matatonic commented 1 year ago

On my radar, I'll bump it up the priority list.

Zoraman26 commented 1 year ago

> On my radar, I'll bump it up the priority list.

Thanks a ton!

matatonic commented 1 year ago

> On my radar, I'll bump it up the priority list.
>
> Thanks a ton!

You can try it out in the PR here: https://github.com/oobabooga/text-generation-webui/pull/3309

Zoraman26 commented 1 year ago

> On my radar, I'll bump it up the priority list.
>
> Thanks a ton!
>
> You can try it out in the PR here: #3309

Seems to work. There's a new error, but it's unrelated to the API support; I'm pretty sure it's because of the way OpenDex works. My AI server isn't powerful enough to run models with a larger context size.

Are there any models that have a ~3500-token context size and can run in 12 GB of VRAM or 64 GB of main RAM?

(screenshot)

Using this repo, by the way: https://github.com/kry0sc0pic/OpenDex

matatonic commented 1 year ago

> On my radar, I'll bump it up the priority list.
>
> Thanks a ton!
>
> You can try it out in the PR here: #3309
>
> Seems to work. There's a new error, but it's unrelated to the API support; I'm pretty sure it's because of the way OpenDex works. My AI server isn't powerful enough to run models with a larger context size.
>
> Are there any models that have a ~3500-token context size and can run in 12 GB of VRAM or 64 GB of main RAM?

The 13B Llama-2-based (l2) models have a 4K context and should just fit in around 12 GB. TheBloke has a few already.
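As a rough back-of-envelope check that a quantized 13B model fits, here is a sketch of the arithmetic. All figures are approximate assumptions (13B parameters at 4-bit quantization, 40 layers, hidden size 5120, fp16 KV cache at full 4K context), not measurements:

```python
# Rough VRAM estimate for a 4-bit 13B Llama-2 model at full 4K context.
params = 13e9                  # parameter count
weight_bytes = params * 0.5    # 4-bit quantization ~= 0.5 bytes/param

n_layers, d_model, ctx = 40, 5120, 4096      # Llama-2 13B shapes
kv_bytes = 2 * n_layers * d_model * 2 * ctx  # K and V, fp16 (2 bytes), per token

total_gb = (weight_bytes + kv_bytes) / 1e9
print(f"~{total_gb:.1f} GB")  # prints "~9.9 GB"
```

That leaves a couple of gigabytes of headroom for activations and runtime overhead, which is why a 13B l2 model "just fits" in 12 GB.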

Matthew.