srikant86panda opened this issue 1 year ago
cc @fozziethebeat
Oof, I haven't tried the gradio web server. Does it load up a worker automatically within itself?
Doing some quick code reading, I think someone more familiar with the gradio server will need to suggest a fix. I'm not entirely clear how this works for local models in general.
Key code snippets I see are:
```python
def get_model_list(controller_url, add_chatgpt, add_claude, add_palm):
    ret = requests.post(controller_url + "/refresh_all_workers")
    assert ret.status_code == 200
    ret = requests.post(controller_url + "/list_models")
    models = ret.json()["models"]

    # Add API providers
    if add_chatgpt:
        models += ["gpt-3.5-turbo", "gpt-4"]
    if add_claude:
        models += ["claude-v1", "claude-instant-v1"]
    if add_palm:
        models += ["palm-2"]

    priority = {k: f"___{i:02d}" for i, k in enumerate(model_info)}
    models.sort(key=lambda x: priority.get(x, x))
    logger.info(f"Models: {models}")
    return models
```
And then:
```python
def add_text(state, model_selector, text, request: gr.Request):
    ip = request.client.host
    logger.info(f"add_text. ip: {ip}. len: {len(text)}")

    if state is None:
        state = State(model_selector)
```
And finally:
```python
class State:
    def __init__(self, model_name):
        self.conv = get_conversation_template(model_name)
        self.conv_id = uuid.uuid4().hex
        self.skip_next = False
        self.model_name = model_name
```
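As far as I can tell, the template lookup itself tolerates a bare name, since the adapters match on substrings of the name/path, so nothing complains until the actual weights are loaded. A quick sketch (the model name here is just an example):

```python
# Conversation templates are resolved from the display name alone via
# substring matching in model_adapter, so a bare name is enough here.
from fastchat.model.model_adapter import get_conversation_template

conv = get_conversation_template("vicuna-7b-v1.3")  # example name, no path needed
print(conv.name)
```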
So my guess is that gradio fetches only model names from the worker/controller and then uses those names to populate the drop-down selector.
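One way to sanity-check that only names (not paths) ever come back from the controller is to hit its endpoints directly. A minimal sketch, assuming the default controller address `http://localhost:21001`:

```python
# Query the controller directly; the response contains display names only,
# never the --model-path the worker was started with.
import requests

controller_url = "http://localhost:21001"  # assumed default controller address
requests.post(controller_url + "/refresh_all_workers")
ret = requests.post(controller_url + "/list_models")
print(ret.json()["models"])  # e.g. ["vicuna-7b-v1.3"] -- names only
```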
The code at https://github.com/lm-sys/FastChat/blob/main/fastchat/model/model_adapter.py#L325 receives the model name instead of the complete or relative path when loading PEFT weights through the PEFT adapter with gradio_web_server. Code:
Output error:
Model loading fails as a result, which causes a Web UI error.
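For what it's worth, here is a minimal sketch (outside of FastChat) of why the name/path mix-up is fatal for PEFT models: `PeftConfig`/`PeftModel` need a directory or hub repo that actually contains `adapter_config.json`, so a bare display name fails where the real adapter path works. `./my-lora-adapter` is a hypothetical local path used only for illustration.

```python
# Sketch of the mismatch, not FastChat code.
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

display_name = "my-lora-adapter"    # what the web UI carries around (hypothetical)
adapter_path = "./my-lora-adapter"  # what PEFT actually needs (hypothetical)

try:
    # Fails: no adapter_config.json can be resolved from a bare name.
    PeftConfig.from_pretrained(display_name)
except Exception as err:
    print(f"loading by name fails: {err}")

# Works: the adapter directory contains adapter_config.json and the weights.
config = PeftConfig.from_pretrained(adapter_path)
base = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base, adapter_path)
```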