Open xh9487963 opened 7 months ago
You should provide your entire code. Your code for reproduction is incomplete.
Hi @xh9487963, does restarting the Gradio app fix the problem? It's hard to say more without more details. Unfortunately, this error doesn't ring a bell, and the Gradio app is also using an older, unsupported version of Gradio (3.50.2).
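Since version details come up repeatedly in this thread, a small script along these lines could collect them for a report. It is only a sketch: the default package list (`gradio`, `gradio_client`, `fastapi`, `uvicorn`) is an assumption about what is relevant here, and the helper names `installed_versions` and `environment_report` are hypothetical.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_report(packages=("gradio", "gradio_client", "fastapi", "uvicorn")):
    """Build a pip-style text report of the Python version and package versions.

    The default package list is an assumption, not something the thread prescribes.
    """
    lines = [f"python=={platform.python_version()}"]
    for name, version in installed_versions(packages).items():
        lines.append(f"{name}=={version}" if version else f"{name} (not installed)")
    return "\n".join(lines)
```

Running `print(environment_report())` on both the server and any machine that behaves differently gives output that can be pasted directly into the issue.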
Sorry, the complete code is very complicated because it comes from someone else's open source project (https://github.com/hiyouga/LLaMA-Factory). I ran it according to its README.md.
No. The author of that project recommends using version 3.50.2; should I switch to another version and try again? To be honest, what confuses me the most is that everything works when a different PC connects to the server and performs exactly the same operation. The error I see here seems to be related to network communication.
I'm getting the same error on the latest version of Gradio.
Did you resolve this issue?
Hi, I got a similar error while trying to launch a Gradio interface to chat with the phi-1_5 language model. I am using the code provided by the Microsoft phi-1_5 repository, plus this additional inference.py script that deploys the Gradio chat for interacting with the phi-1_5 model.
Here is the code:
import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
from threading import Thread

# Loading the tokenizer and model from Hugging Face's model hub.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", torch_dtype="auto", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)

# using CUDA for an optimal experience
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)


# Defining a custom stopping criteria class for the model's text generation.
class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = [2]  # IDs of tokens where the generation should stop.
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:  # Checking if the last generated token is a stop token.
                return True
        return False


# Function to generate model predictions.
def predict(message, history):
    history_transformer_format = history + [[message, ""]]
    stop = StopOnTokens()

    # Formatting the input for the model.
    messages = "</s>".join(["</s>".join(["\n<|user|>:" + item[0], "\n<|assistant|>:" + item[1]])
                            for item in history_transformer_format])
    model_inputs = tokenizer([messages], return_tensors="pt").to(device)
    streamer = TextIteratorStreamer(tokenizer, timeout=10., skip_prompt=True, skip_special_tokens=True)
    generate_kwargs = dict(
        model_inputs,
        streamer=streamer,
        max_new_tokens=1024,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.7,
        num_beams=1,
        stopping_criteria=StoppingCriteriaList([stop])
    )
    t = Thread(target=model.generate, kwargs=generate_kwargs)
    t.start()  # Starting the generation in a separate thread.
    partial_message = ""
    for new_token in streamer:
        partial_message += new_token
        if '</s>' in partial_message:  # Breaking the loop if the stop token is generated.
            break
        yield partial_message


# Setting up the Gradio chat interface.
g = gr.ChatInterface(predict,
                     title="Phi1.5",
                     )
g.launch()  # Launching the web interface.
Here is the terminal output:
/home/phi/.local/lib/python3.8/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
Running on local URL: http://127.0.0.1:7860
Please help. I was able to chat with TinyLLaMA using this code without any error, but when I change the model to phi-1_5 it throws an error.
I have the same issue
I have a similar problem. It works when you access the port locally on the same host where the app was started. However, if I forward the port, or even try to access the host through its LAN IP (for example http://192.168.0.3:7860), it does not work.
For me, the main page opens, but when I try to upload a file it freezes. If you open the developer tools in your browser and check the Network tab, you can see that it makes a POST request to upload the file, never receives a response from the server, and then errors out.
I have had the same issue with other Gradio apps before, whenever I could not access the host's port directly and had to forward it from VS Code or over SSH port forwarding.
Here is the sample code:
Example 1
import gradio as gr


def register_model(model_name, local_file_path):
    print(model_name)
    print(local_file_path)
    return 'ok'


# Define the Gradio app
demo = gr.Interface(
    fn=register_model,
    inputs=[
        gr.Textbox(label="Model Name", placeholder="gradio_model"),
        gr.File(label="Local File Path", file_types=['.onnx', '.pth', '.pt'])
    ],
    outputs="text",
    title="Model Registration",
    description="Enter the Model Name and Local File Path to register your model."
)
demo.launch(server_name='0.0.0.0')
Example 2
import gradio as gr

with gr.Blocks() as demo:
    gr.File()
demo.launch()
Both demos freeze with:
gradio==4.24.0
gradio_client==0.14.0
python==3.11.5
I believe the problem lies somewhere in how FastAPI, or whatever is used as the backend, handles where the traffic is coming from. I hope this helps locate the problem.
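To narrow down whether a request is answered at all, it may help to send the same request outside the browser, once against the local address and once against the LAN or forwarded address. The following is a minimal standard-library sketch, not a Gradio-specific tool: `probe` is a hypothetical helper, it bypasses any system proxy on purpose, and the URL and request body you pass in are whatever your own browser's Network tab shows (no Gradio endpoint path is assumed here).

```python
import socket
import urllib.error
import urllib.request


def probe(url, data=None, timeout=5.0):
    """Send one HTTP request and classify what came back.

    Returns (outcome, detail), where outcome is "response", "timeout",
    or "unreachable". A GET is sent when data is None, otherwise a POST.
    """
    # An empty ProxyHandler bypasses environment proxies, so the result
    # reflects the direct path between this machine and the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, data=data)
    try:
        with opener.open(request, timeout=timeout) as response:
            return "response", response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status.
        return "response", err.code
    except (socket.timeout, TimeoutError):
        # Connected, but no reply arrived within the timeout.
        return "timeout", None
    except urllib.error.URLError as err:
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            return "timeout", None
        return "unreachable", str(err.reason)
    except OSError as err:
        # Connection reset or closed without a reply.
        return "unreachable", str(err)
```

For example, comparing `probe("http://127.0.0.1:7860/")` on the server with `probe("http://192.168.0.3:7860/")` from another machine shows whether the page itself is reachable, and passing `data=b"..."` turns the call into a POST. A "timeout" outcome for the POST alongside a "response" for the GET would match the freeze described above, although it would not by itself say which layer drops the reply.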
Is this being fixed?
Having the same issue, is someone looking at this?
Hi, lately I have also been having exactly the same issue.
1) Took a fresh server on Vast.ai (it used to work a couple of weeks back)
2) Installed H2O-GPT (which is built on Gradio)
3) Used ngrok to forward port 7860
Stuck on exactly the same screen.
Describe the bug
I used SSH to connect to a remote server running LLaMA-Factory, which uses Gradio (3.50.2). The project runs on the server and is mapped to my local computer through VS Code's port forwarding. It had been running normally for the past few days, but now it suddenly cannot be used: the interface stays in the loading state and an error is reported in the console. However, when I had someone else log in to my server and perform the same operation, the interface loaded normally for them, so I suspected it had something to do with my browser or network. But the problem persists even after I change the network, switch browsers, and clear the browser cache.
Have you searched existing issues? 🔎
Reproduction
Screenshot
No response
Logs
No response
System Info
Severity
I can work around it