gradio-app / gradio

Gradio interface always displays loading and cannot load normally. #7519

Open xh9487963 opened 7 months ago

xh9487963 commented 7 months ago

Describe the bug

I used SSH to connect to a remote server running LLaMA-Factory, which uses Gradio 3.50.2. The project runs on the server and is mapped to my local machine through VS Code's port forwarding. It ran normally for the past few days, but now it suddenly cannot be used: the interface keeps showing "loading" and an error is reported in the browser console. However, when I had someone else log in to my server and perform the same operation, they could load the interface normally, so I suspect it may have something to do with my browser or network. But the problem persists even after changing networks, changing browsers, and clearing the browser cache.

(Two screenshots attached: the interface stuck on loading, and the console error.)
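
Worth noting: Gradio 3.x delivers queued events over a WebSocket at /queue/join, and a tunnel that forwards HTTP but not WebSocket upgrades shows exactly this symptom (the page renders, then spins forever with console errors). Here is a small probe, as a sketch, to test the WebSocket handshake through the forwarded port; the URL is a placeholder for whatever address fails for you:

import asyncio
import websockets  # third-party: pip install websockets

# Sketch: Gradio 3.x queued apps expect a WebSocket at /queue/join. If
# normal pages load through the tunnel but this handshake fails, the
# forwarder is dropping WebSocket upgrades.
async def probe(url="ws://127.0.0.1:7860/queue/join"):
    try:
        async with websockets.connect(url, open_timeout=5):
            print("WebSocket handshake OK")
    except Exception as e:
        print("WebSocket handshake failed:", type(e).__name__, e)

asyncio.run(probe())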

Have you searched existing issues? 🔎

Reproduction

import gradio as gr

Screenshot

No response

Logs

No response

System Info

My Gradio version is 3.50.2.

Severity

I can work around it

arian81 commented 7 months ago

You should provide your entire code; your reproduction code is incomplete.

abidlabs commented 7 months ago

Hi @xh9487963, does restarting the Gradio app fix the problem? It's hard to say more without more details; unfortunately this error doesn't ring a bell, and the app is also using an older, unsupported version of Gradio (3.50.2).
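
A quick way to split app from network (a sketch, not something from the thread): run a bare Gradio app on the same server and open it through the same forwarded port. If this loads while LLaMA-Factory does not, suspect the app or its pinned Gradio version; if this also hangs, suspect the forwarding path:

import gradio as gr

# Bare-bones echo app on the same port the real app would use. No queue,
# no models: if even this stays stuck on "loading" through the forwarded
# port, the network path is at fault, not LLaMA-Factory.
def echo(text):
    return text

demo = gr.Interface(fn=echo, inputs="text", outputs="text")
demo.launch(server_port=7860)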

xh9487963 commented 7 months ago

> You should provide your entire code; your reproduction code is incomplete.

Sorry, the complete code is very complicated because it is from someone else's open-source project (https://github.com/hiyouga/LLaMA-Factory). I ran it according to its README.md.

xh9487963 commented 7 months ago

> Hi @xh9487963, does restarting the Gradio app fix the problem? It's hard to say more without more details; unfortunately this error doesn't ring a bell, and the app is also using an older, unsupported version of Gradio (3.50.2).

No: the author of that project recommends using version 3.50.2. Should I switch to another version and try again? To be honest, what confuses me the most is that connecting to the server from another PC and running the exact same operation works fine. The error I have here seems to be network-communication related.

arian81 commented 7 months ago

> Hi @xh9487963, does restarting the Gradio app fix the problem? It's hard to say more without more details; unfortunately this error doesn't ring a bell, and the app is also using an older, unsupported version of Gradio (3.50.2).

I'm getting the same error on the latest version of Gradio.

xh9487963 commented 7 months ago

> I'm getting the same error on the latest version of Gradio.

Did you manage to resolve this issue?

Harsh-raj commented 7 months ago

Hi, I had a similar error while trying to launch a Gradio interface to chat with the phi-1_5 language model. I am using the code provided by the Microsoft phi-1_5 repository, plus this inference.py script to deploy the Gradio chat for interacting with the phi-1_5 model.

Here is the code:

import gradio as gr 
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
from threading import Thread

# Loading the tokenizer and model from Hugging Face's model hub.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", torch_dtype="auto", trust_remote_code=True)

# using CUDA for an optimal experience
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Defining a custom stopping criteria class for the model's text generation.
class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = [2]  # IDs of tokens where the generation should stop.
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:  # Checking if the last generated token is a stop token.
                return True
        return False

# Function to generate model predictions.
def predict(message, history):
    history_transformer_format = history + [[message, ""]]
    stop = StopOnTokens()

    # Formatting the input for the model.
    messages = "</s>".join(["</s>".join(["\n<|user|>:" + item[0], "\n<|assistant|>:" + item[1]])
                       for item in history_transformer_format])

    model_inputs = tokenizer([messages], return_tensors="pt").to(device)
    streamer = TextIteratorStreamer(tokenizer, timeout=10., skip_prompt=True, skip_special_tokens=True)
    generate_kwargs = dict(
        model_inputs,
        streamer=streamer,
        max_new_tokens=1024,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.7,
        num_beams=1,
        stopping_criteria=StoppingCriteriaList([stop])
    )
    t = Thread(target=model.generate, kwargs=generate_kwargs)
    t.start()  # Starting the generation in a separate thread.
    partial_message = ""
    for new_token in streamer:
        partial_message += new_token
        if '</s>' in partial_message:  # Breaking the loop if the stop token is generated.
            break
        yield partial_message

# Setting up the Gradio chat interface.
g = gr.ChatInterface(predict,
                 title="Phi1.5",
                 )
g.launch()  # Launching the web interface.

Here is the terminal output:

/home/phi/.local/lib/python3.8/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Running on local URL:  http://127.0.0.1:7860
(Two screenshots attached showing the stuck interface and the console errors.)

Please help. I was able to chat with TinyLLaMA with this code without any error, but when I change the model to phi-1_5 it throws this error.
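
One way to split model from transport here (a sketch): temporarily replace g.launch() at the bottom of the script with a direct call to predict(), so generation runs with no web frontend at all. If streaming works in the terminal, the model and threading are fine and the failure is in the browser-to-server leg:

# Sketch: predict() is a generator, so iterate it to completion and print
# the final partial message. Run this in place of g.launch().
last = ""
for partial in predict("Hello! Who are you?", []):
    last = partial
print(last)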

xudong2019 commented 6 months ago

I have the same issue

Vladimir-125 commented 6 months ago

I have a similar problem. It works when you access the port locally on the same host where you started the app. However, if I forward the port, or even try to access the host using its LAN IP (e.g. http://192.168.0.3:7860), it does not work. For me, the main page opens, but when I try to upload a file it freezes.

If you open the developer tools in your browser and check the Network tab, you can see it tries to make a POST request to upload the file, but it never receives a response from the server and then errors out. I have had the same issue with other Gradio apps before, whenever I could not reach the host's port directly and had to forward it from VS Code or over SSH port forwarding.

Here is the sample code:

Example 1

import gradio as gr

def register_model(model_name, local_file_path):
    print(model_name)
    print(local_file_path)

    return 'ok'

# Define the Gradio app
demo = gr.Interface(
    fn=register_model,
    inputs=[
        gr.Textbox(label="Model Name", placeholder="gradio_model"),
        gr.File(label="Local File Path", file_types=['.onnx', '.pth', '.pt'])
    ],
    outputs="text",
    title="Model Registration",
    description="Enter the Model Name and Local File Path to register your model."
)
demo.launch(server_name='0.0.0.0')

Example 2

import gradio as gr 

with gr.Blocks() as demo:
    gr.File()

demo.launch()

Both demo codes freeze with:

gradio==4.24.0
gradio_client==0.14.0

and python==3.11.5. I believe the problem lies somewhere in how FastAPI (or whatever is used as the backend) handles where the traffic comes from. Hope this helps locate the problem.
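
Since the symptom is a POST that never gets a response, a quick probe (a sketch; the LAN IP below stands in for whatever address fails for you) is to fetch Gradio's /config route, which the frontend requests on page load, from both addresses and compare:

import requests

# Compare the same Gradio route reached locally vs. over the failing path.
# /config is fetched by the frontend on page load, so a hang or timeout
# here reproduces the freeze without involving a browser at all.
for base in ("http://127.0.0.1:7860", "http://192.168.0.3:7860"):
    try:
        r = requests.get(f"{base}/config", timeout=5)
        print(base, "->", r.status_code, len(r.content), "bytes")
    except requests.RequestException as e:
        print(base, "-> failed:", e)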

luvwinnie commented 6 months ago

Is this being fixed?

hbx233 commented 3 months ago

Having the same issue. Is someone looking at this?

rohitnanda1443 commented 1 month ago

Hi, of late I have also been having exactly the same issue.

1) Took a fresh server on Vast.ai (it used to work a couple of weeks back)
2) Installed H2O-GPT (which is built on Gradio)
3) Used ngrok to forward port 7860

Stuck on exactly the same screen.
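
If the ngrok leg is the part that stalls, one workaround to try (a sketch; whether H2O-GPT exposes its launch call for this is an assumption) is Gradio's built-in tunnel, which is designed around Gradio's own event transport:

import gradio as gr

# share=True opens Gradio's own public tunnel (a *.gradio.live URL),
# sidestepping ngrok entirely. Useful for telling a tunneling problem
# apart from an app problem.
demo = gr.Interface(fn=lambda x: x, inputs="text", outputs="text")
demo.launch(share=True)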