When I launched a Gradio web server, I could open my browser and chat with the model. However, the model's answers come out as garbled text.
How can I fix this problem? There is no error message.
I'm trying to run LLaVA on two RTX 4090 GPUs for inference. The model loads onto the GPUs without any issues, but an error occurs at inference time when I run the sample example from the Gradio we…