ViperX7 / Alpaca-Turbo

Web UI to run alpaca model locally
GNU Affero General Public License v3.0

No models shown in the 'Choose a model' dropdown #41

Closed JosephSBoyle closed 1 year ago

JosephSBoyle commented 1 year ago

When I access the server from http://127.0.0.1:5000, I can see all the models in the 'Choose a model' dropdown.

When I access the server from another device on my network, e.g. from http://192.168.0.8:5000, I cannot see any models in the dropdown.

However, when I access http://192.168.0.8:5000/list_models, all the models are listed, AND http://192.168.0.8:5000/load_model/4 (for the model I want to use) sends back a success response - but no model appears to be loaded...

RocaroDev93 commented 1 year ago

Alpaca Turbo is unable to load a model when the UI is opened from a remote computer.

The problem is that the request address is hardcoded to "localhost" instead of the actual address of the Alpaca server.

To solve the problem, in the /templates/main.f729fcdb88c6ef0d.js file, replace the line

  const tt_socketUrl="http://localhost:5000"

with

  const tt_socketUrl=window.location.href

window.location.href resolves to whatever URL the page was loaded from, so the frontend connects back to the correct host and port.

JosephSBoyle commented 1 year ago

Thanks @RocaroDev93, that fixed it.

Cabanera commented 1 year ago

@JosephSBoyle which model are you using? For me it gets stuck on loading even on localhost.

JosephSBoyle commented 1 year ago

@Cabanera try this 7B one: https://huggingface.co/Pi3141/alpaca-native-7B-ggml/blob/main/ggml-model-q4_0.bin

My understanding is that the upstream llama.cpp project changed their binary file format, so some of the newer binaries don't work... This one works for me though :)
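
If you want to check which ggml container format a downloaded .bin file actually uses, here is a small stdlib sketch (not part of Alpaca-Turbo; the magic values are the ones llama.cpp has used historically, and the path is just an example):

  import struct

  # The first four bytes of a ggml model file are a little-endian uint32
  # magic identifying the container format.
  MAGICS = {
      0x67676D6C: "ggml (old, unversioned format)",
      0x67676D66: "ggmf (versioned format)",
      0x67676A74: "ggjt (newer, mmap-able format)",
  }

  with open("models/ggml-model-q4_0.bin", "rb") as f:  # adjust the path
      (magic,) = struct.unpack("<I", f.read(4))

  print(MAGICS.get(magic, f"unknown magic {magic:#010x}"))

If the magic doesn't match what the bundled llama.cpp binary expects, the model will fail to load even though the file itself downloaded fine.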

krravi55 commented 1 year ago

I have this message when running it:

  Address already in use
  Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.

How do I change that port to something else?

JosephSBoyle commented 1 year ago

  I have this message when running it:

  Address already in use
  Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.

  How do I change that port to something else?

Try restarting. If that doesn't work, search api.py for 5000 and replace it with something else. Bear in mind this will change the port in the URL you're accessing in your browser.
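
For reference, the spot you're looking for in api.py presumably looks something like this (a minimal sketch assuming the standard Flask pattern; the actual code in the repo may differ):

  from flask import Flask

  app = Flask(__name__)

  if __name__ == "__main__":
      # host="0.0.0.0" binds to all interfaces so other devices on the LAN
      # can connect; change port if 5000 is already taken on your machine.
      app.run(host="0.0.0.0", port=8080)

Whatever port you put there is the one you'll need in the browser URL.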

RocaroDev93 commented 1 year ago

@krravi55 To run the server with a different port, duplicate the api.py file, rename the copy app.py, and then run the server with the flask command:

  1. Copy api.py and rename the copy app.py
  2. Open a terminal
  3. Activate the conda env with the command conda activate alpaca_turbo
  4. Run the server with the command flask run --port=<YOUR-PORT-NUMBER>, e.g. flask run --port=8080
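
As an alternative to copying the whole file, a one-line app.py shim should also work, since flask run looks for an app object in an app.py module (this assumes api.py defines a module-level Flask instance named app; check the actual variable name in the repo):

  # app.py - re-export the Flask app from api.py so `flask run` can find it.
  from api import app

Then flask run --port=8080 behaves exactly as in step 4.
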
RocaroDev93 commented 1 year ago

  I have this message when running it, Address already in use Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port. How do I change that port to something else?

  Try restarting. If that doesn't work, search api.py for 5000 and replace it with something else. Bear in mind this will change the port in the URL you're accessing in your browser.

I tried setting the port parameter of the run function in the api.py file, but the server wasn't reachable on the selected port. The only solution that worked for me was running the server with the flask command and setting the port there. But running a Flask server that way requires an app.py file in the directory. The api.py file is that app.py - the developer just gave it a different name - so renaming (or copying) it to app.py makes the server runnable with the flask command.

krravi55 commented 1 year ago

Running the flask command worked! I tried searching for all instances of "5000" in the folder and changed the port, which appeared in two files and a JavaScript file, but it somehow stuck to 5000. Not sure where that is coming from.
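
For anyone else chasing a port that "sticks", a quick stdlib check to see which ports actually have a listener (5000 and 8080 are just example values):

  import socket

  def port_in_use(port, host="127.0.0.1"):
      # connect_ex returns 0 when something accepts the connection,
      # i.e. the port already has a listener.
      with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
          s.settimeout(0.5)
          return s.connect_ex((host, port)) == 0

  for port in (5000, 8080):
      print(port, "in use" if port_in_use(port) else "free")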

JosephSBoyle commented 1 year ago

@ViperX7, would you accept a PR with this change:

  To solve the problem you can replace const tt_socketUrl="http://localhost:5000" with const tt_socketUrl=window.location.href in the /templates/main.f729fcdb88c6ef0d.js file.

It isn't a breaking change, and it allows users to run inference from other devices.

krravi55 commented 1 year ago

So you are replacing a hardcoded URL value with the URL of the current window. But how do you ask Flask to run on a particular port?

RocaroDev93 commented 1 year ago

The hardcoded "localhost" fix doesn't change anything for the port problem.

If you want the Flask server to run on a different port, use the following command:

  flask run --port 8080

for example, if you want to run it on port 8080.

But before using this command, you need to activate the alpaca_turbo conda env created when you followed the repo tutorial.

Also, the flask command needs an app.py file in the directory, and it doesn't exist: the developer created the required file but named it api.py instead of app.py. (By default, flask run looks for an app.py or wsgi.py module, or for whatever the FLASK_APP environment variable points to.) If you rename api.py to app.py, you'll be able to run the flask command and start the server on a different port.

But instead of renaming api.py to app.py, I suggest creating a copy of api.py and naming the copy app.py. That way you keep the project structure and all the other features that depend on it, like running the project in Docker.

krravi55 commented 1 year ago

I got the site to load. Now I have this:

  n/chat/gpt4all-lora-quantized-OSX-m1 ; exit;
  main: seed = 1680883907
  llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
  llama_model_load: failed to open 'gpt4all-lora-quantized.bin'
  main: failed to load model from 'gpt4all-lora-quantized.bin'

  Saving session... ...copying shared history... ...saving history...truncating history files... ...completed.

aalbrightpdx commented 1 year ago

For anyone else still having the problem where the models won't appear in the drop-down: the problem is that the model's .bin file is missing from the /models/ directory. Go to the models dir, then wget https://huggingface.co/Pi3141/alpaca-native-7B-ggml/blob/main/ggml-model-q4_0.bin, then go to the primary dir and run docker-compose up.

Then go to localhost:5000; the ggml model (or whatever model you put in the models directory) should appear in the dropdown. Click change and hope it finishes loading.

ViperX7 commented 1 year ago

@aalbrightpdx the model that you linked won't work. Check out the announcement channel on Discord for the list of supported models.

ViperX7 commented 1 year ago

  @ViperX7, would you accept a PR with this change:

  To solve the problem you can replace const tt_socketUrl="http://localhost:5000" with const tt_socketUrl=window.location.href in the /templates/main.f729fcdb88c6ef0d.js file.

  It isn't a breaking change, and it allows users to run inference from other devices.

I would love to, but please wait for the next release (should be on Monday); then you can add a PR if required.

aalbrightpdx commented 1 year ago

If I followed the directions in the Windows installer video correctly (I'm using Linux, by the way, so this was slightly confusing), then the correct steps would be:

  1. Go to the /models/ directory
  2. wget https://huggingface.co/Pi3141/alpaca-7b-native-enhanced/resolve/main/ggml-model-q4_1.bin
  3. Go to the primary directory
  4. Run docker-compose up
  5. Go to http://localhost:5000
  6. Click the drop-down and select the model
  7. Click change
  8. Hope that the loading completes

During the video, ViperX7 specifically states that the important part is the Pi3141 account, so I would assume that all of the models under the https://huggingface.co/Pi3141/ URL would probably also work, assuming your system can support them.

krravi55 commented 1 year ago

Thank you guys. The problem was that I hadn't downloaded the correct model. Now it's working. I'm new to this and am just exploring LLMs.

What is the advanced mode in the interface?

FrostKiwi commented 1 year ago

  window.location.href

New v0.6 update: this is still the valid solution for me. Though the name and format of the variable changed, it is still the second socketUrl that you have to change. Search for 7887 (the new port) and replace the second instance:

  nt={production:!1,apiUrl:window.location.href,socketUrl:window.location.href}

However, the prompts don't return an answer for some reason. As seen from the debug screen, the prompt is input, but processing does not start.
This is still the valid solution for me. Though the name of the and format of the variable changed, it is still the second SocketURL that you have to change. Search for 7887, the new port and replace the second instance. However, the prompts don't return an answer for some reason. As seen from the debug screen the prompt is input, but processing does not start. nt={production:!1,apiUrl:window.location.href,socketUrl:window.location.href}