containers / ramalama


`ramalama serve` doesn't work when podman is installed #442

Closed grillo-delmal closed 1 week ago

grillo-delmal commented 1 week ago

### What is the problem

When running ramalama on a computer with podman installed, if you try to set up a server through `ramalama serve`, the server will be inaccessible from the host computer.

### How to reproduce

url = "http://localhost:8080/v1/chat/completions" headers = { "Content-Type": "application/json" } data = { "messages": [ {"role": "user", "content": "Hello, world!"} ] }


### What is happening?

First, here is the difference between running ramalama with and without podman installed.

Without podman installed

```sh
ramalama --dryrun serve tiny
> os.execvp(llama-server, ['llama-server', '--port', '8080', '-m', '/path/to/model'])
```

With podman installed

```sh
ramalama --dryrun serve tiny
> podman run --rm -i --label RAMALAMA ... quay.io/ramalama/ramalama:latest llama-server --port 8080 -m /path/to/mod
```

As can be seen, both run the llama-server command without the `--host` parameter, so llama-server defaults to listening on 127.0.0.1. That makes sense in the first case, but in the second case it means the server cannot receive requests coming from the host environment, because it only listens on the container's loopback interface.
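
To illustrate, here is a minimal sketch (the function and parameter names are my own, not RamaLama's) of how the server argument list could gain a `--host 0.0.0.0` entry only when the command is going to run inside a container:

```python
# Hypothetical sketch: build the llama-server argument list and add --host
# only when the command will actually execute inside a container.
def build_server_args(model_path: str, port: int, runs_in_container: bool) -> list[str]:
    args = ["llama-server", "--port", str(port), "-m", model_path]
    if runs_in_container:
        # Inside the container 127.0.0.1 is unreachable from the host even
        # though the port is published, so bind to all interfaces instead.
        args += ["--host", "0.0.0.0"]
    return args
```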

I was able to verify this by entering the container through `podman exec` and successfully running the Python script from inside the container.

grillo-delmal commented 1 week ago

I believe that this problem was a byproduct of 8ed6f48. Before this commit, ramalama ran inside the container and processed the arguments from the inside, so it was able to properly evaluate whether the llama-server command was going to run in a container, since the check itself was running inside the container.

Here is the check: https://github.com/containers/ramalama/blob/main/ramalama/model.py#L320-L321

Here is how that check is being evaluated right now: https://github.com/containers/ramalama/blob/main/ramalama/common.py#L17-L21
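
For reference, such a check usually looks something like the sketch below; this is an assumption about the general pattern, not a copy of the linked `common.py`:

```python
import os

# Rough sketch of a typical "am I inside a container?" check. The exact
# files and environment variables inspected by ramalama may differ.
def in_container() -> bool:
    return (
        os.path.exists("/run/.containerenv")    # created by podman
        or os.path.exists("/.dockerenv")        # created by docker
        or os.getenv("container") is not None   # set by some container runtimes
    )
```

The point is that this answers "is this process already inside a container?", which is no longer the right question once ramalama itself runs on the host and only the generated llama-server command runs in a container.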

Adding an extra evaluation to check whether the command will end up running inside a container would help fix this issue. That determination is currently being made inside the exec_model_in_container method: https://github.com/containers/ramalama/blob/main/ramalama/model.py#L219-L223
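
Something along these lines would capture that extra evaluation (the names are hypothetical; #444 may implement it differently):

```python
# Hypothetical sketch: the --host decision should not only ask "am I already
# inside a container?" but also "will this command be launched inside a
# container by podman/docker?".
def will_run_in_container(use_container: bool, engine_found: bool) -> bool:
    # ramalama now runs on the host and may hand the command to a container engine.
    return use_container and engine_found

def needs_host_flag(in_container_now: bool, will_containerize: bool) -> bool:
    # Bind llama-server to 0.0.0.0 whenever it ends up behind a container boundary.
    return in_container_now or will_containerize
```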

grillo-delmal commented 1 week ago

Fixed by merging #444 ^^