sagemathinc / cocalc-compute-docker

Docker image for adding remote compute capabilities to a CoCalc project.

webui/ollama: progress regarding api proxying #5

Closed by haraldschilly 7 months ago

haraldschilly commented 7 months ago

There is "progress" regarding accessing the ollama api endpoint, but no success yet; at least, I haven't found a way to make it work.

  1. The proxy config was certainly wrong. I changed the JSON file and restarted the proxy service. The file should look more like this:
$ cat /app/proxy.json 
[
        { "path": "/ollama-api", "target": "http://localhost:11434/api" },
        { "path": "/", "target": "http://localhost:3000", "ws": true }
]

I changed the ordering so the more specific /ollama-api path is matched first, and added the ollama- prefix because the webui also uses its own /api.
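The ordering matters because the proxy picks the first route whose path prefixes the request URL. A minimal sketch of that selection logic (illustrative only, assuming first-match semantics; this is not the actual @cocalc/compute-server-proxy code):

```javascript
// Routes are tried in order; the first whose "path" prefixes the
// request URL wins. With "/" listed first, it would shadow
// "/ollama-api" -- which is why the ordering change above mattered.
const routes = [
  { path: "/ollama-api", target: "http://localhost:11434/api" },
  { path: "/", target: "http://localhost:3000", ws: true },
];

function pickTarget(url, routes) {
  for (const { path, target } of routes) {
    if (url.startsWith(path)) return target;
  }
  return null; // no route matched
}

console.log(pickTarget("/ollama-api/tags", routes)); // ollama backend
console.log(pickTarget("/some/page", routes)); // webui backend
```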

  2. With that, I can issue a GET request from my local computer like this:
$ curl -k https://35.[xx.yy.zz]/ollama-api/tags?auth_token=[SECRET]
{"models":[{"name":"gemma:latest","model":"gemma:latest","modified_at":...

That endpoint should list all available models and their capabilities, and indeed I get the information. Great!

What fails is doing an actual query using a POST request:

$ curl -k https://35.XX.YYY.ZZZ/ollama-api/generate?auth_token=[SECRET] -d '{ "model" : "gemma" , "stream": false, "prompt" : "what comes after fruit, apple, banana?" } ' 
Found. Redirecting to 

and when I add -L to follow redirects, I end up with an HTML page asking: <p>Please enter the authentication token to proceed:</p>

I also tried setting the AUTH_TOKEN=... cookie directly via curl, but that just hangs and somehow crashes ollama (to get it working again, I have to restart the ollama docker container).

haraldschilly commented 7 months ago

For comparison, with an ssh tunnel, it works:

$ curl -s http://localhost:11434/api/generate -d '{ "model" : "gemma" , "stream": false, "prompt" : "what comes after horse, car, and airplane?" } ' | jq .response
"The answer is rocket.\n\nHorse, car, and airplane are all vehicles that are used on land, while rocket is a vehicle that is used in space."
williamstein commented 7 months ago

What fails is doing an actual query using a POST request:

I just looked at the code and can see exactly why this doesn't work. The code assumes that if the request is a POST request and the cookie is not set, then the user is submitting the sign-in page, so it just redirects them to the page they really wanted to go to. Thus this:

$ curl -k https://35.XX.YYY.ZZZ/ollama-api/generate?auth_token=[SECRET] -d '{ "model" : "gemma" , "stream": false, "prompt" : "what comes after fruit, apple, banana?" } ' 
Found. Redirecting to 

does not proxy the POST, and instead does a redirect. It even says so above: "Found. Redirecting to".

I'll plan to do two things:

  1. ONLY do the redirect if the POST request explicitly sets the returnTo field, which is what the login does. Since you are not setting that, your POST request should start working.
  2. Obviously, for API use, it would be good to pass the auth_token in a standard way for API auth, instead of as a query param.
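The first proposed change can be sketched as a small guard (hypothetical names; the real auth.js is more involved): a cookie-less POST is redirected only when it explicitly carries the returnTo field that the sign-in form sets, so API POSTs like the curl call above fall through to the proxy.

```javascript
// Sketch of the planned redirect guard (hypothetical, not the real
// @cocalc/compute-server-proxy auth.js). Only a cookie-less POST that
// explicitly set returnTo -- i.e. a sign-in form submission -- gets
// redirected; everything else is proxied as usual.
function handlePost(req) {
  const hasAuthCookie = Boolean(req.cookies && req.cookies.AUTH_TOKEN);
  const returnTo = req.body && req.body.returnTo;
  if (!hasAuthCookie && returnTo != null) {
    return { action: "redirect", location: returnTo };
  }
  return { action: "proxy" };
}

// Sign-in form POST: redirected to the page the user wanted.
console.log(handlePost({ cookies: {}, body: { returnTo: "/app" } }));
// API POST (no returnTo): proxied through.
console.log(handlePost({ cookies: {}, body: { model: "gemma" } }));
```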
haraldschilly commented 7 months ago

After figuring out that the running code was still an old version of the proxy, I managed to update it. Now it fails like this (i.e. there is no req.body):

$ curl -k https://$IP/ollama-api/generate?auth_token=$TOKEN -d '{ "model" : "gemma", "stream": false, "prompt": "what comes after writing, reading, learning?" }'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>TypeError: Cannot read properties of undefined (reading &#39;cocalcReturnTo_0.7010720816630611&#39;)<br> &nbsp; &nbsp;at handle (/opt/proxy/nvm/versions/node/v20.11.1/lib/node_modules/@cocalc/compute-server-proxy/dist/lib/auth.js:55:54)<br> &nbsp; &nbsp;at Layer.handle [as handle_request] 
[...]
haraldschilly commented 7 months ago

Ok, I got it to work after two changes:

sudo vim  /opt/proxy/nvm/versions/node/v20.11.1/lib/node_modules/@cocalc/compute-server-proxy/dist/lib/auth.js

and edited lines 55 and 72, changing req.body[COCALC_AUTH_RETURN_TO] to req.body?.[COCALC_AUTH_RETURN_TO]
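The optional-chaining fix works because req.body is undefined when no body-parsing middleware handled the request, so plain indexing throws the TypeError seen above while ?. simply yields undefined. A small demonstration:

```javascript
// req.body is undefined when no body parser ran for the request,
// as in the error page above.
const KEY = "cocalcReturnTo"; // stand-in for the actual cookie field name
const req = {}; // no req.body

let threw = false;
try {
  req.body[KEY]; // TypeError: Cannot read properties of undefined
} catch (err) {
  threw = err instanceof TypeError;
}

const safe = req.body?.[KEY]; // undefined, no throw
console.log(threw, safe);
```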

haraldschilly commented 7 months ago

:tada: ... this now works from my local computer, where IP and TOKEN are shell variables holding the server's address and auth token:

$ curl -s -k https://$IP/ollama-api/generate?auth_token=$TOKEN -d '{ "model" : "gemma", "stream": false, "prompt": "Explain iterators in python to me", "system": "be brief"}' | jq -r .response
Iterators are objects in Python that allow you to iterate over a sequence of items one item at a time. They are lazily evaluated, meaning that they don't store all the items in memory at once, but instead generate them on demand when you need them.

Here are some key concepts related to iterators:

**1. Iterables:**
- Iterables are objects that can be iterated over, such as lists, sets, dictionaries, and strings.

**2. Iterators:**
- Iterators are objects that can be used to iterate over an iterable.

[...]