getcursor / cursor

The AI Code Editor
https://cursor.com

Implement Local AI Integration with Ollama for Offline AI Assistance #1811

Open ThalesAugusto0 opened 2 months ago

ThalesAugusto0 commented 2 months ago

Description: We need to enhance Cursor IDE by implementing support for local AI models using Ollama, similar to the Continue extension for VS Code. This will enable developers to use AI-powered code assistance offline, ensuring privacy and reducing dependency on external APIs.

1. Ollama Integration:

Add options in Cursor’s settings to configure and manage local AI models. This should include the ability to switch between different AI providers, like Ollama and any cloud-based alternatives. Implement a configuration UI that allows users to easily select and manage their local AI setups.

2. Performance and Usability:

Optimize the interaction between Cursor and the local AI models to minimize latency and resource usage. Ensure that the local AI features are as seamless and user-friendly as their cloud-based counterparts, with clear feedback on model performance and any potential issues.
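
For context, Ollama already exposes an OpenAI-compatible HTTP API on the local machine, which is the kind of endpoint such an integration could build on. A minimal sketch of what that looks like today (assuming Ollama is installed; the model name is only an example):

# Pull a model and start the local server (default port 11434).
ollama pull qwen2.5-coder:14b
ollama serve

# Ollama also exposes an OpenAI-compatible API under /v1:
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2.5-coder:14b", "messages": [{"role": "user", "content": "Say hi."}]}'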

Air1996 commented 2 months ago

Considering the size and effectiveness of the local model and the commercialization of the Cursor product, the likelihood of this proposal coming to fruition is quite small. πŸ˜‚

ThalesAugusto0 commented 2 months ago

> Considering the size and effectiveness of the local model and the commercialization of the Cursor product, the likelihood of this proposal coming to fruition is quite small. πŸ˜‚

The company doesn't need to do this. Since the code is open, why can't we, the development community, do it ourselves?

vertis commented 2 months ago

Cursor is not open source. This is an issues-only repo.

tcsenpai commented 2 months ago

Bumping this anyway. The company can still monetize via the thousands of devs who do not have a powerful GPU.

Mateleo commented 2 months ago

Check this : https://github.com/getcursor/cursor/issues/1380#issuecomment-2371534354

sneedger commented 1 month ago

> Check this : #1380 (comment)

The devs broke that as well (likely on purpose); they're in it for the money and don't care about you and me.

Mateleo commented 1 month ago

> Check this : #1380 (comment)
>
> The devs broke that as well (likely on purpose), they're in for the money and don't care about you and me

For me it's working perfectly fine using Ollama + ngrok, on the latest version of Cursor.
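
For anyone trying the same workaround, the rough shape of it (a sketch, assuming Ollama on its default port and an ngrok account already set up) is:

ollama serve                 # local OpenAI-compatible API on port 11434
ngrok http 11434             # prints a public https://<something>.ngrok-free.app URL

# In Cursor's model settings: disable the built-in models, enable the custom
# OpenAI API/base URL option, set the URL to https://<something>.ngrok-free.app/v1,
# and enter any placeholder API key.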

tcsenpai commented 1 month ago

> Check this : #1380 (comment)
>
> The devs broke that as well (likely on purpose), they're in for the money and don't care about you and me

I doubt the devs would do something as easily visible as breaking support specifically for Ollama (we're using an OpenAI-compatible endpoint here anyway, so it's pretty generic). In any case, the loss of quality from using 8B models this way is not worth saving 20 bucks per month. They are not in danger.

sneedger commented 1 month ago

> Check this : #1380 (comment)
>
> The devs broke that as well (likely on purpose), they're in for the money and don't care about you and me
>
> For me it's working perfectly fine, using ollama + ngrok. I use the latest version of cursor

I applied your workaround properly, but I keep getting error 403 from ngrok like many other people. Do I need to forward some port, or something else?

astr0gator commented 1 week ago

Hi team, Qwen 2.5 Coder 32B is here and it's rad. Can anybody give us hope regarding implementing Ollama support? Frankly, it's game-changing/deal-breaking for many, including me. I found 7 open issues about implementing local LLMs in this repo.

Thanks! πŸ™πŸ™πŸ™

loktar00 commented 1 week ago

> Hi team, qwen 2.5-coder 32B is here and it's rad. Can anybody give hopes re implementing of ollama support? Frankly, it's game-changing/deal-breaking for many including me. I found 7 open issues about implementing local llm in this repo
>
> Thanks! πŸ™πŸ™πŸ™

It works fine for me; just use ngrok and point it at your Ollama IP. For example, mine runs on my network at 192.168.1.3:1147, so I point ngrok there on that same computer.

On my Cursor installation, under Cursor settings, I turn off all other models, select the custom OpenAI endpoint option, throw in a random API key, and set the OpenAI base URL to

<ngrokurl>/v1

Here's a screenshot of my setup: [image]

I'm also on Windows on both boxes. What I'm honestly trying to figure out is how to make sure I'm using the maximum context; Ollama has a nonstandard API for it.
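
On the context point: the OpenAI-compatible /v1 endpoint doesn't expose Ollama's context-length setting directly. One common approach (a sketch; the model name and context size are just examples) is to bake a larger num_ctx into a derived model via a Modelfile:

# Create a variant of the model with a larger context window.
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:14b
PARAMETER num_ctx 32768
EOF
ollama create qwen2.5-coder-32k -f Modelfile

# Then use "qwen2.5-coder-32k" as the model name in the client.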

diadras commented 5 days ago

> Hi team, qwen 2.5-coder 32B is here and it's rad. Can anybody give hopes re implementing of ollama support? Frankly, it's game-changing/deal-breaking for many including me. I found 7 open issues about implementing local llm in this repo Thanks! πŸ™πŸ™πŸ™
>
> It works fine for me, just use ngrok and point it to your ollama ip, for example mine runs on my network at 192.168.1.3:1147 so I point ngrok there on that same computer.
>
> On my cursor installation under cursor settings I turn off all other models, select use custom openapi endpoint and throw a random api key in and set the openapi url to
>
> <ngrokurl>/v1
>
> Here's a screenshot of my setup [image]
>
> I'm also on windows with both boxes. What I'm trying to figure out honestly is ensuring I'm using maximum context, ollama has a nonstandard api for it.

[image]

I am unable to get it to work.

I disabled all the other models so it would authenticate my Qwen model when I clicked "Verify".

I then see it try to verify and fail. Cursor asks me to run curl https://[MY_HASH].ngrok-free.app/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer [AUTH_TOKEN]" -d '{ "messages": [ { "role": "system", "content": "You are a test assistant." }, { "role": "user", "content": "Testing. Just say hi and nothing else." } ], "model": "qwen2.5-coder:14b" }'.

I do this and I receive an HTML response back from ngrok rather than a JSON completion.

I see `10:38:29.576 CET OPTIONS /v1/chat/completions 403 Forbidden` in my ngrok console. (After all the updates I was never able to get ngrok working; it seemed to simply block all my requests, and since I want to connect to it locally anyway, I focused on that.)

Update: After taking a look at the Ollama API docs, it seems that the OPTIONS /chat/completions request is not compatible with the POST /api/chat Ollama endpoint.
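
A side note on diagnosing that 403: the failing OPTIONS request looks like a CORS preflight, which can be reproduced outside Cursor (a sketch; the ngrok URL and the Origin header value are placeholders):

# Reproduce the browser-style preflight against the tunnel.
curl -i -X OPTIONS https://YOUR_SUBDOMAIN.ngrok-free.app/v1/chat/completions \
  -H "Origin: https://cursor.com" \
  -H "Access-Control-Request-Method: POST"

# If this is what returns 403, the rejection is happening at the tunnel/CORS layer
# rather than in Ollama itself; Ollama can be told to accept extra origins via
# the OLLAMA_ORIGINS environment variable, e.g.:
OLLAMA_ORIGINS="*" ollama serve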

Update v2: I am able to get it to work in the console but not with the OpenAI API layout

[image]

Update v3: According to the Ollama OpenAI compatibility blog, you can enable the OpenAI API schema by appending /v1 to your URL. I did this and it WORKS in my console, but Cursor throws an error and shows me a curl command that nevertheless works in my console.

> curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer 123" -d '{
  "messages": [
    {
      "role": "system",
      "content": "You are a test assistant."
    },
    {
      "role": "user",
      "content": "Testing. Just say hi and nothing else."
    }
  ],
  "model": "qwen2.5-coder:14b"
}'
{"id":"chatcmpl-821","object":"chat.completion","created":1732011716,"model":"qwen2.5-coder:14b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"hi"},"finish_reason":"stop"}],"usage":{"prompt_tokens":28,"completion_tokens":2,"total_tokens":30}}

Cursor will not accept this, though, and keeps saying it is an invalid model and my API key doesn't support it. The Continue.dev extension works like plug and play after I edited the config.json file and simply pointed it at Qwen.
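
For comparison, the Continue setup mentioned above amounts to roughly the following entry in its config file (a sketch based on Continue's Ollama provider; exact fields can vary between versions, and an existing config should be merged rather than overwritten):

# Minimal example of an Ollama-backed model entry for ~/.continue/config.json.
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    {
      "title": "Qwen 2.5 Coder (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder:14b"
    }
  ]
}
EOF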

loktar00 commented 4 days ago

> I am unable to get it to work. [...] Cursor will not accept this though and keep saying it is an invalid model and my API key doesn't support it. The Continue.dev extension works like plug and play after I edited the config.json file and simply pointed it to qwen

Hmm, can you share your URL setting (an image from your settings)? If you check my image/description, I'm mapping ngrok to my IP:port and using the following URL within Cursor:

https://ngrokwahtever.com/v1