nvms / wingman

Your pair programming wingman. Supports OpenAI, Anthropic, or any LLM on your local inference server.
https://marketplace.visualstudio.com/items?itemName=nvms.ai-wingman
ISC License

Wingman v2.0.8, local model is not supported #27

Closed. NK-Aero closed this issue 8 months ago.

NK-Aero commented 8 months ago

Wingman v2.0.8: local models are not supported. The LM Studio and KoboldCpp URLs are not working. Please provide documentation, a tutorial, or a video about enabling local models. Thank you.

nvms commented 8 months ago

I'm unfamiliar with LM Studio, but I just downloaded it and started a local inference server running Llama 2 Chat 7B q4_0 GGML. Using the OpenAI provider in Wingman, I changed the URL to http://localhost:1234/v1/chat/completions and things seem to just work (with the exception of recognizing the end of the completion stream, which may be unique to LM Studio, but is a bug nonetheless).
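For anyone else hitting this: the key detail is that the provider URL must point at the full chat-completions path, not just the API base. A minimal sketch (plain fetch against LM Studio's OpenAI-compatible endpoint, not Wingman's actual code; the port 1234 is LM Studio's default and the model name is a placeholder since LM Studio uses whatever model is loaded) to confirm the local server is reachable:

```typescript
// Sanity check against LM Studio's OpenAI-compatible local server.
async function checkLocalServer(): Promise<void> {
  const res = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // placeholder; LM Studio serves the loaded model
      messages: [{ role: "user", content: "Say hello in one word." }],
    }),
  });
  const data = await res.json();
  // OpenAI-style response shape: choices[0].message.content
  console.log(data.choices?.[0]?.message?.content);
}

checkLocalServer().catch(console.error);
```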

Can you provide some more information regarding your setup?

KoboldCpp support is almost finished - still porting this functionality from the previous major version.

[Screenshot: 2023-12-04 at 2:01:28 PM]

NK-Aero commented 8 months ago

Thank you, it is working now. I was using "http://localhost:1234/v1"; after changing it to "http://localhost:1234/v1/chat/completions", it works.

I was using the Wingman 1.3.8 preview, and it was excellent. The upgrade went well too. Thank you.

nvms commented 8 months ago

> with the exception of recognizing the end of the completion stream, which may be unique to LM Studio, but is a bug nonetheless

This is fixed now, and the completion stream from LM Studio should correctly end the response. Pushing a release with this bug fix now.
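For context on what "recognizing the end of the completion stream" involves: OpenAI-style streaming responses arrive as server-sent events, where each chunk is a `data:` line and the stream normally ends with a `data: [DONE]` sentinel, but some local servers simply close the connection instead. A rough sketch of handling both cases (illustrative only, not Wingman's actual implementation; it assumes each read returns whole SSE lines):

```typescript
// Read an OpenAI-style SSE completion stream, treating either the
// "[DONE]" sentinel or the stream closing as the end of the response.
async function readCompletionStream(res: Response): Promise<string> {
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break; // server closed the stream without a sentinel
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
      const payload = line.replace(/^data:\s*/, "").trim();
      if (!payload) continue;
      if (payload === "[DONE]") return text; // explicit end-of-stream marker
      const chunk = JSON.parse(payload);
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  return text;
}
```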

nvms commented 8 months ago

> Thank you, it is working now. I was using "http://localhost:1234/v1"; after changing it to "http://localhost:1234/v1/chat/completions", it works.
>
> I was using the Wingman 1.3.8 preview, and it was excellent. The upgrade went well too. Thank you.

Excellent! Glad it's working. Have fun!