
Wingman - AI Coding Assistant

The Wingman-AI extension brings high-quality, AI-assisted coding right to your computer. It's 100% free and, when you use a local provider such as Ollama, completely private: your data never leaves your machine!

🚀 Getting Started

Choosing an AI Provider

We recommend starting with Ollama using the Deepseek model(s); see why here or here.

That's it! The extension will validate that the models are configured correctly in its VSCode settings upon launch. If you wish to customize which models run, see the FAQ section.
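If you already have Ollama installed, a quick way to get going is to pull the default models up front. A minimal sketch, assuming the defaults listed in the FAQ below (any supported model tag works the same way):

```sh
# Pull the extension's default models (see the FAQ below)
ollama pull deepseek-coder:6.7b-base-q8_0      # code completion
ollama pull deepseek-coder:6.7b-instruct-q8_0  # chat
```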

Features

Code Completion

The AI will look for natural pauses in typing to decide when to offer code suggestions (keep in mind that speed is limited by your machine's hardware). The code completion feature will also analyze comments you type and generate suggestions based on that context.

Wingman AI code completion example

Code Completion Disable / HotKey

We understand that sometimes the code completion feature can be too aggressive, which may strain your system's resources during local development. To address this, we have introduced an option to disable automatic code completion. However, we also recognize the usefulness of on-demand completion. Therefore, we've implemented a hotkey that allows you to manually trigger code completion at your convenience.

When you need assistance, simply press Shift + Ctrl + Space. This will bring up a code completion preview right in the editor and a quick action will appear. If you're satisfied with the suggested code, you can accept it by pressing Enter. This provides you with the flexibility to use code completion only when you want it, without the overhead of automatic triggers.

Interactive Chat

Talk to the AI naturally! It will use open files as context to answer your question, or you can select a section of code to use as context. Chat will also analyze comments you type and generate suggestions based on that context.

Wingman AI chat example

AI Providers

Ollama

Ollama is a free and open-source AI model runner that lets users run their own models locally.

Why Ollama?

Ollama was chosen for its simplicity, allowing users to pull a number of models in different configurations and update them at will. Ollama will pull models optimized for your system architecture; however, if you do not have a GPU-accelerated machine, models will run more slowly.

Setting up Ollama

Follow the directions on the Ollama website. Ollama has a number of open source models available that are capable of writing high quality code. See getting started for how to pull and customize models.
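Once Ollama is installed and running, you can sanity-check that the extension will be able to reach it. A minimal check, assuming Ollama's default local port of 11434:

```sh
# Lists the models currently installed in your local Ollama instance
curl http://localhost:11434/api/tags
```

If this returns a JSON list of models, Ollama is up and the extension should be able to connect.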

Supported Models

The extension uses a separate model for chat and code completion. Different types of models have different strengths, so mixing and matching offers the best results.

NOTE - You can use any quantization of a supported model; you are not limited to a specific one.

Example: deepseek-coder:6.7b-instruct-q4_0
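For example, to trade a little quality for a smaller memory footprint, you can pull a lighter quantization of the same model and confirm what's installed:

```sh
# Pull a 4-bit quantization of the chat model, then list local models
ollama pull deepseek-coder:6.7b-instruct-q4_0
ollama list
```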

Supported Models for Code Completion:

Supported Models for Chat:

OpenAI

OpenAI is supported! You can use the following models:

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.
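If you want to verify your API key before configuring the extension, a quick smoke test against OpenAI's chat completions endpoint works well. This is just an illustration; the model name is an example, so substitute any model your account can access:

```sh
# One-off request to confirm the key is valid (expects an OPENAI_API_KEY env var)
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```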

Anthropic

Anthropic is supported! You can use the following models:

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.
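As with OpenAI, you can smoke-test your key against Anthropic's messages endpoint before configuring the extension. The model name below is only an example:

```sh
# One-off request to confirm the key is valid (expects an ANTHROPIC_API_KEY env var)
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model": "claude-3-opus-20240229", "max_tokens": 64, "messages": [{"role": "user", "content": "Hello"}]}'
```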

Hugging Face

Hugging Face supports hosting and training models, but it also lets you run many models (under 10GB) for free! All you have to do is create a free account.

Setting up Hugging Face

Once you have a Hugging Face account and an API key, all you need to do is open the VSCode settings pane for this extension "Wingman" (see FAQ).

Once it's open, select "HuggingFace" as the AI Provider and add your API key under the HuggingFace section.
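To confirm the API key itself is valid, you can query Hugging Face's whoami endpoint (replace the placeholder token with your own):

```sh
# Returns your account details if the token is valid, an error otherwise
curl https://huggingface.co/api/whoami-v2 \
  -H "Authorization: Bearer hf_your_token_here"
```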

Supported Models

The extension uses a separate model for chat and code completion. Different types of models have different strengths, so mixing and matching offers the best results.

Supported Models for Code Completion:

Supported Models for Chat:

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.


FAQ

Troubleshooting

This extension leverages Ollama due to its simplicity and its ability to deliver a build optimized for your running environment. However, good AI performance relies on your machine's specs, so if you do not have GPU acceleration, responses may be slow. During startup, the extension will verify the models you have configured in the VSCode settings pane for this extension. The extension does have some defaults:

Code Model - deepseek-coder:6.7b-base-q8_0

Chat Model - deepseek-coder:6.7b-instruct-q8_0

The models above require enough RAM to run correctly; you should have at least 12GB of RAM on your machine if you are running them. If you don't have enough RAM, choose a smaller model, but be aware that it won't perform as well. Also see information on model quantization.
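As a rough rule of thumb (an estimate, not an official requirement): a q8_0 quantization stores about one byte per parameter, so a 6.7B-parameter model needs roughly 7GB for the weights alone, before counting context and everything else running on your machine, which is why 12GB is a comfortable floor. A q4_0 quantization roughly halves the weight footprint to around 3.5-4GB.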

Release Notes

To see the latest release notes, check out our releases page.


If you like the extension, please leave a review! If you don't, open an issue and we'd be happy to assist!

Enjoy!