smallcloudai / refact

WebUI for Fine-Tuning and Self-hosting of Open-Source Large Language Models for Coding
https://refact.ai
BSD 3-Clause "New" or "Revised" License
1.5k stars 100 forks source link
ai autocompletion chat developer-tools devtools fine-tuning llama2 llms refactoring self-hosted starchat starcoder wizardlm

Black Refact Logo

This repo consists Refact WebUI for fine-tuning and self-hosting of code models, that you can later use inside Refact plugins for code completion and chat. --- [![Discord](https://img.shields.io/discord/1037660742440194089?logo=discord&label=Discord&link=https%3A%2F%2Fsmallcloud.ai%2Fdiscord)](https://smallcloud.ai/discord) [![Twitter Follow](https://img.shields.io/twitter/follow/refact_ai)](https://twitter.com/intent/follow?screen_name=refact_ai) ![License](https://img.shields.io/github/license/smallcloudai/refact?cacheSeconds=1000) [![Visual Studio](https://img.shields.io/visual-studio-marketplace/d/smallcloud.codify?label=VS%20Code)](https://marketplace.visualstudio.com/items?itemName=smallcloud.codify) [![JetBrains](https://img.shields.io/jetbrains/plugin/d/com.smallcloud.codify?label=JetBrains)](https://plugins.jetbrains.com/plugin/20647-codify) - [x] Fine-tuning of open-source code models - [x] Self-hosting of open-source code models - [x] Download and upload Lloras - [x] Use models for code completion and chat inside Refact plugins - [x] Model sharding - [x] Host several small models on one GPU - [x] Use OpenAI and Anthropic keys to connect GPT-models for chat ![self-hosting-refact](https://github.com/smallcloudai/refact/assets/5008686/18e48b42-b638-4606-bde0-cadd47fd26e7) ### Running Refact Self-Hosted in a Docker Container The easiest way to run the self-hosted server is a pre-build Docker image. Install [Docker with NVidia GPU support](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker). On Windows you need to install WSL 2 first, [one guide to do this](https://docs.docker.com/desktop/install/windows-install). Run docker container with following command: ```commandline docker run -d --rm --gpus all --shm-size=256m -p 8008:8008 -v refact-perm-storage:/perm_storage smallcloud/refact_self_hosting:latest ``` `perm-storage` is a volume that is mounted inside the container. All the configuration files, downloaded weights and logs are stored here. To upgrade the docker, delete it using `docker kill XXX` (the volume `perm-storage` will retain your data), run `docker pull smallcloud/refact_self_hosting` and run it again. Now you can visit http://127.0.0.1:8008 to see the server Web GUI.
Docker commands super short refresher Add your yourself to docker group to run docker without sudo (works for Linux): ```commandline sudo usermod -aG docker {your user} ``` List all containers: ```commandline docker ps -a ``` Start and stop existing containers (stop doesn't remove them): ```commandline docker start XXX docker stop XXX ``` Shows messages from a container: ```commandline docker logs -f XXX ``` Remove a container and all its data (except data inside a volume): ```commandline docker rm XXX ``` Check out or delete a docker volume: ```commandline docker volume inspect VVV docker volume rm VVV ```
See [CONTRIBUTING.md](CONTRIBUTING.md) for installation without a docker container. ### Setting Up Plugins Download Refact for [VS Code](https://marketplace.visualstudio.com/items?itemName=smallcloud.codify) or [JetBrains](https://plugins.jetbrains.com/plugin/20647-refact-ai). Go to plugin settings and set up a custom inference URL `http://127.0.0.1:8008`
JetBrains Settings > Tools > Refact.ai > Advanced > Inference URL
VSCode Extensions > Refact.ai Assistant > Settings > Infurl
## Supported models | Model | Completion | Chat | Fine-tuning | [Deprecated](## "Will be removed in next versions") | |---------------------------------------------------------------------------------------------------|------------|------|-------------|-----------------------------------------------------| | [Refact/1.6B](https://huggingface.co/smallcloudai/Refact-1_6B-fim) | + | | + | | | [starcoder/1b/base](https://huggingface.co/smallcloudai/starcoderbase-1b) | + | | + | + | | [starcoder/3b/base](https://huggingface.co/smallcloudai/starcoderbase-3b) | + | | + | + | | [starcoder/7b/base](https://huggingface.co/smallcloudai/starcoderbase-7b) | + | | + | + | | [starcoder/15b/base](https://huggingface.co/TheBloke/starcoder-GPTQ) | + | | | + | | [starcoder/15b/plus](https://huggingface.co/TheBloke/starcoderplus-GPTQ) | + | | | + | | [starcoder2/3b/base](https://huggingface.co/bigcode/starcoder2-3b) | + | | + | | | [starcoder2/7b/base](https://huggingface.co/bigcode/starcoder2-7b) | + | | + | | | [starcoder2/15b/base](https://huggingface.co/bigcode/starcoder2-15b) | + | | + | | | [wizardcoder/15b](https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ) | + | | | + | | [codellama/7b](https://huggingface.co/TheBloke/CodeLlama-7B-fp16) | + | | + | + | | [starchat/15b/beta](https://huggingface.co/TheBloke/starchat-beta-GPTQ) | | + | | + | | [wizardlm/7b](https://huggingface.co/TheBloke/WizardLM-7B-V1.0-Uncensored-GPTQ) | | + | | + | | [wizardlm/13b](https://huggingface.co/TheBloke/WizardLM-13B-V1.1-GPTQ) | | + | | + | | [wizardlm/30b](https://huggingface.co/TheBloke/WizardLM-30B-fp16) | | + | | + | | [llama2/7b](https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ) | | + | | + | | [llama2/13b](https://huggingface.co/TheBloke/Llama-2-13B-chat-GPTQ) | | + | | + | | [deepseek-coder/1.3b/base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) | + | | + | | | [deepseek-coder/5.7b/mqa-base](https://huggingface.co/deepseek-ai/deepseek-coder-5.7bmqa-base) | + | | + | | | [magicoder/6.7b](https://huggingface.co/TheBloke/Magicoder-S-DS-6.7B-GPTQ) | | + | | | | [mistral/7b/instruct-v0.1](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GPTQ) | | + | | | | [mixtral/8x7b/instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) | | + | | | | [deepseek-coder/6.7b/instruct](https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GPTQ) | | + | | | | [deepseek-coder/33b/instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct) | | + | | | | [stable/3b/code](https://huggingface.co/stabilityai/stable-code-3b) | + | | | | | [llama3/8b/instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | | + | | | ## Usage Refact is free to use for individuals and small teams under BSD-3-Clause license. If you wish to use Refact for Enterprise, please [contact us](https://refact.ai/contact/). ## Custom installation You can also install refact repo without docker: ```shell pip install . ``` If you have a GPU with CUDA capability >= 8.0, you can also install it with flash-attention v2 support: ```shell FLASH_ATTENTION_FORCE_BUILD=TRUE MAX_JOBS=4 INSTALL_OPTIONAL=TRUE pip install . ``` ## FAQ Q: Can I run a model on CPU? A: it doesn't run on CPU yet, but it's certainly possible to implement this. ## Community & Support - Contributing [CONTRIBUTING.md](CONTRIBUTING.md) - [GitHub issues](https://github.com/smallcloudai/refact/issues) for bugs and errors - [Community forum](https://github.com/smallcloudai/refact/discussions) for community support and discussions - [Discord](https://www.smallcloud.ai/discord) for chatting with community members - [Twitter](https://twitter.com/refact_ai) for product news and updates