text: "CREATE TABLE system(JobID: String,GID: String, UID: String, Start:Time(yyyy/mm/dd), End: Time,ElapsedRaw: Time, CPUTimeRAW: Time,NCPUS: Number,NNodes: Number, NodeList: List, State:String, Timelimit: Time);Get UID and job id for Jobs that started on Jan 20 , 2023 ended on feb 14 2023 and has job id 20"
example_title: "example"
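Below is a minimal sketch of how a schema-plus-question prompt like this could be sent to the model with 🤗 transformers. The prompt layout here is an assumption made for illustration; consult the model card for the official template.

```python
# Minimal sketch: querying pip-sql-1.3b with transformers.
# The prompt format below is an assumption, not the official template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PipableAI/pip-sql-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

schema = "CREATE TABLE system(JobID: String, GID: String, UID: String, Start: Time(yyyy/mm/dd), End: Time, ElapsedRaw: Time, CPUTimeRAW: Time, NCPUS: Number, NNodes: Number, NodeList: List, State: String, Timelimit: Time);"
question = "Get the UID and job id for jobs that started on Jan 20, 2023, ended on Feb 14, 2023, and have job id 20."

prompt = f"{schema}\n\n-- {question}\nSELECT"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```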
What have we built?
A 1.3B SQL model that outperforms most SQL expert models and ChatGPT on popular benchmarks. This is a distilled model built on the DeepSeek base model.
How did we build it?
We used softmax cross-entropy and a modified form of policy gradient, along with a Q loss, optimized in an EM setup. Loss behaviour in this setup is shown as a loss-curve figure in the original README.
Benchmarking:
For benchmarking purposes we are using Semantic Evaluation for Text-to-SQL with Distilled Test Suites (Test Suite SQL Eval), an officially accepted evaluation framework for Spider, SParC, and CoSQL that was proposed by a research team from Yale and Berkeley. The benchmark contains 2,200 test data points. We have also benchmarked the model on Defog SQL-Eval, which contains 200 test data points handpicked by the defog team.
Examples:
What are the email address, town and county of the customers who are of the least common gender?
`SELECT email_address , town_city , county FROM customers GROUP BY gender_code ORDER BY count(*) ASC LIMIT 1`
What are the product price and the product size of the products whose price is above average?
`SELECT product_price , product_size FROM products WHERE product_price > (SELECT avg(product_price) FROM products)`
Which customers did not make any orders? List the first name, middle initial and last name.
`SELECT T1.customer_first_name , T1.customer_middle_initial , T1.customer_last_name FROM Customers AS T1 WHERE T1.customer_id NOT IN (SELECT T2.customer_id FROM Orders AS T2)`
### Details
Similarity score: 0.94
- [ ] [README.md · defog/sqlcoder-7b-2 at main](https://huggingface.co/defog/sqlcoder-7b-2/blob/main/README.md?code=true)
# README.md · defog/sqlcoder-7b-2 at main
**DESCRIPTION:**
```yaml
license: cc-by-sa-4.0
library_name: transformers
pipeline_tag: text-generation
```
## Update notice
The model weights were updated at 7 AM UTC on Feb 7, 2024. The new model weights lead to a much more performant model – particularly for joins.
If you downloaded the model before that, please redownload the weights for best performance.
## Model Card for SQLCoder-7B-2
A capable large language model for natural language to SQL generation.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/603bbad3fd770a9997b57cb6/AYUE2y14vy2XkD9MZpScu.png)
### Model Details
#### Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** [Defog, Inc](https://defog.ai)
- **Model type:** Text to SQL
- **License:** CC-BY-SA-4.0
- **Finetuned from model:** CodeLlama-7B
#### Model Sources
- [**HuggingFace:**](https://huggingface.co/defog/sqlcoder-70b-alpha)
- [**GitHub:**](https://github.com/defog-ai/sqlcoder)
- [**Demo:**](https://defog.ai/sqlcoder-demo/)
## Uses
This model is intended to be used by non-technical users to understand data inside their SQL databases. It is meant as an analytics tool, and not as a database admin tool.
This model has not been trained to reject malicious requests from users with write access to databases, and should only be used by users with read-only access.
## How to Get Started with the Model
Use the code [here](https://github.com/defog-ai/sqlcoder/blob/main/inference.py) to get started with the model.
## Prompt
Use the following prompt for optimal results, and remember to set `do_sample=False` and `num_beams=4`.
```
### Task
Generate a SQL query to answer [QUESTION]{user_question}[/QUESTION]
### Database Schema
The query will run on a database with the following schema:
{table_metadata_string_DDL_statements}
### Answer
Given the database schema, here is the SQL query that [QUESTION]{user_question}[/QUESTION]
[SQL]
```
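The official loading and generation code lives in the `inference.py` linked above. As a rough sketch, filling the template and decoding with the recommended settings looks like this; the loading details here are assumptions, not the official script:

```python
# Sketch: filling the SQLCoder prompt template and decoding with the
# recommended settings (do_sample=False, num_beams=4). Loading details
# are assumptions; see the official inference.py for the exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "defog/sqlcoder-7b-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = """### Task
Generate a SQL query to answer [QUESTION]{user_question}[/QUESTION]

### Database Schema
The query will run on a database with the following schema:
{table_metadata_string_DDL_statements}

### Answer
Given the database schema, here is the SQL query that [QUESTION]{user_question}[/QUESTION]
[SQL]
""".format(
    user_question="Which customers did not make any orders?",
    table_metadata_string_DDL_statements="CREATE TABLE customers (customer_id int, ...);",  # your DDL here
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, do_sample=False, num_beams=4, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```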
## Evaluation
This model was evaluated on [SQL-Eval](https://github.com/defog-ai/sql-eval), a PostgreSQL-based evaluation framework developed by Defog for testing and alignment of model capabilities.
You can read more about the methodology behind SQL-Eval [here](https://defog.ai/blog/open-sourcing-sqleval/).
### Results
We classified each generated question into one of 6 categories. The table displays the percentage of questions answered correctly by each model, broken down by category.
| | date | group_by | order_by | ratio | join | where |
| -------------- | ---- | -------- | -------- | ----- | ---- | ----- |
| sqlcoder-70b | 96 | 91.4 | 97.1 | 85.7 | 97.1 | 91.4 |
| sqlcoder-7b-2 | 96 | 91.4 | 94.3 | 91.4 | 94.3 | 77.1 |
| sqlcoder-34b | 80 | 94.3 | 85.7 | 77.1 | 85.7 | 80 |
| gpt-4 | 72 | 94.3 | 97.1 | 80 | 91.4 | 80 |
| gpt-4-turbo | 76 | 91.4 | 91.4 | 62.8 | 88.6 | 77.1 |
| natural-sql-7b | 56 | 88.6 | 85.7 | 60 | 88.6 | 80 |
| sqlcoder-7b | 64 | 82.9 | 74.3 | 54.3 | 74.3 | 74.3 |
| gpt-3.5 | 72 | 77.1 | 82.8 | 34.3 | 65.7 | 71.4 |
| claude-2 | 52 | 71.4 | 74.3 | 57.1 | 65.7 | 62.9 |
## Model Card Contact
Contact us on X at [@defogdata](https://twitter.com/defogdata), or via email at [founders@defog.ai](mailto:founders@defog.ai)
**URL:** [https://huggingface.co/defog/sqlcoder-7b-2/blob/main/README.md?code=true](https://huggingface.co/defog/sqlcoder-7b-2/blob/main/README.md?code=true)
383: deepseek-ai/deepseek-coder-5.7bmqa-base · Hugging Face
### Details
Similarity score: 0.9
- [ ] [deepseek-ai/deepseek-coder-5.7bmqa-base · Hugging Face](https://huggingface.co/deepseek-ai/deepseek-coder-5.7bmqa-base)
Deepseek Coder Introduction
----------------------------
Deepseek Coder is a series of code language models, each trained from scratch on 2T tokens with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a project-level code corpus with a window size of 16K and an extra fill-in-the-blank task, supporting project-level code completion and infilling. Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
### Key Features
- **Massive Training Data:** Trained from scratch on 2T tokens, including 87% code and 13% linguistic data in both English and Chinese languages.
- **Highly Flexible & Scalable:** Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
- **Superior Model Performance:** State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
- **Advanced Code Completion Capabilities:** A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
### Model Summary
- **deepseek-coder-5.7bmqa-base:** A 5.7B parameter model with Multi Query Attention, trained on 2 trillion tokens.
- **Home Page:** [DeepSeek](http://deepseek.com)
- **Repository:** [deepseek-ai/deepseek-coder](https://github.com/deepseek-ai/deepseek-coder)
- **Chat With DeepSeek Coder:** [DeepSeek-Coder](https://github.com/deepseek-ai/deepseek-coder/discussions)
### How to Use
This section provides examples of how to use the Deepseek Coder model for code completion, code insertion, and repository-level code completion tasks.
#### Code Completion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True).cuda()
input_text = "#write a quick sort algorithm"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
#### Code Insertion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True).cuda()
input_text = """<|begin|>def quick_sort(arr):
if len(arr) <= 1:
return arr
pivot = arr[0]
left = []
right = []
<|hole|>
if arr[i] < pivot:
left.append(arr[i])
else:
right.append(arr[i])
return quick_sort(left) + [pivot] + quick_sort(right)<|end|>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])
```
#### Repository Level Code Completion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-5.7bmqa-base", trust_remote_code=True).cuda()
input_text = """#utils.py
import torch
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
def load_data():
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Standardize the data
scaler = StandardScaler()
X = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Convert numpy data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.int64)
y_test = torch.tensor(y_test, dtype=torch.int64)
return X_train, X_test, y_train, y_test
def evaluate_predictions(y_test, y_pred):
return accuracy_score(y_test, y_pred)
#model.py
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
class IrisClassifier(nn.Module):
def __init__(self):
super(IrisClassifier, self).__init__()
self.fc = nn.Sequential(
nn.Linear(4, 16),
nn.ReLU(),
nn.Linear(16, 3)
)
def forward(self, x):
return self.fc(x)
def train_model(self, X_train, y_train, epochs, lr, batch_size):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(self.parameters(), lr=lr)
# Create DataLoader for batches
dataset = TensorDataset(X_train, y_train)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
for epoch in range(epochs):
for batch_X, batch_y in dataloader:
optimizer.zero_grad()
outputs = self(batch_X)
loss = criterion(outputs, batch_y)
loss.backward()
optimizer.step()
def predict(self, X_test):
with torch.no_grad():
outputs = self(X_test)
_, predicted = outputs.max(1)
return predicted.numpy()
#main.py
from utils import load_data, evaluate_predictions
from model import IrisClassifier as Classifier
def main():
# Model training and evaluation
"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=140)
print(tokenizer.decode(outputs[0]))
```
License
-------
This code repository is licensed under the MIT License. The use of Deepseek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.
See the [LICENSE-MODEL](https://github.com/deepseek-ai/deepseek-coder/blob/main/LICENSE-MODEL) for more details.
Contact
-------
If you have any questions, please raise an issue or contact us at [agi_code@deepseek.com](mailto:agi_code@deepseek.com).
#### Suggested labels
#### { "key": "llm-experiments", "value": "Experiments and results related to Large Language Models" } { "key": "AI-Chatbots", "value": "Topics related to advanced chatbot platforms integrating multiple AI models" }
498: CodeGPTPlus/deepseek-coder-1.3b-typescript · Hugging Face
### Details
Similarity score: 0.9
- [ ] [CodeGPTPlus/deepseek-coder-1.3b-typescript · Hugging Face](https://huggingface.co/CodeGPTPlus/deepseek-coder-1.3b-typescript)
# CodeGPTPlus/deepseek-coder-1.3b-typescript
This is a fine-tuned model by the CodeGPT team, specifically crafted for generating expert code in TypeScript. It is fine-tuned from `deepseek-ai/deepseek-coder-1.3b-base` with a dataset of 0.5B tokens, making it an excellent choice for precise and efficient TypeScript code generation.
The model uses a 16K window size and an additional fill-in-the-middle task for project-level code completion.
## How to Use
This model is for completion purposes only. Here are some examples of how to use the model:
### Running the model on a GPU
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("CodeGPTPlus/deepseek-coder-1.3b-typescript", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("CodeGPTPlus/deepseek-coder-1.3b-typescript", trust_remote_code=True).cuda()
input_text = """<|fim begin|>function quickSort(arr: number[]): number[] {
if (arr.length <= 1) {
return arr;
}
const pivot = arr[0];
const left = [];
const right = [];
<|fim hole|>
return [...quickSort(left), pivot, ...quickSort(right)];
}<|fim end|>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Running with Ollama
- Model: [https://ollama.ai/codegpt/deepseek-coder-1.3b-typescript](https://ollama.ai/codegpt/deepseek-coder-1.3b-typescript)
- Command: `ollama run codegpt/deepseek-coder-1.3b-typescript`
### Running with Ollama and CodeGPT Autocomplete in VSCode
- Documentation: [https://docs.codegpt.co/docs/tutorial-features/code_autocompletion](https://docs.codegpt.co/docs/tutorial-features/code_autocompletion)
- Select "Ollama - codegpt/deepseek-coder-1.3b-typescript" in the autocomplete model selector.
### Fill In the Middle (FIM)
```
<｜fim▁begin｜>function quickSort(arr: number[]): number[] {
  if (arr.length <= 1) {
    return arr;
  }
  const pivot = arr[0];
  const left = [];
  const right = [];
<｜fim▁hole｜>
  return [...quickSort(left), pivot, ...quickSort(right)];
}<｜fim▁end｜>
```
### Training Procedure
The model was trained using the following hyperparameters:
- learning_rate: 2e-05
- train_batch_size: 20
- eval_batch_size: 20
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 40
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-06
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 261
- num_epochs: 1
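For reference, these hyperparameters map onto 🤗 `TrainingArguments` roughly as sketched below; this is an illustration, not the CodeGPT team's actual training script.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# output_dir is hypothetical; this is not the team's actual training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deepseek-coder-1.3b-typescript",
    learning_rate=2e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    seed=42,
    gradient_accumulation_steps=2,  # effective (total) train batch size: 20 * 2 = 40
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="cosine",
    warmup_steps=261,
    num_train_epochs=1,
)
```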
For more information, visit the [model page](https://huggingface.co/CodeGPTPlus/deepseek-coder-1.3b-typescript).
#### Suggested labels
#### { "label-name": "TypeScript-Code-Generation", "description": "Model for generating TypeScript code", "repo": "CodeGPTPlus/deepseek-coder-1.3b-typescript", "confidence": 70.59 }
324: bigcode/tiny_starcoder_py · Hugging Face
### Details
Similarity score: 0.9
> **Note:**
>
> [bigcode/tiny_starcoder_py · Hugging Face](https://huggingface.co/bigcode/tiny_starcoder_py)
>
> TinyStarCoderPy
>
> This is a 164M parameter model with the same architecture as StarCoder (8K context length, MQA & FIM). It was trained on the Python data from StarCoderData for ~6 epochs, which amounts to 100B tokens.
>
> Use
>
> Intended use
>
> The model was trained on GitHub code, to assist with some tasks like Assisted Generation. For pure code completion, we advise using our 15B models StarCoder or StarCoderBase.
>
> Generation
>
> ```python
> # pip install -q transformers
> from transformers import AutoModelForCausalLM, AutoTokenizer
>
> checkpoint = "bigcode/tiny_starcoder_py"
> device = "cuda" # for GPU usage or "cpu" for CPU usage
>
> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
> model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)
>
> inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
> outputs = model.generate(inputs)
> print(tokenizer.decode(outputs[0]))
> ```
>
> Fill-in-the-middle
>
> Fill-in-the-middle uses special tokens to identify the prefix/middle/suffix part of the input and output:
>
> ```python
> input_text = "def print_one_two_three():\n print('one')\n \n print('three')"
> inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
> outputs = model.generate(inputs)
> print(tokenizer.decode(outputs[0]))
> ```
>
> Training
>
> Model
>
> - Architecture: GPT-2 model with multi-query attention and Fill-in-the-Middle objective
> - Pretraining steps: 50k
> - Pretraining tokens: 100 billion
> - Precision: bfloat16
>
> Hardware
>
> - GPUs: 32 Tesla A100
> - Training time: 18 hours
>
> Software
>
> - Orchestration: Megatron-LM
> - Neural networks: PyTorch
> - BF16 if applicable: apex
>
> License
>
> The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/bigcode/tiny_starcoder_py/blob/main/LICENSE).
>
> #### Suggested labels
>
> - { "key": "llm-pretraining", "value": "Information related to the pretraining process of Large Language Models" }
499: marella/ctransformers: Python bindings for the Transformer models implemented in C/C++ using GGML library.
### DetailsSimilarity score: 0.89
- [ ] [marella/ctransformers: Python bindings for the Transformer models implemented in C/C++ using GGML library.](https://github.com/marella/ctransformers?tab=readme-ov-file#gptq)
# CTransformers
[![PyPI version](https://badge.fury.io/py/ctransformers.svg)](https://badge.fury.io/py/ctransformers)
[![Documentation](https://readthedocs.org/images/button/readthedocs-ci.svg)](https://ctransformers.readthedocs.io/)
[![Build and Test](https://github.com/marella/ctransformers/actions/workflows/build.yml/badge.svg)](https://github.com/marella/ctransformers/actions/workflows/build.yml)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
Python bindings for the Transformer models implemented in C/C++ using GGML library. Also see [ChatDocs](https://github.com/marella/chatdocs)
## Supported Models
| Model | Model Type | CUDA | Metal |
| ------ | --------- | :--: | :--: |
| GPT-2 | gpt2 | | |
| GPT-J, GPT4All-J | gptj | | |
| GPT-NeoX, StableLM | gpt_neox | | |
| Falcon | falcon | ✅ | |
| LLaMA, LLaMA 2 | llama | ✅ | ✅ |
| MPT | mpt | ✅ | |
| StarCoder, StarChat | gpt_bigcode | ✅ | |
| Dolly V2 | dolly-v2 | | |
| Replit | replit | | |
## Installation
To install via `pip`, simply run:
```
pip install ctransformers
```
## Usage
It provides a unified interface for all models:
```python
from ctransformers import AutoModelForCausalLM
llm = AutoModelForCausalLM.from_pretrained("/path/to/ggml-model.bin", model_type="gpt2")
print(llm("AI is going to"))
```
To stream the output:
```python
for text in llm("AI is going to", stream=True):
print(text, end="", flush=True)
```
You can load models from Hugging Face Hub directly:
```python
llm = AutoModelForCausalLM.from_pretrained("marella/gpt-2-ggml")
```
If a model repo has multiple model files (`.bin` or `.gguf` files), specify a model file using:
```python
llm = AutoModelForCausalLM.from_pretrained("marella/gpt-2-ggml", model_file="ggml-model.bin")
```
### 🤗 Transformers
Note: This is an experimental feature and may change in the future.
To use with 🤗 Transformers, create the model and tokenizer using:
```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("marella/gpt-2-ggml", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)
```
You can use 🤗 Transformers text generation pipeline:
```python
from transformers import pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("AI is going to", max_new_tokens=256))
```
You can use 🤗 Transformers generation parameters:
```python
pipe("AI is going to", max_new_tokens=256, do_sample=True, temperature=0.8, repetition_penalty=1.1)
```
You can use 🤗 Transformers tokenizers:
```python
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("marella/gpt-2-ggml", hf=True) # Load model from GGML model repo.
tokenizer = AutoTokenizer.from_pretrained("gpt2") # Load tokenizer from original model repo.
```
### LangChain
It is integrated into LangChain. See LangChain [docs](https://github.com/LangChainAI/langchain#using-ctransformers-backed-models).
### GPU
To run some of the model layers on GPU, set the `gpu_layers` parameter:
```python
llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML", gpu_layers=50)
```
#### CUDA
Install CUDA libraries using:
```bash
pip install ctransformers[cuda]
```
#### ROCm
To enable ROCm support, install the `ctransformers` package using:
```bash
CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers
```
#### Metal
To enable Metal support, install the `ctransformers` package using:
```bash
CT_METAL=1 pip install ctransformers --no-binary ctransformers
```
### GPTQ
Note: This is an experimental feature and only LLaMA models are supported using [ExLlama](https://github.com/TheLastBen/exllama).
Install additional dependencies using:
```bash
pip install ctransformers[gptq]
```
Load a GPTQ model using:
```python
llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GPTQ")
```
If the model name or path doesn't contain the word `gptq`, specify `model_type="gptq"`.
It can also be used with LangChain. Low-level APIs are not fully supported.
## Documentation
Find the documentation on [Read the Docs](https://ctransformers.readthedocs.io/).
#### Config
| Parameter | Type | Description | Default |
| --------- | ------ | ------------------------------------------------------------ | ------- |
| `top_k` | `int` | The top-k value to use for sampling | `40` |
| `top_p` | `float` | The top-p value to use for sampling | `0.95` |
| `temperature` | `float` | The temperature to use for sampling | `0.8` |
| `repetition_penalty` | `float` | The repetition penalty to use for sampling | `1.1` |
| `last_n_tokens` | `int` | The number of last tokens to use for repetition penalty | `64` |
| `seed` | `int` | The seed value to use for sampling tokens | `-1` |
| `max_new_tokens` | `int` | The maximum number of new tokens to generate | `256` |
| `stop` | `List` | A list of sequences to stop generation when encountered | `None` |
| `stream` | `bool` | Whether to stream the generated text | `False` |
| `reset` | `bool` | Whether to reset the model state before generating text | `True` |
| `batch_size` | `int` | The batch size to use for evaluating tokens in a single prompt | `8` |
| `threads` | `int` | The number of threads to use for evaluating tokens | `-1` |
| `context_length` | `int` | The maximum context length to use | `-1` |
| `gpu_layers` | `int` | The number of layers to run on GPU | `0` |
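These parameters can be passed as keyword arguments, either when loading the model or per generation call. A small sketch using the documented defaults and the model repo from the earlier examples:

```python
from ctransformers import AutoModelForCausalLM

# Model-level settings (e.g. context_length, gpu_layers) are passed at load
# time; sampling settings can be passed per call.
llm = AutoModelForCausalLM.from_pretrained(
    "marella/gpt-2-ggml",
    context_length=1024,
    gpu_layers=0,
)
print(llm("AI is going to", max_new_tokens=256, temperature=0.8, top_k=40, top_p=0.95))
```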
---
Made with ❤️ by [marella](https://github.com/marella)
625: unsloth/README.md at main · unslothai/unsloth
### Details
Similarity score: 0.89
- [ ] [unsloth/README.md at main · unslothai/unsloth](https://github.com/unslothai/unsloth/blob/main/README.md?plain=1)
# unsloth/README.md at main · unslothai/unsloth
### Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory!
![](https://i.ibb.co/sJ7RhGG/image-41.png)
## ✨ Finetune for Free
All notebooks are **beginner friendly**! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.
| Unsloth supports | Free Notebooks | Performance | Memory use |
|-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
| **Gemma 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing) | 2.4x faster | 58% less |
| **Mistral 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing) | 2.2x faster | 62% less |
| **Llama-2 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1lBzz5KeZJKXjvivbYvmGarix9Ao6Wxe5?usp=sharing) | 2.2x faster | 43% less |
| **TinyLlama** | [▶️ Start on Colab](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing) | 3.9x faster | 74% less |
| **CodeLlama 34b** A100 | [▶️ Start on Colab](https://colab.research.google.com/drive/1y7A0AxE3y8gdj4AVkl2aZX47Xu3P1wJT?usp=sharing) | 1.9x faster | 27% less |
| **Mistral 7b** 1xT4 | [▶️ Start on Kaggle](https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook) | 5x faster\* | 62% less |
| **DPO - Zephyr** | [▶️ Start on Colab](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) | 1.9x faster | 19% less |
- This [conversational notebook](https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing) is useful for ShareGPT ChatML / Vicuna templates.
- This [text completion notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) is for raw text. This [DPO notebook](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) replicates Zephyr.
- \* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.
## 🦥 Unsloth.ai News
- 📣 [Gemma 7b](https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing) on 6T tokens now works. And [Gemma 2b notebook](https://colab.research.google.com/drive/15gGm7x_jTm017_Ic8e317tdIpDG53Mtu?usp=sharing)
- 📣 Added [conversational notebooks](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) and [raw text notebooks](https://colab.research.google.com/drive/1bMOKOBzxQWUIGZBs_B0zm8pimuEnZdfM?usp=sharing)
- 📣 [2x faster inference](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) added for all our models
- 📣 [DPO support](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) is now included. [More info](#DPO) on DPO
- 📣 We did a [blog](https://huggingface.co/blog/unsloth-trl) with 🤗Hugging Face and are in their official docs! Check out the [SFT docs](https://huggingface.co/docs/trl/main/en/sft_trainer#accelerate-fine-tuning-2x-using-unsloth) and [DPO docs](https://huggingface.co/docs/trl/main/en/dpo_trainer#accelerate-dpo-fine-tuning-using-unsloth)
- 📣 [Download models 4x faster](https://huggingface.co/collections/unsloth/) from 🤗Hugging Face. Eg: `unsloth/mistral-7b-bnb-4bit`
## 🔗 Links and Resources
| Type | Links |
| ------------------------------- | --------------------------------------- |
| 📚 **Wiki & FAQ** | [Read Our Wiki](https://github.com/unslothai/unsloth/wiki) |
| 📜 **Documentation** | [Read The Doc](https://github.com/unslothai/unsloth/tree/main#-documentation) |
| 💾 **Installation** | [unsloth/README.md](https://github.com/unslothai/unsloth/tree/main#installation-instructions)|
| **Twitter (aka X)** | [Follow us on X](https://twitter.com/unslothai)|
| 🥇 **Benchmarking** | [Performance Tables](https://github.com/unslothai/unsloth/tree/main#-performance-benchmarking) |
| 🌐 **Released Models** | [Unsloth Releases](https://huggingface.co/unsloth)|
| ✍️ **Blog** | [Read our Blogs](https://unsloth.ai/blog)|
## ⭐ Key Features
- All kernels written in [OpenAI's Triton](https://openai.com/research/triton) language. **Manual backprop engine**.
- **0% loss in accuracy** - no approximation methods - all exact.
- No change of hardware. Supports NVIDIA GPUs since 2018+. Minimum CUDA Capability 7.0 (V100, T4, Titan V, RTX 20, 30, 40x, A100, H100, L40 etc) [Check your GPU!](https://developer.nvidia.com/cuda-gpus) GTX 1070 and 1080 work, but are slow.
- Works on **Linux** and **Windows** via WSL.
- Supports 4bit and 16bit QLoRA / LoRA finetuning via [bitsandbytes](https://github.com/TimDettmers/bitsandbytes); a loading sketch follows this list.
- Open source trains 5x faster - see [Unsloth Pro](https://unsloth.ai/) for **30x faster training**!
- If you trained a model with 🦥Unsloth, you can use this cool sticker!
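A minimal QLoRA loading sketch with Unsloth's `FastLanguageModel` (the model name comes from the news item above; the LoRA settings here are illustrative, not an official recipe):

```python
# Minimal 4-bit QLoRA sketch with Unsloth. The LoRA hyperparameters below
# are illustrative defaults, not an official Unsloth recipe.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # 4-bit model from the Unsloth hub
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```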
## 🥇 Performance Benchmarking
- For the full list of **reproducible** benchmarking tables, [go to our website](https://unsloth.ai/blog/mistral-benchmark#Benchmark%20tables)
| 1 A100 40GB | 🤗Hugging Face | Flash Attention | 🦥Unsloth Open Source | 🦥[Unsloth Pro](https://unsloth.ai/pricing) |
|--------------|--------------|-----------------|---------------------|-----------------|
| Alpaca | 1x | 1.04x | 1.98x | **15.64x** |
| LAION Chip2 | 1x | 0.92x | 1.61x | **20.73x** |
| OASST | 1x | 1.19x | 2.17x | **14.83x** |
| Slim Orca | 1x | 1.18x | 2.22x | **14.82x** |
- The benchmarking table below was produced by [🤗Hugging Face](https://huggingface.co/blog/unsloth-trl).
| Free Colab T4 | Dataset | 🤗Hugging Face | Pytorch 2.1.1 | 🦥Unsloth | 🦥 VRAM reduction |
| --- | --- | --- | --- | --- | --- |
| Llama-2 7b | OASST | 1x | 1.19x | 1.95x | -43.3% |
| Mistral 7b | Alpaca | 1x | 1.07x | 1.56x | -13.7% |
| Tiny Llama 1.1b | Alpaca | 1x | 2.06x | 3.87x | -73.8% |
| DPO with Zephyr | Ultra Chat | 1x | 1.09x | 1.55x | -18.6% |
[View on GitHub](https://github.com/unslothai/unsloth/blob/main/README.md?plain=1)
README.md · PipableAI/pip-sql-1.3b at main
**DESCRIPTION:**
pipSQL-1.3b
License
The model is open source under the Apache 2.0 license.
Team
Avi Kothari, Pratham Gupta, Ritvik Aryan Kalra, Rohan Bhatial, Soham Acharya
**URL:** [PipableAI/pip-sql-1.3b](https://huggingface.co/PipableAI/pip-sql-1.3b)