defog-ai / sqlcoder

SoTA LLM for converting natural language questions to SQL queries
Apache License 2.0
3.27k stars 205 forks source link

defog/sqlcoder-70b-alpha not answering #65

Closed sanipanwala closed 7 months ago

sanipanwala commented 7 months ago

Hello,

I just downloaded the model defog/sqlcoder-70b-alpha and gave the same prompt mentioned in your example (colab) but I'm not sure why the result is still not showing after 10 minutes.

prompt = """### Task
Generate a SQL query to answer the following question:
`{question}`

### Database Schema
This query will run on a database whose schema is represented in this string:
CREATE TABLE products (
  product_id INTEGER PRIMARY KEY, -- Unique ID for each product
  name VARCHAR(50), -- Name of the product
  price DECIMAL(10,2), -- Price of each unit of the product
  quantity INTEGER  -- Current quantity in stock
);

CREATE TABLE customers (
   customer_id INTEGER PRIMARY KEY, -- Unique ID for each customer
   name VARCHAR(50), -- Name of the customer
   address VARCHAR(100) -- Mailing address of the customer
);

CREATE TABLE salespeople (
  salesperson_id INTEGER PRIMARY KEY, -- Unique ID for each salesperson
  name VARCHAR(50), -- Name of the salesperson
  region VARCHAR(50) -- Geographic sales region
);

CREATE TABLE sales (
  sale_id INTEGER PRIMARY KEY, -- Unique ID for each sale
  product_id INTEGER, -- ID of product sold
  customer_id INTEGER,  -- ID of customer who made purchase
  salesperson_id INTEGER, -- ID of salesperson who made the sale
  sale_date DATE, -- Date the sale occurred
  quantity INTEGER -- Quantity of product sold
);

CREATE TABLE product_suppliers (
  supplier_id INTEGER PRIMARY KEY, -- Unique ID for each supplier
  product_id INTEGER, -- Product ID supplied
  supply_price DECIMAL(10,2) -- Unit price charged by supplier
);

-- sales.product_id can be joined with products.product_id
-- sales.customer_id can be joined with customers.customer_id
-- sales.salesperson_id can be joined with salespeople.salesperson_id
-- product_suppliers.product_id can be joined with products.product_id

### SQL
Given the database schema, here is the SQL query that answers `{question}`:
```sql
""".format(question=question)

Thanks.

rishsriv commented 7 months ago

Hi there, you won't be able to use the 70b model in colab because of memory and disk size constraints, unless you're using an enterprise version of colab with significant VRAM?

Could you share what machine type you're using on colab?

sanipanwala commented 7 months ago

Hello @rishsriv ,

I'm using a personal workstation.

workstation is configured with Ubuntu with 24 GB* 2 Nvidia GPUs and physical memory is 192 GB.

Thanks.

rishsriv commented 7 months ago

Hi there, you will have to add a load_in_4bit=true parameter to load this in 48GB of VRAM

model = AutoModelForCausalLM.from_pretrained(
        "https://huggingface.co/defog/sqlcoder-70b-alpha",
        device_map="auto",
        load_in_4bit=true
    )

With that, you should be able to run this in the VRAM that you have! It will still be slower than running it in fp16, but much faster than running on CPU (which, I suspect, is that was happening earlier)