model_id = "aryachakraborty/GEMMA-2B-NL-SQL"
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id, token=os.environ['HF_TOKEN'])
model = AutoModelForCausalLM.from_pretrained(model_id,
quantization_config=bnb_config,
device_map={"":0},
token=os.environ['HF_TOKEN'])
Here, if you are working locally, be sure to check that you are using a GPU; if not, load the complete (unquantized) model instead, since the 4-bit bitsandbytes load expects a CUDA device.
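A minimal sketch of that check, reusing model_id and bnb_config from above (torch.cuda.is_available() decides whether the quantized or full-precision load is used):

import os
import torch
from transformers import AutoModelForCausalLM

if torch.cuda.is_available():
    # GPU present: load the 4-bit quantized model as above
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map={"": 0},
        token=os.environ["HF_TOKEN"],
    )
else:
    # no CUDA device: skip bitsandbytes quantization and load the
    # complete model on CPU instead
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        token=os.environ["HF_TOKEN"],
    )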
Instructions = """Give the NAME who has the highest SALARY"""
Input = """CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)"""
alpeca_prompt = f"""Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: {Instructions}. ### Input: {Input}
### Response:
"""
alpeca_prompt.format(
Instructions,
Input)
device = "cuda:0"
inputs = tokenizer(alpeca_prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Output ~

Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: Give the NAME who has the highest SALARY. ### Input: CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)
### Response:
SELECT `NAME` FROM `sample` ORDER BY `SALARY` DESC LIMIT 1
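Since generate returns the prompt tokens followed by the new tokens, you can decode only the generated part if you just want the SQL; a small sketch, reusing inputs and outputs from above:

# slice off the prompt tokens and decode only what the model generated
prompt_len = inputs["input_ids"].shape[-1]
sql_only = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(sql_only)  # e.g. SELECT `NAME` FROM `sample` ORDER BY `SALARY` DESC LIMIT 1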
Hope it helps 🏷️. If you have any doubts, feel free to connect.
Okay, thanks so much!
Hi there, sorry to interrupt you again. I used the method you gave me yesterday:
Instructions = """Give the NAME who has the highest SALARY"""
Input = """CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)"""
alpeca_prompt = f"""Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: {Instructions}. ### Input: {Input}
### Response:
"""
alpeca_prompt.format(
Instructions,
Input)
The overall example works, but when I swap in a long DDL text of about 8,120 tokens, the response changes to:

### Response:
below, VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR, 11111

If I change outputs = model.generate(**inputs, max_new_tokens=20) to outputs = model.generate(**inputs, max_new_tokens=200), it just gives me more VARCHAR repetitions. How can I deal with this problem? Thank you for your response.
Actually, I fine-tuned the Gemma model on texts of token size <= 512 (because of limited resources). That token budget covers the complete input to the model, including the instruction and the input. If you want accurate results for longer texts, you have to fine-tune the model on such samples. It's just a demo model and should not be used for long input contexts.
This is the data I used: Data (a sample with shorter token lengths).
The actual source of this data is source; it contains longer input sequences, so if you want accurate results, feel free to fine-tune on the whole dataset.
If you want help with that, or have any other query, feel free to reach out.
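Given that 512-token budget, it can help to check the prompt length before generating; a minimal sketch, reusing tokenizer and alpeca_prompt from above (the 512 threshold is the fine-tuning limit stated here, not a hard model constraint):

MAX_FINETUNE_TOKENS = 512  # token budget this demo model was fine-tuned with
n_tokens = len(tokenizer(alpeca_prompt)["input_ids"])
if n_tokens > MAX_FINETUNE_TOKENS:
    print(f"Prompt is {n_tokens} tokens, beyond the fine-tuned range; "
          "expect degraded output (e.g. repeated VARCHAR tokens).")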
👌 Thank you for your response!
Hi there, when I load the model with FastChat or Ollama, I get unrecognizable characters back. Here is the example
and the response to this prompt is
Is there something wrong with how I load the model?