model_id = "aryachakraborty/GEMMA-2B-NL-SQL"
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id, token=os.environ['HF_TOKEN'])
model = AutoModelForCausalLM.from_pretrained(model_id,
quantization_config=bnb_config,
device_map={"":0},
token=os.environ['HF_TOKEN'])
Here, if you are working locally, be sure to check that you are using a GPU; if not, load the complete (unquantized) model instead, since the 4-bit bitsandbytes load expects a CUDA device.
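A minimal sketch of that check, reusing model_id and bnb_config from above (torch.cuda.is_available() decides whether the quantized or full-precision load is used):

import os
import torch
from transformers import AutoModelForCausalLM

if torch.cuda.is_available():
    # GPU present: load the 4-bit quantized model as above
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map={"": 0},
        token=os.environ["HF_TOKEN"],
    )
else:
    # no CUDA device: skip bitsandbytes quantization and load the
    # complete model on CPU instead
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        token=os.environ["HF_TOKEN"],
    )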
Instructions = """Give the NAME who has the highest SALARY"""
Input = """CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)"""
alpeca_prompt = f"""Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: {Instructions}. ### Input: {Input}
### Response:
"""
alpeca_prompt.format(
Instructions,
Input)
device = "cuda:0"
inputs = tokenizer(alpeca_prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Output ~

Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: Give the NAME who has the highest SALARY. ### Input: CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)
### Response:
SELECT `NAME` FROM `sample` ORDER BY `SALARY` DESC LIMIT 1
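Since generate returns the prompt tokens followed by the new tokens, you can decode only the generated part if you just want the SQL; a small sketch, reusing inputs and outputs from above:

# slice off the prompt tokens and decode only what the model generated
prompt_len = inputs["input_ids"].shape[-1]
sql_only = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(sql_only)  # e.g. SELECT `NAME` FROM `sample` ORDER BY `SALARY` DESC LIMIT 1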
Hope it helps 🏷️. If you have any doubts, feel free to connect.
Okay, thanks so much!
Hi there, sorry to interrupt you again. I used the method you gave me yesterday:
Instructions = """Give the NAME who has the highest SALARY"""
Input = """CREATE TABLE `sample` (
`NAME` text,
`SALARY` int DEFAULT NULL,
`STATE` text
)"""
alpeca_prompt = f"""Below are sql tables schemas paired with instruction that describes a task. Using valid SQLite, write a response that appropriately completes the request for the provided tables. ### Instruction: {Instructions}. ### Input: {Input}
### Response:
"""
alpeca_prompt.format(
Instructions,
Input)
The overall example works, but when I swap in a long DDL text of about 8,120 tokens, the response changes to:

### Response:
below, VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR, 11111

If I change outputs = model.generate(**inputs, max_new_tokens=20) to outputs = model.generate(**inputs, max_new_tokens=200), it just gives me more VARCHAR repetitions. How can I deal with this problem? Thank you for your response.
Actually, I fine-tuned the Gemma model on texts of token size <= 512 (because of limited resources). That token budget covers the complete input to the model, including the instruction and the input. If you want accurate results for longer texts, you have to fine-tune the model on such samples. It's just a demo model and should not be used for long input contexts.
This is the data I used: Data (a sample with shorter token lengths).
The actual source of this data is source; it contains longer input sequences, so if you want accurate results, feel free to fine-tune on the whole dataset.
If you want help with that, or have any other query, feel free to reach out.
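Given that 512-token budget, it can help to check the prompt length before generating; a minimal sketch, reusing tokenizer and alpeca_prompt from above (the 512 threshold is the fine-tuning limit stated here, not a hard model constraint):

MAX_FINETUNE_TOKENS = 512  # token budget this demo model was fine-tuned with
n_tokens = len(tokenizer(alpeca_prompt)["input_ids"])
if n_tokens > MAX_FINETUNE_TOKENS:
    print(f"Prompt is {n_tokens} tokens, beyond the fine-tuned range; "
          "expect degraded output (e.g. repeated VARCHAR tokens).")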
👌 Thank you for your response!
Hi there, when I load the model with FastChat or Ollama, I get unrecognizable characters back. Here is the example
and the response to this prompt is
Is there something wrong with how I load the model?