Open rasbt opened 6 months ago
Sure, I can do this. Hopefully it's not too complicated 🤞
So, there are 3 models available:
As can be seen from the table, the 2b and 7b models are mostly for code completion; they require a special prompt in the following format:
prompt = '''\
<|fim_prefix|>import datetime
def calculate_age(birth_year):
    """Calculates a person's age based on their birth year."""
    current_year = datetime.date.today().year
    <|fim_suffix|> <-- (Note): this is where a cursor should be in IDE
    return age<|fim_middle|>\
'''
which is somewhat tricky to implement with the current LitGPT code. And frankly, I don't see why we need it, since the output quality is so-so. If I copy-paste the code block from the model page:
from transformers import GemmaTokenizer, AutoModelForCausalLM
model_id = "google/codegemma-2b"
tokenizer = GemmaTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
prompt = '''\
<|fim_prefix|>import datetime
def calculate_age(birth_year):
    """Calculates a person's age based on their birth year."""
    current_year = datetime.date.today().year
    <|fim_suffix|>
    return age<|fim_middle|>\
'''
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
prompt_len = inputs["input_ids"].shape[-1]
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0][prompt_len:]))
the output is:
age = current_year - birth_year<|file_separator|><eos>
or, with another code block from the same page:
from transformers import GemmaTokenizer, AutoModelForCausalLM
tokenizer = GemmaTokenizer.from_pretrained("google/codegemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/codegemma-2b")
input_text = "Write me a Python function to calculate the nth fibonacci number."
input_ids = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
the output is:
<bos>Write me a Python function to calculate the nth fibonacci number.
The Fibonacci numbers are the numbers in the following integer sequence.
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ……..
In mathematical terms, the sequence Fn of Fibonacci numbers is defined by the recurrence relation
Fn = Fn-1 + Fn-2
with seed values
F0 = 0 and F1 = 1.
The first ten terms are
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ……..
<strong>Example</strong>
Input: n = 10
Output: 55
<strong>Input:</strong> n = 20
<strong>Output:</strong> 6765
<strong>Input:</strong> n = 30
<strong>Output:</strong> 832040
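For reference, here is a minimal sketch of how the fill-in-the-middle prompt shown earlier could be assembled from the text before and after the cursor, and how the trailing special tokens seen in the first output could be trimmed. The helper names `build_fim_prompt` and `strip_stop_tokens` are hypothetical and are not part of LitGPT or transformers; the token strings are the ones that appear in the prompt and output above.

```python
# Special tokens taken from the prompt and the generated output above.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"
STOP_TOKENS = ("<|file_separator|>", "<eos>")


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: wrap the code before and after the cursor in FIM tokens."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def strip_stop_tokens(completion: str) -> str:
    """Hypothetical helper: cut the completion at the first stop token, if any."""
    cut = len(completion)
    for token in STOP_TOKENS:
        idx = completion.find(token)
        if idx != -1:
            cut = min(cut, idx)
    return completion[:cut]


# Code before the cursor (the cursor sits on an indented, empty line).
prefix = (
    "import datetime\n"
    "def calculate_age(birth_year):\n"
    '    """Calculates a person\'s age based on their birth year."""\n'
    "    current_year = datetime.date.today().year\n"
    "    "
)
# Code after the cursor.
suffix = "\n    return age"

prompt = build_fim_prompt(prefix, suffix)
print(prompt)
print(strip_stop_tokens("age = current_year - birth_year<|file_separator|><eos>"))
```

The resulting `prompt` string can be passed to the tokenizer in place of the hand-written one. Wiring something like this into LitGPT's prompt handling would still need a way to pass the prefix and suffix separately, which is the tricky part mentioned above.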
On the other hand, the 7b-it model is a much better proposition: it can chat and follow instructions.
⚡ codegemma ~/lit-gpt litgpt chat --checkpoint_dir checkpoints/$repo_id
Now chatting with CodeGemma-7b-it.
To exit, press 'Enter' on an empty prompt.
Seed set to 1234
>> Prompt: Hello
>> Reply: Hello! 👋 It's nice to hear from you. What would you like to talk about today? 😊
Time for inference: 1.70 sec total, 12.92 tokens/sec, 22 tokens
>> Prompt: Tell me a joke
>> Reply: Why did the bicycle fall over?
Because it was two tired!
Time for inference: 0.77 sec total, 18.08 tokens/sec, 14 tokens
>> Prompt: Write a softmax function in Python
>> Reply: ```python
import numpy as np

def softmax(x):
    """Compute softmax function for a given input array.

    Args:
        x: A numpy array.

    Returns:
        A numpy array of the same shape as x, containing the softmax values.
    """
    # Calculate the exponential of each element in x
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    # Sum the exponentials of all elements in the same row
    sum_exps = np.sum(exps, axis=1, keepdims=True)
    # Divide the exponentials by the sum of exponentials in the same row
    softmax_values = exps / sum_exps
    return softmax_values
```
Time for inference: 8.96 sec total, 19.30 tokens/sec, 173 tokens
My recommendation is to stick to the 7b-it model. What do you think, @rasbt?
There's CodeGemma now, which may be a good alternative to CodeLlama, which doesn't work so well. (In case you ever feel bored @Andrei-Aksionov :D)
All the various versions are listed here: https://huggingface.co/collections/google/codegemma-release-66152ac7b683e2667abdee11