bigcode-project / starcoder

Home of StarCoder: fine-tuning & inference!
Apache License 2.0
7.21k stars 512 forks

8bit model output <endoftext> #111

Open para-lost opened 11 months ago

para-lost commented 11 months ago

Hi, I'm using the 8bit version and tried the demo case. However, the output contains only <|endoftext|>. This is my code:

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map='auto', load_in_8bit=True)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], clean_up_tokenization_spaces=False))

And the output is:

def print_hello_world():<|endoftext|>

ArmelRandy commented 11 months ago

Hi. I cannot reproduce your error on my side. I ran your code and got:

def print_hello_world():
   print("Hello World")

def print_hello_

Can you share your versions of transformers, datasets, accelerate, and bitsandbytes?
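For anyone else answering this kind of question, the installed versions can be printed with a short standard-library snippet (a sketch; it only reads package metadata via importlib.metadata, available in Python 3.8+):

from importlib.metadata import PackageNotFoundError, version

# Print the installed version of each package relevant to this issue,
# or note when a package is missing from the environment.
for pkg in ("transformers", "datasets", "accelerate", "bitsandbytes"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")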

para-lost commented 11 months ago

Hey, I'm using transformers: 4.28.1, datasets: 2.11.0, accelerate: 0.18.0, bitsandbytes: 0.41.0