bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License

IndexError: tuple index out of range #532

Open jaskirat8 opened 8 months ago

jaskirat8 commented 8 months ago

To get bootstrapped, I tried to use the example from the README:

from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM
import torch 

# Choose any model available at https://health.petals.dev
model_name = "petals-team/StableBeluga2"  # This one is fine-tuned Llama 2 (70B)

# Connect to a distributed network hosting model layers
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)

# Run the model as if it were on your computer
inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(input_ids=inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))  # A cat sat on a mat...

The torch_dtype=torch.float32 argument was added because of a CPU support warning; apart from that, the code is the same as the original example, yet I am hitting the error and unable to complete inference.

(Screenshot: traceback ending in IndexError: tuple index out of range)

OS: Ubuntu 22.04, CPU: Intel i7-7700K, GPU: Nvidia GTX 1070

Please advise if I am missing something here.

AIAnytime commented 8 months ago

Same error... this doesn't work anymore. The creators are also not active.

daspartho commented 8 months ago

+1

Running into the same error while trying to generate via the model.generate() method in the Getting Started Colab notebook.

(Screenshot: the same IndexError: tuple index out of range traceback from model.generate())

daspartho commented 8 months ago

Found a relevant issue: huggingface/transformers#10160

daspartho commented 8 months ago

This seems to be an issue with the petals library itself rather than with the transformers library, since replacing AutoDistributedModelForCausalLM with AutoModelForCausalLM works fine.
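
For anyone who wants to reproduce the comparison, here is a minimal sketch (the model name, dtype, and generation length are just illustrative, and the plain-transformers path needs enough local memory to hold the full model; the only difference between the two paths is the model class):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # illustrative; any model listed at https://health.petals.dev
tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]

def try_generate(model_cls):
    # Load the model with the given class and run a short generation
    model = model_cls.from_pretrained(model_name, torch_dtype=torch.float32)
    outputs = model.generate(input_ids=inputs, max_new_tokens=16)
    return tokenizer.decode(outputs[0])

# Plain transformers: loads the full weights locally and generates without errors
# print(try_generate(AutoModelForCausalLM))

# Petals distributed inference: this path raises IndexError: tuple index out of range
print(try_generate(AutoDistributedModelForCausalLM))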

jaskirat8 commented 8 months ago

@daspartho, I have the same thought: I have been using these models directly for months and they work. I just wanted to confirm that the failure is not caused by a misconfiguration or an overlooked setting on my end. We need to isolate the culprit and work towards a PR, since other folks are also facing this.
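
One hypothetical narrowing-down step (just a sketch, not something I have run yet): check whether a plain forward pass through the distributed model succeeds, which would pin the failure to the generation loop rather than to the remote forward pass itself.

import torch
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]

with torch.no_grad():
    # Single forward pass only, no generate(); if this succeeds, the bug is
    # most likely in the generation / inference-session code path
    logits = model(input_ids=inputs).logits

print(logits.shape)  # expected: (1, sequence_length, vocab_size)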

daspartho commented 8 months ago

Yes, I agree!

Also gently pinging @borzunov here.

justheuristic commented 8 months ago

[working on it]