tunib-ai / parallelformers

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
https://tunib-ai.github.io/parallelformers
Apache License 2.0
766 stars 61 forks source link

[Feature Request] Add Bloom to the Auto Policy #36

Open airsplay opened 1 year ago

airsplay commented 1 year ago

Add Bloom to the Auto Policy

It would be great to see the recent bloom model from bigscience can be added to the auto policy. The Bloom model is another auto-regressive large language model thus the policy might be inherited from existing policies.

Expected behavior

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-2b5")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-2b5")

from parallelformers import parallelize
parallelize(model, num_gpus=2, fp16=True, verbose='detail')

inputs = tokenizer("Parallelformers is", return_tensors="pt")

outputs = model.generate(
    **inputs,
    num_beams=5,
    no_repeat_ngram_size=4,
    max_length=15,
)

print(f"Output: {tokenizer.batch_decode(outputs)[0]}")
csinva commented 1 year ago

+1!!!

seopbo commented 1 year ago

This is great.