hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License

crash when running mosaicml/mpt-7b-* models: KeyError: 'attention_mask' #213

Open tarasglek opened 1 year ago

tarasglek commented 1 year ago
```python
from basaran.model import load_model

model = load_model(
    'mosaicml/mpt-7b-storywriter',
    trust_remote_code=True,
    load_in_8bit=True,
)

for choice in model("once upon a time"):
    print(choice)
```
```
Traceback (most recent call last):
  File "/home/taras/Documents/ctranslate2/basaran/run.py", line 7, in <module>
    for choice in model("once upon a time"):
  File "/home/taras/Documents/ctranslate2/basaran/.venv/lib/python3.9/site-packages/basaran/model.py", line 73, in __call__
    for (
  File "/home/taras/Documents/ctranslate2/basaran/.venv/lib/python3.9/site-packages/basaran/model.py", line 233, in generate
    inputs = self.model.prepare_inputs_for_generation(
  File "/home/taras/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-storywriter/8667424ea9d973d3c01596fcbb86a3a8bc164299/modeling_mpt.py", line 280, in prepare_inputs_for_generation
    attention_mask = kwargs['attention_mask'].bool()
KeyError: 'attention_mask'
```
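For context on the traceback: MPT's custom `prepare_inputs_for_generation` in `modeling_mpt.py` indexes `kwargs['attention_mask']` directly rather than using `kwargs.get(...)`, so any caller that does not forward an attention mask raises this KeyError. A minimal sketch of the failure pattern, with the model internals stubbed out (only the method name is real; everything else is illustrative):

```python
# Stubbed-out reproduction of the failure pattern. The method name
# `prepare_inputs_for_generation` comes from modeling_mpt.py; the
# bodies and arguments here are illustrative stand-ins.

def prepare_inputs_for_generation(input_ids, **kwargs):
    # MPT indexes kwargs directly, so a missing key raises KeyError:
    attention_mask = kwargs['attention_mask']
    return {'input_ids': input_ids, 'attention_mask': attention_mask}

# A caller that forwards an attention mask works:
prepare_inputs_for_generation([1, 2, 3], attention_mask=[1, 1, 1])

# A caller that omits it (as basaran's generate loop apparently does)
# crashes exactly as in the traceback above:
try:
    prepare_inputs_for_generation([1, 2, 3])
except KeyError as e:
    print('KeyError:', e)  # KeyError: 'attention_mask'
```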
tarasglek commented 1 year ago

The same crash happens with mosaicml/mpt-7b-instruct.

fardeon commented 1 year ago

The error appears to originate in MPT's custom modeling code rather than in Basaran itself. We'll run further tests.
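Until there is an upstream fix (in Basaran's generate loop or in MPT's remote code), one possible workaround is to wrap the model's `prepare_inputs_for_generation` so a default all-ones mask is injected whenever the caller omits one. This is an unverified sketch, not an official fix; the helper name is made up, and the all-ones default assumes the batch contains no padding:

```python
def patch_prepare_inputs(model):
    """Wrap model.prepare_inputs_for_generation so that a default
    all-ones attention_mask is supplied when the caller omits one.
    Illustrative workaround sketch, not an official fix."""
    original = model.prepare_inputs_for_generation

    def patched(input_ids, **kwargs):
        if kwargs.get('attention_mask') is None:
            # Assume no padding: attend to every position. new_ones
            # builds a tensor of ones on the same device/dtype family.
            kwargs['attention_mask'] = input_ids.new_ones(input_ids.shape)
        return original(input_ids, **kwargs)

    model.prepare_inputs_for_generation = patched
    return model
```

After `model = load_model(...)`, applying this to the underlying Transformers model (e.g. `patch_prepare_inputs(model.model)`, assuming Basaran stores it on `.model` as the traceback suggests) should avoid the KeyError for generation calls that don't pass a mask.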