guidance-ai / guidance

A guidance language for controlling large language models.
MIT License

How to truncate? #849

Open yileitu opened 4 months ago

yileitu commented 4 months ago

When the input exceeds the model's maximum context window size (here I use Llama 2, whose context window is 4096 tokens), an exception is raised. Here's the traceback for reference:

Traceback (most recent call last):
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/cluster/project/sachan/yilei/projects/lang_specific_neurons/NER/ner_act.py", line 89, in <module>
    guidance_model += f"{data[i]['tokens']} --- " + gen(name="ner_output", stop=["\n"])
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/site-packages/guidance/models/_model.py", line 1159, in __add__
    out = lm._run_stateless(value)
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/site-packages/guidance/models/_model.py", line 1364, in _run_stateless
    for chunk in gen_obj:
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/site-packages/guidance/models/_model.py", line 760, in __call__
    logits = self.get_logits(token_ids, forced_bytes, current_temp)
  File "/cluster/project/sachan/yilei/anaconda3/envs/subnet/lib/python3.10/site-packages/guidance/models/transformers/_transformers.py", line 226, in get_logits
    raise Exception(
Exception: Attempted to run a transformers model past its maximum context window size of 4096!

I understand that this error occurs because the input length exceeds the model's maximum context window. Is there a way to truncate the input automatically when it exceeds the maximum length? What parameter should I set?
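In the meantime, one workaround is to truncate the prompt yourself before handing it to guidance. The sketch below is an assumption, not a guidance feature: `encode` and `decode` stand in for your tokenizer's methods (e.g. a Hugging Face tokenizer's `encode`/`decode`), and `reserve` is a hypothetical margin left free for the generated tokens.

```python
def truncate_to_window(text, encode, decode, max_tokens, reserve=512):
    """Truncate `text` so the prompt plus up to `reserve` generated
    tokens fit in a context window of `max_tokens` tokens.
    Keeps the most recent tokens (the tail of the prompt)."""
    budget = max_tokens - reserve
    ids = encode(text)
    if len(ids) <= budget:
        return text
    # Drop tokens from the front, keeping the last `budget` tokens.
    return decode(ids[-budget:])

# Hypothetical usage with a Hugging Face tokenizer (note that
# tokenizer.encode may add special tokens unless told otherwise):
# tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# prompt = truncate_to_window(long_text, tok.encode, tok.decode, 4096)
# guidance_model += prompt + gen(name="ner_output", stop=["\n"])
```

Round-tripping through `decode` can slightly alter whitespace with some tokenizers, so truncating the raw text at a character boundary is a cruder but safer alternative if exact reconstruction matters.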