Guaranteeing valid output syntax fails

q-jackboylan commented 1 year ago

The bug: When I run the "Guaranteeing valid output syntax" notebook, the model generates additional fields that the prompt did not ask for. program._exception does not contain any error.

To Reproduce: I'm running this on a Debian GCP VM instance, but I can also reproduce it on Google Colab.

# !pip install guidance transformers accelerate bitsandbytes

import guidance
import transformers

model_id = "huggyllama/llama-7b"
model = transformers.AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True, device_map="auto")
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

# we use LLaMA here, but any GPT-style model will do
guidance.llm = guidance.llms.Transformers(model=model, 
                                          tokenizer=tokenizer, 
                                          # device=0
                                          )

# we can pre-define valid option sets
valid_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]

# define the prompt
program = guidance("""The following is a character profile for an RPG game in JSON format.
```json
{
    "description": "{{description}}",
    "name": "{{gen 'name'}}",
    "age": {{gen 'age' pattern='[0-9]+' stop=','}},
    "armor": "{{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}",
    "weapon": "{{select 'weapon' options=valid_weapons}}",
    "class": "{{gen 'class'}}",
    "mantra": "{{gen 'mantra'}}",
    "strength": {{gen 'strength' pattern='[0-9]+' stop=','}},
    "items": [{{#geneach 'items' num_iterations=3}}
        "{{gen 'this'}}",{{/geneach}}
    ]
}```""")

# execute the prompt
out = program(description="A quick and nimble fighter.", valid_weapons=valid_weapons)

System info (please complete the following information):

OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Debian 11 bullseye
Guidance Version (guidance.__version__): guidance==0.0.64

CivilEngineerUK commented 1 year ago

I am also having this issue. The output contains additional JSON items that were not requested, and the fields that are actually in the provided JSON schema are not completed. I have tried with OpenAI gpt-4 and OpenAI text-davinci-003. No errors are thrown.

I have used the following notebook: https://github.com/guidance-ai/guidance/blob/main/notebooks/guaranteeing_valid_syntax.ipynb

OS - Windows 11
Guidance Version (guidance.__version__): guidance==0.0.64

CorentinvdBdO commented 12 months ago

I work around this by setting stop=',' on every field.

However, I agree: all of the examples fail to work properly for me.

Wehzie commented 11 months ago

Thank you, same problem here with local Hugging Face models.

PiotrCzapla commented 11 months ago

I've noticed that when stop is not set, gen tries to figure out the stop word like this:

next_text = getattr(next_node, "text", next_node) if next_node is not None else ""

Unfortunately, when next_node is ParseResults(['\\n```'], {}), the text property returns an empty string ('').

I'm not sure yet how to get the proper string out of the ParseResults, but it seems that .text no longer works in the recent handlebar version.
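To illustrate the failure mode described above, here is a minimal sketch that does not use guidance or the real ParseResults class. FakeParseResults is a hypothetical stand-in, and it assumes that the real object answers unknown attribute names with an empty string instead of raising AttributeError. Under that assumption, the default argument passed to getattr is never used, so the inferred stop text is empty. infer_next_text_fallback is also hypothetical; it only shows one possible way to recover the string and is not the library's fix.

```python
class FakeParseResults(list):
    """Hypothetical stand-in for ParseResults (not the real class)."""

    def __getattr__(self, name):
        # Assumption: unknown attribute names yield "" rather than AttributeError.
        return ""


def infer_next_text(next_node):
    # Same expression as the line quoted above.
    return getattr(next_node, "text", next_node) if next_node is not None else ""


def infer_next_text_fallback(next_node):
    """Hypothetical variant: fall back to the node's own contents when .text is empty."""
    if next_node is None:
        return ""
    text = getattr(next_node, "text", "")
    if text:
        return text
    if isinstance(next_node, str):
        return next_node
    # Treat the node as a sequence of string tokens and join them.
    return "".join(str(part) for part in next_node)


# A newline followed by three backticks, i.e. the text that follows the last field.
closing_fence = "\n" + "`" * 3
node = FakeParseResults([closing_fence])

inferred = infer_next_text(node)            # empty, so no stop text is inferred
recovered = infer_next_text_fallback(node)  # the newline plus closing fence
print(repr(inferred), repr(recovered))
```

If the stop text really is inferred as empty, gen has nothing to stop on, which would be consistent with the extra fields reported in this issue and with explicit stop values working around it.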

Guaranteeing valid output syntax fails #360