Flexibility and consistency for few-shot prompting in query_for_json

jas-ho commented 11 months ago

This is about the following two related issues which @gavento might have ideas about

jas-ho commented 11 months ago

1. few-shot prompting sometimes needs more flexibility

In my use case, I have a judge model which decides whether a previous model output was informative and responds with the following structure:

{
  "reasoning": "The model output was very vague",
  "informative": false
}

Now I want to provide few-shot examples and would like to have a utility for getting the format exactly right.

I imagine something like:

FEW_SHOT = f"""
model: {UNINFORMATIVE_OUTPUT}
judge: {format_example(TOut(reasoning="The model output was very vague", informative=False))}

model: {INFORMATIVE_OUTPUT}
judge: {format_example(TOut(reasoning="The model output was extremely detailed", informative=True))}
"""

Does interlab contain something like this?

jas-ho commented 11 months ago

Edit: sorry, I think I misread the code. If I understand correctly, the issue below does not actually exist:

2. inconsistency in formatting examples:

If I understand the code in query_for_json correctly, there is a formatting inconsistency in the following lines:

    if with_example is True:
        with_example = generate_json_example(schema, engine=example_engine)
    if with_example and not isinstance(with_example, str):
        with_example = json.dumps(with_example)

If with_example is True, we get a JSON dump delimited by ```json ... ```.
If with_example is an example instance (an object of type TOut), we get only the JSON dump, without the triple-backtick delimiters.

jas-ho commented 11 months ago

3. bug if an example instance is provided.

The following test fails for the second subtest with TypeError: Object of type Foo is not JSON serializable coming from the with_example = json.dumps(with_example) line above:

@pytest.mark.parametrize("with_example", [False, Foo(z=False, x=33, y=["a", "b"])])
def test_query_for_json(with_example):
    def eng(q):
        if "TEST_A" in q:
            return "heh {'z': \"0\"} zzz"
        if "TEST_B" in q:
            return "```{'z': \"0\", 'x': 33}``` {'z': 1}"
        if "TEST_C" in q:
            assert re.search(FOO_SCHEMA_RE, q, re.MULTILINE)
            return "{'x':3}"
        raise Exception()

    query_for_json = partial(json_querying.query_for_json, with_example=with_example)

    assert query_for_json(eng, Foo, "TEST_A") == Foo(z=False)
    assert query_for_json(eng, Foo, "{absent} TEST_B {FORMAT_PROMPT} zzz") == Foo(z=False, x=33)
    with pytest.raises(ValueError):
        query_for_json(eng, Foo, "{FORMAT_PROMPT} TEST_A {FORMAT_PROMPT}")
    assert query_for_json(eng, Foo, "z {FORMAT_PROMPT}TEST_C") == Foo(x=3)
    assert query_for_json(eng, Foo, "TEST_C") == Foo(x=3)

jas-ho commented 11 months ago

The above is solved by:

from fastapi.encoders import jsonable_encoder
...
with_example = json.dumps(jsonable_encoder(with_example))
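For readers without the test suite at hand, here is a minimal stdlib-only sketch of the failure and of the general shape of the fix. It rests on assumptions: Foo is a hypothetical dataclass standing in for the type used in the test, dump_example is an illustrative helper rather than interlab code, and dataclasses.asdict plays the role that jsonable_encoder plays in the fix above.

```python
import json
from dataclasses import asdict, dataclass, field, is_dataclass


@dataclass
class Foo:
    # Hypothetical stand-in for the Foo type used in the test above.
    z: bool
    x: int = 1
    y: list = field(default_factory=list)


def dump_example(example) -> str:
    # Strings pass through unchanged; dataclass instances are converted to
    # plain dicts first; anything else goes to json.dumps as-is.
    if isinstance(example, str):
        return example
    if is_dataclass(example) and not isinstance(example, type):
        example = asdict(example)
    return json.dumps(example)


foo = Foo(z=False, x=33, y=["a", "b"])

try:
    json.dumps(foo)  # passing the instance straight to json.dumps
    raised = False
except TypeError:
    raised = True  # "Object of type Foo is not JSON serializable"

print(raised)
print(dump_example(foo))
```

The point is only that the object has to be turned into JSON-compatible builtins before json.dumps sees it; jsonable_encoder does this for a wider range of types than this sketch handles.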

gavento commented 11 months ago

Thanks! Re your points:

1. Not sure how much of a problem this is. Are you asking for a standard example formatter? What should it look like, in your view? (Or what best practices are there?)
2. Agree, but improved while addressing point 3.
3. Great point, thanks for showing me jsonable_encoder. Improved by #27.

gavento commented 11 months ago

(Not closing yet to allow discussion of point 1. by @jas-ho )

gavento commented 11 months ago

@jas-ho Can you please elaborate on your first point? (few-shot example formatting)

jas-ho commented 11 months ago

@gavento I think it's actually fine as is (so feel free to close if you agree):

the formatting of individual samples is trivial (I'm using the one-liner below)
the best way to compose these into few-shot samples is too application-specific to hard-code (at least I wouldn't want other people to rely on my choices at my current state of knowledge)

    def format_as_json_example(obj):
        return f"```json\n{json.dumps(jsonable_encoder(obj))}\n```"
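A minimal sketch of how such a one-liner could be composed into the few-shot block from the first comment. Everything beyond the one-liner is an assumption: TOut is a hypothetical dataclass, build_few_shot is an illustrative helper (not part of interlab), the two model outputs are placeholders, and dataclasses.asdict stands in for jsonable_encoder so the sketch needs only the standard library. The model/judge layout is just one possible choice.

```python
import json
from dataclasses import asdict, dataclass

FENCE = "`" * 3  # triple backtick, built this way to keep the sketch easy to paste


@dataclass
class TOut:
    # Hypothetical stand-in for the judge's output type.
    reasoning: str
    informative: bool


def format_as_json_example(obj) -> str:
    # Same output shape as the one-liner above, with asdict replacing
    # jsonable_encoder (assumption: obj is a dataclass instance).
    return f"{FENCE}json\n{json.dumps(asdict(obj))}\n{FENCE}"


def build_few_shot(examples) -> str:
    # examples: iterable of (model_output, judge_verdict) pairs.
    blocks = [
        f"model: {output}\njudge: {format_as_json_example(verdict)}"
        for output, verdict in examples
    ]
    return "\n\n".join(blocks)


# Placeholder model outputs, for illustration only.
UNINFORMATIVE_OUTPUT = "It depends."
INFORMATIVE_OUTPUT = "The function returns early because the list is empty."

FEW_SHOT = build_few_shot([
    (UNINFORMATIVE_OUTPUT,
     TOut(reasoning="The model output was very vague", informative=False)),
    (INFORMATIVE_OUTPUT,
     TOut(reasoning="The model output was extremely detailed", informative=True)),
])

print(FEW_SHOT)
```

Keeping the per-sample formatter separate from the composition step means the layout can change without touching how a single example is serialized.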

gavento commented 11 months ago

Sounds good, thanks.
