hudson-ai / minml

Remove the yammering from LLM outputs.
MIT License

Remove the yammering from LLM outputs. #2

Closed notarealdeveloper closed 9 months ago

notarealdeveloper commented 9 months ago

Hiya!

I'm trying to build some tools on top of @rskottap's gpts library, but I've noticed the LLMs it provides tend to yammer.

For example:

>>> import gpts
>>> model = gpts.Mistral()
>>> model("How many ounces are there in a pound?")
'There are 16 ounces in one pound. This relationship is often expressed as "1 pound equals 16 ounces." This conversion factor holds true for both dry and liquid measurements of weight or mass.'

I've tried stopping the yammering by asking directly.

>>> model("How many ounces are there in a pound? Don't yammer. Just give me the answer.")
'There are 16 ounces in one pound.'

By asking politely.

>>> model("How many ounces are there in a pound? For the love of god, stop yammering. Just give me the answer as a number. If your output isn't just a number, I'm going to scream.")
'There are 16 ounces in one pound.'

And by threatening the model with mass casualties.

>>> model("How many ounces are there in a pound? I'm going to pass your answer to the `int()` builtin in some python code, and if it raises an exception, then this plane is going to crash and people are going to die.")
'There are 16 ounces in one pound. You can convert between pounds and ounces by multiplying or dividing by 16. In your Python code, you should be able to represent this conversion as follows:\n\npounds = 2.5\nounces = pounds * 16\nint_ounces = int(ounces)\n\nIn this example, `pounds` has a value of 2.5 (representing 2 pounds and 12 ounces), which is then multiplied by 16 to get the equivalent weight in ounces. The result is then passed through the `int()` function for potential conversion to an integer, although it should not raise an exception since the value is already an integer before this step.'

Nothing works.
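(For contrast, the only alternative I can see is post-hoc parsing, which is exactly the kind of brittleness I'm trying to avoid. A sketch of that workaround, with a hypothetical `first_int` helper, not anything from your library:)

```python
import re

def first_int(reply: str) -> int:
    """Brittle workaround: scrape the first integer out of a verbose reply."""
    match = re.search(r"\d+", reply)
    if match is None:
        raise ValueError("no number in reply")
    return int(match.group())

first_int("There are 16 ounces in one pound.")  # 16
```

(Brittle because a reply that happens to lead with "1 pound equals 16 ounces" returns 1, not 16.)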

I searched github for the words "remove", "yammering" and "LLMs" and your library came up.

I was wondering if you could give me some guidance.

How might I use your library to add a .multiple_choice() method to an arbitrary model in the gpts library?

hudson-ai commented 9 months ago

Your particular use-case seems better suited to gen_int or gen_float, e.g.

from guidance import block
from minml import gen_float
model += "How many ounces are there in a pound?"  # Assuming you have a fresh model obj from `guidance.models`
with block("answer"):
    model += gen_float()
assert float(model["answer"]) == 16  # float(), since gen_float may emit e.g. "16.0"

We don't yet support multiple choice types, but it would be a great and easy first PR if you're looking to contribute (you could implement gen_literal on typing.Literals).

If you're comfortable using the underlying guidance library that's used to implement most of the functionality supplied by minml, you could accomplish that with a call to select (the Literal implementation would be a very thin wrapper around this).

E.g.

from guidance import block, select
model += "How many ounces are there in a pound?"  # Assuming you have a fresh model obj from `guidance.models`
with block("answer"):
    model += select([15, 16, 17])
assert int(model["answer"]) == 16
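(A minimal `gen_literal` along those lines, sketched here with a stub in place of `select` since this isn't the actual minml implementation, would just unpack the `typing.Literal`'s options:)

```python
from typing import Literal, get_args, get_origin

def gen_literal(literal_type):
    """Hypothetical gen_literal: a thin wrapper turning typing.Literal[...] into select()."""
    assert get_origin(literal_type) is Literal, "expected a typing.Literal"
    options = list(get_args(literal_type))
    # In minml proper this line would be: return select(options)
    return options

gen_literal(Literal[15, 16, 17])  # [15, 16, 17]
```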
rskottap commented 9 months ago

^^^ This would be super helpful for validating the LLMs' output and making the gpts library more automatable.

rskottap commented 9 months ago

@hudson-ai after doing `pip install --upgrade minml` I just tried your code above, but am getting this error:

>>> from guidance import block
>>> from minml import gen_float
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'gen_float' from 'minml'
hudson-ai commented 9 months ago

Ah, I haven't pushed to PyPI in a while. I will do so when I get the chance, but please clone and run `make install` in the meantime. Master should be stable/working at the time of writing this, but 0.0.2 should have the functionality you need.

notarealdeveloper commented 9 months ago

Cool, I think this issue is resolved.

Got this code into a PR to gpts.

[Screenshot from 2024-02-12 16-44-53: the code added in the gpts PR]

Now we've got another issue for ya.

hudson-ai commented 9 months ago

@notarealdeveloper note that minml already takes care of that lookup table for you in a single integrated interface: gen_type.

E.g. gen_type(int) == gen_int(), gen_type(list[str]) == gen_list(str), etc. ;)
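(For anyone following along, the dispatch shape that `gen_type` provides can be sketched like this; the generator bodies below are stand-in stubs, not minml's actual implementation:)

```python
from typing import get_args, get_origin

# Stand-ins for the real minml generators, just to show the dispatch shape.
def gen_int():
    return "<int grammar>"

def gen_float():
    return "<float grammar>"

def gen_list(item_type):
    return f"<list of {item_type.__name__} grammar>"

def gen_type(t):
    """Dispatch a (possibly parameterized) type to the matching generator."""
    if get_origin(t) is list:
        (item,) = get_args(t)  # e.g. list[str] -> str
        return gen_list(item)
    if t is int:
        return gen_int()
    if t is float:
        return gen_float()
    raise TypeError(f"unsupported type: {t!r}")

gen_type(int)        # same result as gen_int()
gen_type(list[str])  # same result as gen_list(str)
```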

notarealdeveloper commented 9 months ago

> @notarealdeveloper note that minml already takes care of that lookup table for you in a single integrated interface: gen_type.
>
> E.g. gen_type(int) == gen_int(), gen_type(list[str]) == gen_list(str), etc. ;)

Ooh nice!

Will check that out tomorrow. :)