huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Regex response type is not respected #2318

Closed aymeric-roucher closed 1 week ago

aymeric-roucher commented 1 month ago

System Info

Using TGI through Inference Endpoints with this endpoint.

Reproduction

This is the example from the doc.

from huggingface_hub import InferenceClient

client = InferenceClient("https://o9blasawqn0vtw5b.us-east-1.aws.endpoints.huggingface.cloud")

regexp = "((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)"

resp = client.text_generation(
    f"What is Googles DNS? Please use the following regex: {regexp}",
    seed=42,
    grammar={
        "type": "regex",
        "value": regexp,
    },
)

print(resp)

I get output: 1.1.1.1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

Expected behavior

I would expect output that matches the regex, so the last number should be between 0 and 255, not a long sequence of 1s.
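To make the mismatch concrete, here is a minimal local check using Python's standard `re` module. It is only a sketch: `re` stands in for whatever grammar engine the server uses, the helper name `matches_fully` is made up for illustration, and the number of trailing 1s in the stand-in output is an assumption.

```python
import re

# Regex copied from the reproduction above, written as a raw string.
IP_REGEX = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"


def matches_fully(pattern: str, text: str) -> bool:
    """Return True only if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Stand-in for the reported output; the exact count of trailing 1s is an assumption.
reported_output = "1.1.1." + "1" * 96

print(matches_fully(IP_REGEX, "8.8.8.8"))        # a well-formed address
print(matches_fully(IP_REGEX, reported_output))  # the runaway output
```

Under Python's `re`, the first call prints `True` and the second prints `False`, because the last octet of the pattern allows at most three digits.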

aymeric-roucher commented 1 month ago

Using the models served on the Inference API instead seems to work, though:

from huggingface_hub import InferenceClient

client = InferenceClient("https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3.1-8B-Instruct")

regexp = "((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)"

resp = client.text_generation(
    f"What is Googles DNS? Please use the following regex: {regexp}",
    seed=42,
    grammar={
        "type": "regex",
        "value": regexp,
    },
)

print(resp)

ErikKaum commented 1 month ago

Thanks for reporting @aymeric-roucher 🙌

I wonder if it's technically respecting the grammar 🤔 my hypothesis:

I've seen similar behavior with e.g "\n", especially on 8B and smaller models.

What's weird is that you get a different result on the Inference endpoint and API.

I'll ping @drbh for this one as well 👍

drbh commented 1 week ago

Hey @aymeric-roucher, thanks for pointing this out. I believe there were a couple of issues with the regex I originally added to the docs. I think the \\d? notation may have caused subtle issues with the grammar compilation. A similar but simpler and valid IP grammar would be (((25[0-5]|2[0-4]|[01])\.){3}(25[0-5]|2[0-4]|[01])), which would match 1.1.1.1, where the final 1 should appear only once. Apologies for any confusion!
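For anyone who wants to sanity-check the simplified pattern locally, here is a rough sketch using Python's `re` module. This is an assumption-laden stand-in: TGI's grammar compiler may not behave identically to `re`, and the candidate strings (including the length of the run of 1s) are chosen only for illustration.

```python
import re

# Simplified pattern quoted in the comment above, written as a raw string.
SIMPLIFIED_REGEX = r"(((25[0-5]|2[0-4]|[01])\.){3}(25[0-5]|2[0-4]|[01]))"

# Illustrative candidates; the long run of 1s approximates the originally reported output.
candidates = ["1.1.1.1", "1.1.1." + "1" * 96, "8.8.8.8"]

# Map each candidate to whether the whole string matches the pattern.
results = {
    text: re.fullmatch(SIMPLIFIED_REGEX, text) is not None
    for text in candidates
}

for text, ok in results.items():
    print(f"{text[:20]!r}: {ok}")
```

With Python's `re`, `1.1.1.1` matches and the runaway string does not. Note that `8.8.8.8` does not match either, since each octet in this pattern can only be 0, 1, 20 to 24, or 250 to 255, so the pattern is narrower than a general IPv4 matcher.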

I've just opened a PR to update the docs to use a different regex that is easier to read and reuse: https://github.com/huggingface/text-generation-inference/pull/2468

drbh commented 1 week ago

Closing, as https://github.com/huggingface/text-generation-inference/pull/2468 was merged; the updated docs are available here: https://huggingface.co/docs/text-generation-inference/en/basic_tutorials/using_guidance#constrain-with-pydantic