epfl-dlab / transformers-CFG

🤗 A specialized library for integrating context-free grammars (CFG) in EBNF with Hugging Face Transformers
http://saibo-creator.xyz:7860/
MIT License

🤗 Transformers CFG

Python 3.8+ License

💭 Latest News

We are thrilled to announce that transformers-cfg has been integrated into the Text-Generation-WebUI project, enabling users to utilize our CFG capabilities within this popular web interface for text generation. For more details, see the relevant pull request.

🚀 Introduction

transformers-cfg is an extension library for the popular Transformers library by Hugging Face, tailored for working with context-free grammars (CFG). It provides tools for constraining text generation so that a model's output conforms to a grammar you specify.

The library was initially developed as a pull request to the Hugging Face Transformers library; you can find the relevant discussion here.

💻 Installation

🔧 Quick Start: Force LLM to Generate a Valid JSON Object

Command-Line Interface

transformers-cfg-cli is a command-line tool that allows you to generate text using a model and a grammar. You can specify the model, grammar, prompts, and other parameters to generate text that conforms to the specified grammar.

```bash
transformers-cfg-cli generate \
    -m "microsoft/Phi-3-mini-4k-instruct" \
    -g "examples/grammars/json.ebnf" \
    -p "This is a valid json string for http request:" \
    --use_4bit \
    --max_new_tokens 60 \
    --repetition_penalty 1.1
# {"name":"John","age":30,"car":null}
```
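
One way to confirm that a generation such as the sample output above is well-formed is to parse it with Python's standard `json` module. This is a minimal sketch; `is_valid_json` is a hypothetical helper name and is not part of transformers-cfg.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Sample output copied from the CLI example above.
sample = '{"name":"John","age":30,"car":null}'
parsed = json.loads(sample)

print(is_valid_json(sample))      # a complete object
print(is_valid_json('{"name":'))  # a truncated object
print(parsed["name"], parsed["age"], parsed["car"])
```

Because generation stops once `--max_new_tokens` is reached, a long object may be cut off before its closing brace, so a check like this can still be useful downstream.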

We also support Unicode characters in the grammar:

```bash
transformers-cfg-cli generate \
    -m "microsoft/Phi-3-mini-4k-instruct" \
    -g "examples/grammars/chinese.ebnf" \
    -p "Translate the following sentence into Chinese: My neighbor is a very nice person. -> " \
    --use_4bit \
    --max_new_tokens 60 \
    --repetition_penalty 1.1
```

transformers-cfg-cli generate --help provides a list of available options and arguments.

The following example generates a JSON object with minimal changes to standard Hugging Face code (also available in examples/generate_json.py):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor

if __name__ == "__main__":
    # Detect if GPU is available, otherwise use CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")

    model_id = "mistralai/Mistral-7B-v0.1"

    # Load model and tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
    model.generation_config.pad_token_id = model.generation_config.eos_token_id

    # Load JSON grammar
    with open("examples/grammars/json.ebnf", "r") as file:
        grammar_str = file.read()
    grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer)
    grammar_processor = GrammarConstrainedLogitsProcessor(grammar)

    # Generate
    prompts = [
        "This is a valid json string for http request:",
        "This is a valid json string for shopping cart:",
    ]
    # Move the inputs to the same device as the model
    input_ids = tokenizer(
        prompts, add_special_tokens=False, return_tensors="pt", padding=True
    )["input_ids"].to(device)

    output = model.generate(
        input_ids,
        max_length=50,
        logits_processor=[grammar_processor],
        repetition_penalty=1.1,
        num_return_sequences=1,
    )

    # Decode output
    generations = tokenizer.batch_decode(output, skip_special_tokens=True)
    print(generations)

    """
    'This is a valid json string for http request:{ "request": { "method": "GET", "headers": [], "content": "Content","type": "application" }}'
    'This is a valid json string for shopping cart:{ "name": "MyCart", "price": 0, "value": 1 }'
    """
```

The same constraint also works with the HF pipeline API (also available in examples/pipeline_json.py):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor

# Device and model are assumed to match the previous example;
# the original snippet used them without defining them
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_id = "mistralai/Mistral-7B-v0.1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

# Load grammar
with open("examples/grammars/json.ebnf", "r") as file:
    grammar_str = file.read()
grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer)
grammar_processor = GrammarConstrainedLogitsProcessor(grammar)

# Initialize pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
    max_length=50,
    batch_size=2,
)

generations = pipe(
    [
        "This is a valid json string for http request: ",
        "This is a valid json string for shopping cart: ",
    ],
    do_sample=False,
    logits_processor=[grammar_processor],
)
```
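
Conceptually, a grammar-constrained logits processor masks out every token the grammar does not allow at the current step, so the model can only choose among valid continuations. The toy sketch below illustrates that idea in plain Python. It is not the implementation of `GrammarConstrainedLogitsProcessor`; the vocabulary, the scores, and the `ALLOWED_NEXT` table are made up for illustration.

```python
import math

# Hypothetical 5-token vocabulary (illustration only)
VOCAB = ["{", "}", '"a"', ":", "hello"]

# Toy "grammar": for each previously emitted token (None = start),
# the set of tokens that may come next. It accepts only the string {"a"}.
ALLOWED_NEXT = {
    None: {"{"},
    "{": {'"a"'},
    '"a"': {"}"},
}


def mask_logits(logits, last_token):
    """Set the score of every token the toy grammar disallows to -inf."""
    allowed = ALLOWED_NEXT.get(last_token, set())
    return [
        score if tok in allowed else -math.inf
        for tok, score in zip(VOCAB, logits)
    ]


def greedy_decode(step_logits):
    """Greedy decoding with the mask applied before each argmax."""
    out, last = [], None
    for logits in step_logits:
        masked = mask_logits(logits, last)
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        if masked[best] == -math.inf:
            break  # no token is allowed: the toy grammar is complete
        last = VOCAB[best]
        out.append(last)
    return "".join(out)


# The unconstrained scores favor "hello" at every step,
# but the mask forces the output to follow the toy grammar.
raw = [[0.1, 0.2, 0.3, 0.0, 5.0]] * 4
print(greedy_decode(raw))
```

In the real library the set of allowed tokens is derived from the EBNF grammar and the tokenizer's vocabulary rather than from a hand-written table.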

💡 Why Should I Use transformers-cfg?

🤔 What Is a Grammar?

TL;DR: Think of it as an enhanced version of regular expressions.

Here is a simple example of a JSON grammar:

```bnf
# A JSON object is the root of the grammar
root ::= object

# An object starts with "{" and ends with "}" and contains pairs separated by ","
object ::= "{" pair ("," pair)* "}"

# A pair is a string followed by a ":" and a value
pair ::= string ":" value

# A string is a sequence of alphanumeric characters enclosed in double quotes
string ::= '"' [a-zA-Z0-9]* '"'

# A value can be a string, another object, a boolean, or null
value ::= string | object | "true" | "false" | "null"
```

This grammar describes the structure of a simplified JSON object. It specifies that an object consists of one or more key-value pairs, where the key is a string and the value can be a string, another object, a boolean, or null.

You can use grammars to describe simple but useful constructs, such as valid email addresses, URLs, or phone numbers:

```
phone_number ::= "+" [0-9]+
```
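
To see why a grammar is more expressive than a regular expression, compare the two examples above. `phone_number` is flat and can be checked with a regex, while `object` refers to itself through `value`, so nesting depth is unbounded and a recursive recognizer is the natural fit. The sketch below hand-codes both in plain Python. The function names are hypothetical, and the recognizer follows the simplified grammar literally (no whitespace, numbers, or arrays); it does not use transformers-cfg.

```python
import re

PHONE = re.compile(r"\+[0-9]+")
STRING = re.compile(r'"[a-zA-Z0-9]*"')


def is_phone_number(text):
    # phone_number ::= "+" [0-9]+
    return PHONE.fullmatch(text) is not None


# Each parse_* function returns the index just past the match, or -1.

def parse_string(s, i):
    # string ::= '"' [a-zA-Z0-9]* '"'
    m = STRING.match(s, i)
    return m.end() if m else -1


def parse_value(s, i):
    # value ::= string | object | "true" | "false" | "null"
    for literal in ("true", "false", "null"):
        if s.startswith(literal, i):
            return i + len(literal)
    j = parse_string(s, i)
    return j if j != -1 else parse_object(s, i)


def parse_pair(s, i):
    # pair ::= string ":" value
    j = parse_string(s, i)
    if j == -1 or not s.startswith(":", j):
        return -1
    return parse_value(s, j + 1)


def parse_object(s, i):
    # object ::= "{" pair ("," pair)* "}"
    if not s.startswith("{", i):
        return -1
    j = parse_pair(s, i + 1)
    while j != -1 and s.startswith(",", j):
        j = parse_pair(s, j + 1)
    if j == -1 or not s.startswith("}", j):
        return -1
    return j + 1


def matches_root(text):
    # root ::= object (the whole input must be consumed)
    return parse_object(text, 0) == len(text)


print(matches_root('{"a":{"b":true},"c":null}'))
print(matches_root('{}'))
print(is_phone_number("+41123456"))
```

Note that `{}` is rejected: the `object` rule requires at least one pair, which shows how closely the accepted strings track the grammar as written.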

For advanced grammar debugging, check out our debugging guide.

Automatic JSON Schema Grammar Conversion [Experimental]

Learn how to automatically create custom grammars for complex JSON objects in our documentation on JSON schema to grammar conversion.

Grammar Collection

We provide a collection of grammars in the examples/grammars folder, which are mostly identical to the grammars in the llama-cpp project. We try to keep these grammars up-to-date with the original project, though we cannot yet guarantee that all grammars from llama-cpp can be directly used in transformers-cfg.

Available grammars include:

Supported Models

See supported_models.yaml for the full list of supported models.

As a rule of thumb, any model that uses the same tokenizer as a supported model should also work.

If you find any model that is not supported, please open an issue or submit a pull request.

Citation

Please consider citing our work if you find the provided resources useful:

```bibtex
@inproceedings{geng-etal-2023-grammar,
    title        = {Grammar-Constrained Decoding for Structured {NLP} Tasks without Finetuning},
    author       = {Geng, Saibo  and Josifoski, Martin  and Peyrard, Maxime  and West, Robert},
    year         = 2023,
    month        = dec,
    booktitle    = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
    publisher    = {Association for Computational Linguistics},
    address      = {Singapore},
    url          = {https://aclanthology.org/2023.emnlp-main.674},
    editor       = {Bouamor, Houda  and Pino, Juan  and Bali, Kalika}
}
```

License

This project is licensed under the MIT License.

Acknowledgements

This project is derived from the torch-grammars project, which was itself derived from the llama-cpp project.