A common pattern is to allow repetitions of a pattern x up to N times.
While semantically correct, the syntax x? x? x?.... x? (with N repetitions) will result in extremely slow inference. Instead, you can write (x (x (x ... (x)?...)?)?)? (w/ N-deep nesting)
For example this will generate:
item item item item? item? item?
fn build_item_frequency(min: u16, max: u16) -> Vec<ProductionItem> {
let mut items = vec![];
let optional_count = cmp::max(max, min) - min;
for _ in 0..min {
items.push(ProductionItem::NonTerminal(
NonTerminalSymbol {
name: "item".to_string(),
},
RepetitionType::One,
))
}
for _ in 0..optional_count {
items.push(ProductionItem::NonTerminal(
NonTerminalSymbol {
name: "item".to_string(),
},
RepetitionType::ZeroOrOne,
))
}
items
}
Where as this will generate:
item item item (item (item (item)?)?)?
fn build_item_frequency(min: u16, max: u16) -> String {
let mut item_rule = String::from("");
let optional_count = cmp::max(max, min) - min;
for _ in 0..min {
item_rule.push_str("item");
}
for _ in 0..optional_count {
item_rule.push_str("(item");
}
for _ in 0..optional_count {
item_rule.push_str(")?");
}
item_rule
}
I've tested them both out, and the former is very slowly, and in some cases won't run complete at all. While the later doesn't seem to slow down at all. I don't know the status of this project, but it seems like it would be simple enough to implement. If you're not interested in handling it, I could take a look at it. But I figure handling this type of behavior might be more complex than it sounds and require fully understanding the engine you've built, and I'm a bit busy rn!
Anyways, looking forward to continuing using this project!
thanks for the comments! I haven't taken a look at this project in a bit, but I can think about some of the things you asked and/or open to PR request!
First of all, great project and very useful. It has a lot of promise and probably should be integrated with mistral-rs and other rust llm projects.
I noticed that it doesn't follow llama.cpp's method for nesting optional repetitions: https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
For example this will generate:
item item item item? item? item?
Where as this will generate:
item item item (item (item (item)?)?)?
I've tested them both out, and the former is very slowly, and in some cases won't run complete at all. While the later doesn't seem to slow down at all. I don't know the status of this project, but it seems like it would be simple enough to implement. If you're not interested in handling it, I could take a look at it. But I figure handling this type of behavior might be more complex than it sounds and require fully understanding the engine you've built, and I'm a bit busy rn!
Anyways, looking forward to continuing using this project!