JohnSnowLabs / langtest

Deliver safe & effective language models
http://langtest.org/
Apache License 2.0
492 stars 36 forks

Explore ReLM for a framework to evaluate LLMs #687

Open dcecchini opened 1 year ago

dcecchini commented 1 year ago

Source: https://arxiv.org/abs/2211.15458

"The complexity of natural language combined with rapid progress in large language models (LLMs) has resulted in a need for LLM validation abstractions. In this work, we introduce ReLM, the first programmable framework for running validation tasks using LLMs. ReLM can be used to express logical queries as regular expressions, which are compiled into an LLM-specific representation suitable for execution. Over memorization, gender bias, toxicity, and language understanding tasks, ReLM is able to execute queries up to 15× faster, with 2.5× less data, or in a manner that enables additional insights. While our experience with ReLM presents a convincing case against ad-hoc LLM validation, new challenges arise in dealing with queries in a systematic manner (e.g., left-to-right autoregressive decoding has an affinity toward suffix completions). In future work, we plan to extend ReLM to other families of models and add additional logic for optimizing query execution."

dcecchini commented 11 months ago

Adding the GitHub repo: https://github.com/mkuchnik/relm