JohnSnowLabs / langtest

Deliver safe & effective language models
http://langtest.org/
Apache License 2.0
492 stars 36 forks

Explore ReLM for a framework to evaluate LLMs #687

Open dcecchini opened 1 year ago

dcecchini commented 1 year ago

Source: https://arxiv.org/abs/2211.15458

"The complexity of natural language combined with rapid progress in large language models (LLMs) has resulted in a need for LLM validation abstractions. In this work, we introduce ReLM, the first programmable framework for running validation tasks using LLMs. ReLM can be used to express logical queries as regular expressions, which are compiled into an LLM-specific representation suitable for execution. Over memorization, gender bias, toxicity, and language understanding tasks, ReLM is able to execute queries up to 15× faster, with 2.5× less data, or in a manner that enables additional insights. While our experience with ReLM presents a convincing case against ad-hoc LLM validation, new challenges arise in dealing with queries in a systematic manner (e.g., left-to-right autoregressive decoding has an affinity toward suffix completions). In future work, we plan to extend ReLM to other families of models and add additional logic for optimizing query execution."

dcecchini commented 11 months ago

Adding the GitHub repo: https://github.com/mkuchnik/relm