ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Add LLM Text Encoder #3828

Closed jeffkinnison closed 11 months ago

jeffkinnison commented 11 months ago

Overview

This adds a text encoder for ECD that wraps a pretrained LLM and passes the model's hidden state downstream to the combiner. We reuse large portions of the LLM model type's utilities, refactored as standalone utility functions rather than LLM methods.
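To make the intended usage concrete, here is a rough sketch of an ECD config in which a text input feature uses an LLM as its encoder. The key names (`type: llm`, `base_model`, `adapter`), the column names `review` and `label`, and the model ID are assumptions for illustration only; none of them is confirmed by the description above.

```python
# Hypothetical ECD config: a text feature encoded by a pretrained LLM whose
# hidden state is handed to the combiner. All key names below are assumptions.
config = {
    "model_type": "ecd",
    "input_features": [
        {
            "name": "review",  # hypothetical dataset column
            "type": "text",
            "encoder": {
                "type": "llm",  # assumed encoder name for the new LLM encoder
                "base_model": "meta-llama/Llama-2-7b-hf",  # example model ID
                "adapter": {"type": "lora"},  # adapter packaged with the ECD model
            },
        }
    ],
    # A predictive (non-generative) output, the use case the encoder targets.
    "output_features": [{"name": "label", "type": "category"}],
    "combiner": {"type": "concat"},
}

encoder_cfg = config["input_features"][0]["encoder"]
print(encoder_cfg["type"], encoder_cfg["base_model"])
```

If these keys match the merged schema, the dict could be passed to `LudwigModel(config=config)` from `ludwig.api`; check the schema in the PR before relying on it.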

LLM Encoder

The encoder subclasses SequenceEncoder and implements custom behavior for working with LLMs, with pieces borrowed from LLM. Whereas the LLM model type is focused on text generation, the encoder is focused on 1) passing hidden state downstream for predictive tasks and 2) packaging the adapter with the rest of the ECD architecture. Major methods include

Refactored LLM Methods

The following methods have been moved from the LLM class into ludwig.utils.llm_utils:

github-actions[bot] commented 11 months ago

Unit Test Results

 6 files  ±0    6 suites  ±0   13m 56s :stopwatch: +2s
12 tests ±0    9 :heavy_check_mark: ±0    3 :zzz: ±0   0 :x: ±0
60 runs  ±0   42 :heavy_check_mark: ±0   18 :zzz: ±0   0 :x: ±0

Results for commit 0c8d45cd. ± Comparison against base commit fe3f0390.

:recycle: This comment has been updated with latest results.