ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Add batch size tuning for LLMs #3871

Closed: Infernaught closed this pull request 8 months ago

Infernaught commented 8 months ago

This PR extends Ludwig's batch size tuning functionality to LLMs.

For each candidate batch size, we generate synthetic data as follows. We consider three values:

1. the sum of the `max_sequence_length`s of the input feature and the output feature,
2. the `global_max_sequence_length`, and
3. the model's context length.

If (1) is the smallest, we generate synthetic inputs and outputs with their respective `max_sequence_length`s. If (2) is the smallest, we generate synthetic inputs and outputs each of length `global_max_sequence_length / 2 + 1`. If (3) is the smallest, we generate synthetic inputs and outputs each of length `context_len / 2 + 1`. A sketch of this rule follows.
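Here is a minimal sketch of that selection rule. The function and argument names are illustrative stand-ins, not the actual identifiers in this PR:

```python
# Hypothetical sketch of the length-selection rule above. The names
# input_msl, output_msl, global_msl, and context_len are illustrative,
# not the identifiers used in this PR.
def synthetic_sequence_lengths(
    input_msl: int, output_msl: int, global_msl: int, context_len: int
) -> tuple[int, int]:
    """Return (input_len, output_len) for synthetic batch-size-tuning data."""
    feature_sum = input_msl + output_msl
    smallest = min(feature_sum, global_msl, context_len)

    if smallest == feature_sum:
        # The per-feature limits bind: use each feature's max_sequence_length.
        return input_msl, output_msl
    if smallest == global_msl:
        # Split the global cap in half; the +1 makes the combined length
        # slightly exceed the cap, so tuning exercises the binding limit.
        half = global_msl // 2 + 1
        return half, half
    # Otherwise the model's context window binds.
    half = context_len // 2 + 1
    return half, half
```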

github-actions[bot] commented 8 months ago

Unit Test Results

6 files ±0   6 suites ±0   14m 16s :stopwatch: −1s
12 tests ±0   9 :heavy_check_mark: ±0   3 :zzz: ±0   0 :x: ±0
60 runs ±0   42 :heavy_check_mark: ±0   18 :zzz: ±0   0 :x: ±0

Results for commit de19d8cf. Comparison against base commit 138cc4a8.

:recycle: This comment has been updated with latest results.