ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Add batch size tuning for LLMs #3871

Closed: Infernaught closed this pull request 8 months ago

Infernaught commented 8 months ago

This PR extends Ludwig's batch size tuning functionality to LLMs.

For each candidate batch size, we generate synthetic data as follows. We consider three values:

1. the sum of the `max_sequence_length`s of the input feature and the output feature,
2. the `global_max_sequence_length`, and
3. the model's context length.

If (1) is the smallest, we generate synthetic inputs and outputs with their respective `max_sequence_length`s. If (2) is the smallest, we generate synthetic inputs and outputs each of length `global_max_sequence_length / 2 + 1`. If (3) is the smallest, we generate synthetic inputs and outputs each of length `context_len / 2 + 1`. A sketch of this rule follows.
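Here is a minimal sketch of that selection rule. The function and argument names are illustrative stand-ins, not the actual identifiers in this PR:

```python
# Hypothetical sketch of the length-selection rule above. The names
# input_msl, output_msl, global_msl, and context_len are illustrative,
# not the identifiers used in this PR.
def synthetic_sequence_lengths(
    input_msl: int, output_msl: int, global_msl: int, context_len: int
) -> tuple[int, int]:
    """Return (input_len, output_len) for synthetic batch-size-tuning data."""
    feature_sum = input_msl + output_msl
    smallest = min(feature_sum, global_msl, context_len)

    if smallest == feature_sum:
        # The per-feature limits bind: use each feature's max_sequence_length.
        return input_msl, output_msl
    if smallest == global_msl:
        # Split the global cap in half; the +1 makes the combined length
        # slightly exceed the cap, so tuning exercises the binding limit.
        half = global_msl // 2 + 1
        return half, half
    # Otherwise the model's context window binds.
    half = context_len // 2 + 1
    return half, half
```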

github-actions[bot] commented 8 months ago

Unit Test Results

6 files ±0   6 suites ±0   14m 16s :stopwatch: −1s
12 tests ±0   9 :heavy_check_mark: ±0   3 :zzz: ±0   0 :x: ±0
60 runs ±0   42 :heavy_check_mark: ±0   18 :zzz: ±0   0 :x: ±0

Results for commit de19d8cf. Comparison against base commit 138cc4a8.

:recycle: This comment has been updated with latest results.