fix: Handle missing and unexpected keys during LLMEncoder state dict load, part 2

ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Apache License 2.0

11.21k stars, 1.19k forks

Follow-up to #3841. When loading the state dict with fully-qualified parameter names, the post-load hook was unable to properly compare missing keys with local model parameter names. For example, when attempting to remove the key encoder.model.base_model.model.model.embed_tokens.weight from the missing keys list, the hook would only have access to the local parameter name model.base_model.model.model.embed_tokens.weight, so the comparison never matched.

This adds a parameter name prefix computation to the hook and updates the load_state_dict unit tests to include a wrapper for validating that fully-qualified names are loaded correctly.
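The prefix mismatch described above can be illustrated with a minimal, torch-free sketch. It is not Ludwig's actual implementation: the helper names infer_prefix and drop_expected_missing are hypothetical, and the second missing key (decoder.bias) is made up for the example. The sketch assumes the hook only knows local parameter names and must work out the prefix (here "encoder.") from the fully-qualified missing keys before filtering them.

```python
from typing import Iterable, List


def infer_prefix(missing_keys: Iterable[str], local_names: Iterable[str]) -> str:
    """Guess the prefix that turns a local parameter name into a fully-qualified key.

    Hypothetical helper: returns "" when the keys are already local names
    or when no key ends with a known local name.
    """
    local_names = list(local_names)
    for key in missing_keys:
        for name in local_names:
            if key == name:
                return ""
            if key.endswith("." + name):
                # Keep everything before the local name, including the trailing dot.
                return key[: -len(name)]
    return ""


def drop_expected_missing(missing_keys: List[str], local_names: Iterable[str]) -> None:
    """Remove keys that are expected to be missing, comparing fully-qualified names.

    The list is modified in place, mirroring how a PyTorch load_state_dict
    post-hook is allowed to edit the missing keys list it receives.
    """
    local_names = list(local_names)
    prefix = infer_prefix(missing_keys, local_names)
    expected = {prefix + name for name in local_names}
    missing_keys[:] = [key for key in missing_keys if key not in expected]


# Fully-qualified keys as reported when loading through a wrapping module.
missing = [
    "encoder.model.base_model.model.model.embed_tokens.weight",
    "decoder.bias",  # made-up key that should stay in the list
]
# Names as seen from inside the encoder itself.
local = ["model.base_model.model.model.embed_tokens.weight"]

drop_expected_missing(missing, local)
print(missing)
```

Without the prefix step, the local name would never equal the fully-qualified key and the embed_tokens entry would stay in the missing keys list. Matching on a suffix can be ambiguous when different submodules share parameter names, so a real hook may need a stricter way to determine the prefix.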

Unit Test Results

6 files ±0, 6 suites ±0, duration 13m 53s (-11s)
12 tests ±0: 9 passed ±0, 3 skipped ±0, 0 failed ±0
60 runs ±0: 42 passed ±0, 18 skipped ±0, 0 failed ±0

Results for commit 6345fae3. ± Comparison against base commit 8f9b5462.


fix: Handle missing and unexpected keys during LLMEncoder state dict load, part 2 #3843