ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

fix: Handle missing and unexpected keys during LLMEncoder state dict load, part 2 #3843

Closed jeffkinnison closed 11 months ago

jeffkinnison commented 11 months ago

Follow-up to #3841. When loading the state dict with fully-qualified parameter names, the post-load hook could not correctly compare the missing keys against the local model's parameter names. For example, when attempting to remove the key encoder.model.base_model.model.model.embed_tokens.weight from the missing keys list, the hook would only have access to the local parameter name model.base_model.model.model.embed_tokens.weight, so the two never matched.

This adds a parameter name prefix computation to the hook and updates the load_state_dict unit tests to include a wrapper for validating that fully-qualified names are loaded correctly.
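To illustrate the mismatch described above, here is a minimal, torch-free sketch of one way a prefix could be computed and applied. It is not the code from this PR: the helper names `infer_prefix` and `drop_expected_missing` are hypothetical, and the suffix-matching strategy is an assumption made for illustration only.

```python
def infer_prefix(missing_keys, local_names):
    """Guess the prefix that maps a local parameter name to a fully-qualified key.

    Assumption (not from the PR): the prefix is whatever precedes a local
    name at the end of some missing key, e.g. "encoder.".
    """
    for key in missing_keys:
        for name in local_names:
            if key == name or key.endswith("." + name):
                # Everything before the local name, including the trailing dot.
                return key[: len(key) - len(name)]
    return ""


def drop_expected_missing(missing_keys, local_names):
    """Remove, in place, missing keys that correspond to known local parameters."""
    prefix = infer_prefix(missing_keys, local_names)
    expected = {prefix + name for name in local_names}
    # Mutate the list in place, mirroring how a post-load hook edits its argument.
    missing_keys[:] = [k for k in missing_keys if k not in expected]
    return prefix


# Example using the key from the description; "decoder.head.weight" is made up.
missing = [
    "encoder.model.base_model.model.model.embed_tokens.weight",
    "decoder.head.weight",
]
local = ["model.base_model.model.model.embed_tokens.weight"]
prefix = drop_expected_missing(missing, local)
```

Without the prefix, the local name never equals the fully-qualified missing key, so nothing would be removed; with it, only keys unrelated to the local parameters remain in the list.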

github-actions[bot] commented 11 months ago

Unit Test Results

  6 files  ±0    6 suites  ±0   13m 53s :stopwatch: -11s
 12 tests ±0    9 :heavy_check_mark: ±0    3 :zzz: ±0  0 :x: ±0
 60 runs  ±0   42 :heavy_check_mark: ±0   18 :zzz: ±0  0 :x: ±0

Results for commit 6345fae3. ± Comparison against base commit 8f9b5462.

:recycle: This comment has been updated with latest results.