ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0
10.97k stars, 1.18k forks

[WIP] Gradual Unfreezing to mitigate catastrophic forgetting #3967

Open ethanreidel opened 3 months ago

ethanreidel commented 3 months ago

Adds the ability to gradually unfreeze (thaw) specific layers of a pre-trained model during training. The aim is to mitigate catastrophic forgetting and improve transfer learning. Currently works for the ECD architecture.

The user passes in two things: thaw_epochs (a list of integers) and layers_to_thaw (a 2D array of layer-name strings).

thaw_epochs:
  - 1
  - 2
layers_to_thaw:

TODO/potential issues:

test: [tests/ludwig/modules/test_gradual_unfreezing.py]

Any and all feedback is greatly appreciated. 👍

github-actions[bot] commented 3 months ago

Unit Test Results

6 files ±0, 6 suites ±0, 52m 7s :stopwatch: +22m 2s
2 990 tests -3: 2 966 :heavy_check_mark: -15, 23 :zzz: +11, 1 :x: +1
8 970 runs +5 941: 8 898 :heavy_check_mark: +5 893, 69 :zzz: +45, 3 :x: +3

For more details on these failures, see this check.

Results for commit d2ba5cb0. ± Comparison against base commit 606c732a.

ethanreidel commented 3 months ago

@skanjila @saad-palapa