ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Remove target_module hardcoding for Mixtral model #3853

Closed: arnavgarg1 closed this issue 1 month ago

arnavgarg1 commented 11 months ago

PEFT 0.7.1 doesn't have a default LoRA target module mapping entry for Mixtral: https://github.com/huggingface/peft/blob/8665e2b5719faa4e4b91749ddec09442927b53e0/src/peft/utils/constants.py#L49
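
A minimal sketch of the gap, assuming PEFT 0.7.1's `TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING` constant from the file linked above; the `model_type` value is the one reported by the Hugging Face Mixtral config:

```python
# Illustrative check: PEFT 0.7.1's default LoRA target-module mapping
# has no entry for Mixtral, so target_modules cannot be inferred.
from peft.utils.constants import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

model_type = "mixtral"
if model_type not in TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING:
    # The caller must supply target_modules explicitly.
    print(f"No default LoRA target modules for model_type={model_type!r}")
```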

For now, this PR (https://github.com/ludwig-ai/ludwig/pull/3852) adds a fallback mechanism that defaults to the q_proj and v_proj linear layers when LoRA is used with either Mixtral or Mixtral Instruct; a sketch of the idea follows below.
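
The sketch below shows the fallback in spirit only; the helper name `build_lora_config` is hypothetical and not Ludwig's actual internal API. It assumes PEFT's `LoraConfig` and the same mapping constant as above:

```python
# Hypothetical fallback: if PEFT has no default target-module entry for the
# model_type, hardcode q_proj and v_proj for Mixtral until PEFT ships a default.
from peft import LoraConfig
from peft.utils.constants import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING


def build_lora_config(model_type: str) -> LoraConfig:
    default_targets = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING.get(model_type)
    if default_targets is None and model_type == "mixtral":
        # Fallback for Mixtral / Mixtral Instruct (see #3852).
        default_targets = ["q_proj", "v_proj"]
    return LoraConfig(target_modules=default_targets)


config = build_lora_config("mixtral")
print(config.target_modules)
```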

Once PEFT adds an official default target module mapping for Mixtral, we should remove this hack.