Closed. anhuong closed this 7 months ago.
Assigned to Vassilis Vassiliadis.
`sft_trainer.py` uses `transformers.HfArgumentParser` to parse its command-line arguments into dataclass objects. We wouldn't want to change that by using a different parser just for this one command-line parameter.
Therefore, I think the most straightforward approach here is the following:
["q_proj", "v_proj"]
target_modules
must be a List[str]
. Otherwise --target_modules foo bar
ends up being ["foo"]
instead of ["foo", "bar"]
["all-linear"]
into "all-linear"
so that LORA receives a string instead of an array of strings.
--target_modules=all-linear
all-linear
is supported in peft 0.8.0+ so we should reflect that with a python package dependencyThe above will enable:
- passing `--target_modules=all-linear` on the command line
- calling the `train()` method with `target_modules=None`

while retaining the current behaviour of providing a list of layers for LoRA to attach to.
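A minimal sketch of that post-processing, assuming it lives in a `__post_init__` hook on the LoRA config dataclass (the class name `LoraArguments` and exact layout here are illustrative, not the actual fms-hf-tuning code):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoraArguments:
    # Stays a List[str] so HfArgumentParser keeps parsing
    # `--target_modules foo bar` into ["foo", "bar"].
    target_modules: List[str] = field(
        default_factory=lambda: ["q_proj", "v_proj"]
    )

    def __post_init__(self):
        # Collapse the sentinel ["all-linear"] into the plain string
        # "all-linear" that peft 0.8.0+ expects.
        if self.target_modules == ["all-linear"]:
            self.target_modules = "all-linear"
```

With something like this in place, `--target_modules=all-linear` and `--target_modules q_proj v_proj` both parse exactly as today, and the bare string is what gets forwarded to peft's `LoraConfig`.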
Request

`LoraConfig` can accept a `List` or a `str` for `target_modules`, as seen in the description below. This would be useful in order to support passing `"all-linear"` as an option instead of the specific attention layers.

Context
The fms-hf-tuning `LoraConfig` currently accepts only a `List`: `target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])`. This means that if one tries to pass `all-linear`, it is interpreted as a `List`.
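For illustration, a standalone repro of that parsing behaviour (a minimal sketch; `Args` here is illustrative, not the actual fms-hf-tuning dataclass):

```python
from dataclasses import dataclass, field
from typing import List

from transformers import HfArgumentParser


@dataclass
class Args:
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])


(args,) = HfArgumentParser(Args).parse_args_into_dataclasses(
    ["--target_modules", "all-linear"]
)
print(args.target_modules)  # ['all-linear'] -- a one-element list, not a string
```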
Example

I tried setting `target_modules: Union[List[str], str] = field(default_factory=lambda: ["q_proj", "v_proj"])`, however `"all-linear"` was still interpreted as a `List` instead of a `str`. This is likely due to the command-line parsing.

Note that `all-linear` is only supported in PEFT 0.8.0+, so the PEFT dependency must be upgraded accordingly.

Acceptance criteria