cmnfriend / O-LoRA

MIT License
149 stars 18 forks

Does it support QLoRA? #3

Open DumoeDss opened 1 year ago

DumoeDss commented 1 year ago

As the title says: does this support QLoRA?

cmnfriend commented 1 year ago

We haven't run experiments with Q-LoRA yet.

DumoeDss commented 1 year ago

> We haven't run experiments with Q-LoRA yet.

Are there any plans to support it? O-LoRA seems very practical, and could become a standard component the way QLoRA has.

cmnfriend commented 1 year ago

> > We haven't run experiments with Q-LoRA yet.
>
> Are there any plans to support it? O-LoRA seems very practical, and could become a standard component the way QLoRA has.

We could indeed look into that later ☺

DumoeDss commented 1 year ago

Following the LoRA code in this repo's peft, I tried porting the changes to the newer peft version's LoRA, and ran into the error below:

```
Traceback (most recent call last):
  File "/data/miniconda3/envs/axolotl/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/miniconda3/envs/axolotl/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/repos/axolotl/src/axolotl/cli/train.py", line 38, in <module>
    fire.Fire(do_cli)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/repos/axolotl/src/axolotl/cli/train.py", line 34, in do_cli
    train(cfg=parsed_cfg, cli_args=parsed_cli_args, dataset_meta=dataset_meta)
  File "/data/repos/axolotl/src/axolotl/train.py", line 124, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 1591, in train
    return inner_training_loop(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 1729, in _inner_training_loop
    model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare
    result = self._prepare_deepspeed(*args)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/accelerator.py", line 1662, in _prepare_deepspeed
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/__init__.py", line 171, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 304, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1212, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1473, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 484, in __init__
    self.initialize_gradient_partitioning_data_structures()
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 709, in initialize_gradient_partitioning_data_structures
    self.first_param_index_in_partition[i][partition_id] = self.get_first_param_index(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 684, in get_first_param_index
    if partition_id in self.param_to_partition_ids[group_id][param_id]:
KeyError: 0
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 70298) of binary: /data/miniconda3/envs/axolotl/bin/python
Traceback (most recent call last):
  File "/data/miniconda3/envs/axolotl/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 977, in launch_command
    multi_gpu_launcher(args)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 646, in multi_gpu_launcher
    distrib_run.run(args)
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
```
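For context, a `KeyError: 0` raised from DeepSpeed's `get_first_param_index` typically means an optimizer param group ended up with no usable parameters when ZeRO stage 1/2 built its partition bookkeeping. That is consistent with what the patch below addresses: when `r_sum` is 0, `nn.Linear(self.in_features, 0)` creates zero-sized weights. A minimal pure-Python sketch of the failure mode and a common workaround, using hypothetical `FakeParam`/`drop_empty_param_groups` helpers rather than DeepSpeed's actual code:

```python
class FakeParam:
    """Stand-in for a torch parameter; only numel() matters here."""
    def __init__(self, shape):
        self.shape = shape

    def numel(self):
        n = 1
        for d in self.shape:
            n *= d
        return n


def drop_empty_param_groups(param_groups):
    """Keep only optimizer param groups that contain at least one
    non-empty parameter. ZeRO stage 1/2 fills param_to_partition_ids
    per group, and a group whose parameters are all zero-sized leaves
    no entries, which later surfaces as KeyError: 0."""
    return [g for g in param_groups
            if any(p.numel() > 0 for p in g["params"])]


# A lora_A built as nn.Linear(in_features, 0) has a (0, in_features)
# weight, i.e. numel() == 0, so its whole group is effectively empty.
groups = [
    {"params": [FakeParam((0, 4096))]},   # degenerate r_sum == 0 layer
    {"params": [FakeParam((16, 4096))]},  # healthy rank-16 layer
]
print(len(drop_empty_param_groups(groups)))  # only the healthy group survives
```

This is only an illustration of the shape of the problem; fixing the layer dimensions (as below) removes the empty parameters at the source, which is the cleaner solution.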

After making the change below, the error disappears. Could this change have any side effects?

```python
# self.lora_A[adapter_name] = nn.Linear(self.in_features, r_sum, bias=False)  # modified
# self.lora_B[adapter_name] = nn.Linear(r_sum, self.out_features, bias=False)  # modified
self.lora_A[adapter_name] = nn.Linear(self.in_features, r_sum if r_sum > 0 else r, bias=False)  # modified
self.lora_B[adapter_name] = nn.Linear(r_sum if r_sum > 0 else r, self.out_features, bias=False)  # modified
```
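The guard matches how the shared matrices are sized: `r_sum` accumulates the ranks of previously learned adapters, so it is 0 for the very first task, and a width-0 `nn.Linear` would produce zero-sized weights. A sketch of just the dimension-selection logic, assuming this reading of `r_sum` (the helper name is illustrative, not from the repo):

```python
def lora_rank(r: int, previous_ranks: list[int]) -> int:
    """Width of the shared lora_A/lora_B matrices in an O-LoRA-style
    layer: the sum of ranks of all adapters seen so far, falling back
    to the current rank r when no previous adapters exist (r_sum == 0)."""
    r_sum = sum(previous_ranks)
    return r_sum if r_sum > 0 else r


# First task: no previous adapters, so the layer is built with rank r.
print(lora_rank(8, []))       # falls back to r = 8
# Later tasks: the matrices cover all accumulated previous ranks.
print(lora_rank(8, [8, 8]))   # 16
```

Under that assumption the fallback only changes behavior in the first-task case, where the original code built degenerate zero-width layers, so it should be harmless, though the maintainers would have to confirm it matches the intended O-LoRA bookkeeping.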