foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
Apache License 2.0

build(deps): Update accelerate requirement from <0.34,>=0.20.3 to >=0.20.3,<1.1 #366

Closed by dependabot[bot] 1 month ago

dependabot[bot] commented 1 month ago

Updates the requirements on accelerate to permit the latest version.
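
For reference, the version range changes as follows (shown here in requirements-file notation; the project may pin accelerate elsewhere, e.g. in pyproject.toml):

accelerate>=0.20.3,<0.34   # before
accelerate>=0.20.3,<1.1    # after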

Release notes

Sourced from accelerate's releases.

Accelerate 1.0.0 is here!

🚀 Accelerate 1.0 🚀

With accelerate 1.0, we are officially stating that the core parts of the API are now "stable" and ready for the future of distributed training and PyTorch. In these release notes, we first cover the major breaking changes so you can fix your code, followed by what is new specifically between 0.34.0 and 1.0.

To read more, check out our official blog here

Migration assistance

  • Passing dispatch_batches, split_batches, even_batches, and use_seedable_sampler to the Accelerator() should now be handled by creating an accelerate.utils.DataLoaderConfiguration() and passing that to the Accelerator() instead (Accelerator(dataloader_config=DataLoaderConfiguration(...))); see the migration sketch after this list
  • Accelerator().use_fp16 and AcceleratorState().use_fp16 have been removed; this should be replaced by checking accelerator.mixed_precision == "fp16"
  • Accelerator().autocast() no longer accepts a cache_enabled argument. Instead, an AutocastKwargs() instance should be used, which handles this flag (among others) and is passed to the Accelerator (Accelerator(kwargs_handlers=[AutocastKwargs(cache_enabled=True)]))
  • accelerate.utils.is_tpu_available should be replaced with accelerate.utils.is_torch_xla_available
  • accelerate.utils.modeling.shard_checkpoint should be replaced with split_torch_state_dict_into_shards from the huggingface_hub library
  • accelerate.tqdm.tqdm() no longer accepts True/False as the first argument; instead, main_process_only should be passed in as a named argument
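
A minimal sketch pulling these migrations together (illustrative only; the flag values below are placeholders, not recommendations):

# Sketch of a 1.0-style setup following the migrations above.
from accelerate import Accelerator
from accelerate.utils import AutocastKwargs, DataLoaderConfiguration, is_torch_xla_available

# dispatch_batches / split_batches / even_batches / use_seedable_sampler
# now live on DataLoaderConfiguration rather than on Accelerator():
dataloader_config = DataLoaderConfiguration(split_batches=True, even_batches=True)

# cache_enabled moves from autocast() into an AutocastKwargs handler:
accelerator = Accelerator(
    dataloader_config=dataloader_config,
    kwargs_handlers=[AutocastKwargs(cache_enabled=True)],
)

# use_fp16 is removed; check the mixed_precision string instead:
if accelerator.mixed_precision == "fp16":
    print("running in fp16")

# is_tpu_available is replaced by is_torch_xla_available:
print("XLA available:", is_torch_xla_available())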

Multiple Model DeepSpeed Support

After many requests, we finally have multiple model DeepSpeed support in Accelerate! (though it is still quite early). Read the full tutorial here; essentially:

When using multiple models, a DeepSpeed plugin should be created for each model (and, as a result, a separate config). A few examples are below:

Knowledge distillation

(Here only one model, the student, is trained, using the ZeRO-2 plugin, while the teacher is used only for inference, using the ZeRO-3 plugin)

from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

zero2_plugin = DeepSpeedPlugin(hf_ds_config="zero2_config.json")
zero3_plugin = DeepSpeedPlugin(hf_ds_config="zero3_config.json")

deepspeed_plugins = {"student": zero2_plugin, "teacher": zero3_plugin}

accelerator = Accelerator(deepspeed_plugins=deepspeed_plugins)

To select which plugin should be used at a given time (i.e., when calling prepare), we call `accelerator.state.select_deepspeed_plugin("name")`, where the first plugin is active by default:

accelerator.state.select_deepspeed_plugin("student")
student_model, optimizer, scheduler = ...
student_model, optimizer, scheduler, train_dataloader = accelerator.prepare(student_model, optimizer, scheduler, train_dataloader)

accelerator.state.select_deepspeed_plugin("teacher")  # This will automatically enable zero init
teacher_model = AutoModel.from_pretrained(...)
teacher_model = accelerator.prepare(teacher_model)
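
For context, a minimal sketch of what one distillation step might look like once both models are prepared; the KL-divergence loss, the temperature, and the HF-style batch dict are illustrative assumptions, not part of the release notes:

import torch
import torch.nn.functional as F

# Hypothetical distillation loop: the teacher (ZeRO-3, inference only)
# produces soft targets; only the student (ZeRO-2) is updated.
temperature = 2.0  # assumed hyperparameter
for batch in train_dataloader:
    with torch.no_grad():
        teacher_logits = teacher_model(**batch).logits  # assumes HF-style batch dict
    student_logits = student_model(**batch).logits
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    accelerator.backward(loss)  # use accelerator.backward, not loss.backward
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()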

... (truncated)

Commits
  • 5d71646 Fix merge
  • 97d413e Release: v1.0.0
  • 0c46240 Release: v1.0.0rc1
  • 873b623 Release: v1.0.0rc0
  • 127818f MNT Permission for PRs for GH token in stale.yml (#3112)
  • bcc13c0 typo of "scalar" instead of "scaler" (#3116)
  • d4d6b6e fix tip brackets typo (#3129)
  • 1077611 only move model to device when model is in cpu and target device is xpu (#3133)
  • cd93e35 🐛 [HotFix] Handle Profiler Activities Based on PyTorch Version (#3136)
  • e93b056 fix deprecated torch.cuda.amp.GradScaler FutureWarning for pytorch 2.4+ (#3...
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
github-actions[bot] commented 1 month ago

Thanks for making a pull request! 😃 One of the maintainers will review and advise on the next steps.

dependabot[bot] commented 1 month ago

Superseded by #372.