kubeflow / training-operator

Distributed ML Training and Fine-Tuning on Kubernetes
https://www.kubeflow.org/docs/components/training
Apache License 2.0
1.62k stars, 701 forks

Pin accelerate package version in trainer #2340

Closed gavrissh closed 3 hours ago

gavrissh commented 9 hours ago

What this PR does / why we need it: Pin the accelerate package version in the trainer. Without a pin, the latest accelerate release is installed, which causes the following error:

TypeError: Accelerator.__init__() got an unexpected keyword argument 'dispatch_batches'
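For readers unfamiliar with this failure mode, here is a minimal, standard-library-only sketch of how an unpinned dependency can produce this kind of `TypeError`. It does not import accelerate. `OldAccelerator` and `NewAccelerator` are hypothetical stand-ins for an older release that accepts `dispatch_batches` and a newer one that presumably no longer does; `accepts_keyword` and `try_construct` are illustrative helpers, not part of any library.

```python
import inspect


class OldAccelerator:
    """Hypothetical stand-in for an older release that accepts the keyword."""

    def __init__(self, dispatch_batches=None):
        self.dispatch_batches = dispatch_batches


class NewAccelerator:
    """Hypothetical stand-in for a newer release that dropped the keyword."""

    def __init__(self):
        pass


def accepts_keyword(cls, name):
    """Return True if cls can be constructed with the keyword argument `name`."""
    params = inspect.signature(cls).parameters
    has_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or has_var_kwargs


def try_construct(cls, **kwargs):
    """Construct cls with kwargs; return the TypeError message, or None on success."""
    try:
        cls(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Caller code written against the old signature keeps working with the old class...
old_error = try_construct(OldAccelerator, dispatch_batches=True)
# ...and fails once the dependency resolves to a release without the keyword.
new_error = try_construct(NewAccelerator, dispatch_batches=True)
```

Pinning the dependency keeps the caller and the library signature in step. A check like `accepts_keyword` is only a diagnostic aid for spotting the mismatch before the call fails.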

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged): Fixes #

gavrissh commented 9 hours ago

cc: @andreyvelich @johnugeorge @deepanker13

coveralls commented 9 hours ago

Pull Request Test Coverage Report for Build 12083866158

Totals

- Change from base Build 12071681323: 0.0%
- Covered Lines: 77
- Relevant Lines: 77

google-oss-prow[bot] commented 3 hours ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[sdk/python/OWNERS](https://github.com/kubeflow/training-operator/blob/master/sdk/python/OWNERS)~~ [andreyvelich]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.