Closed HarikrishnanBalagopal closed 1 month ago
Thanks for making a pull request! 😃 One of the maintainers will review and advise on the next steps.
@HarikrishnanBalagopal You created two PRs with slightly different changes. Should we look at this PR or PR #364? One references `model_args.output_dir` while the other references `training_args.output_dir`.
Also, if you have done any multi-GPU testing, please let us know, as the unit tests run on CPU.
@anhuong Please review https://github.com/foundation-model-stack/fms-hf-tuning/pull/364 Thank you.
> @HarikrishnanBalagopal You created two PRs with slightly different changes. Should we look at this PR or PR #364? One references `model_args.output_dir` while the other references `training_args.output_dir`.
@anhuong This is the change required for the `wca` branch. PTAL at #364 for the required change in the `main` branch.
> Also, if you have done any multi-GPU testing, please let us know, as the unit tests run on CPU.
Yes, I have tested the same command with 1, 4, and 8 GPUs multiple times to confirm that the race condition doesn't occur.
Each process will try to create the `output_dir` and skip creation if it already exists.
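For reference, a minimal sketch of the race-safe directory creation described above. This is not the PR's actual diff: the helper name `ensure_output_dir` is hypothetical, and it assumes the approach relies on Python's standard `os.makedirs(..., exist_ok=True)`.

```python
import os


def ensure_output_dir(output_dir: str) -> str:
    """Create output_dir if it is missing; do nothing if it already exists.

    Hypothetical helper for illustration only. Every rank calls it, so no
    rank depends on another rank having created the directory first.
    """
    # A check-then-create pattern (os.path.exists followed by os.mkdir) can
    # race: two processes may both see the directory as missing, and the
    # slower one then fails with FileExistsError. With exist_ok=True the
    # call succeeds whether or not another process created the directory.
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

Calling this from every process avoids having to designate a single rank to create the directory and synchronize the others behind it.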