Closed jarork closed 2 years ago
Just create a launch.json that looks like below.
{
"name": "train",
"type": "python",
"request": "launch",
"module": "accelerate.commands.launch",
"args": ["train.py"], // other args comes after train.py
"console": "integratedTerminal",
// "env": {"CUDA_LAUNCH_BLOCKING": "1"}
},
Hi @yuxinyuan, how can I specify other args after train.py? Could you give me an example?
Should it be like { ... "args": ["train.py"], ["--arg"] ... }?
It should be something like "args": ["train.py", "--flag1", "--arg1=hello"]
I'm trying to figure out how to debug the processes that pdsh kicks off on my second node (see #1114) with VS Code or pdb, anything really. Does anyone have any advice?
This command is from https://huggingface.co/docs/accelerate/usage_guides/megatron_lm. How do I specify it in the VS Code launch.json file?
accelerate launch --config_file megatron_gpt_config.yaml \
examples/by_feature/megatron_lm_gpt_pretraining.py \
--config_name "gpt2-large" \
--tokenizer_name "gpt2-large" \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--block_size 1024 \
--learning_rate 5e-5 \
--per_device_train_batch_size 24 \
--per_device_eval_batch_size 24 \
--num_train_epochs 5 \
--with_tracking \
--report_to "wandb" \
--output_dir "awesome_model"
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "python",
"request": "launch",
"module": "accelerate.commands.launch",
"args": [
"--config_file", "megatron_gpt_config.yaml",
"./examples/by_feature/megatron_lm_gpt_pretraining.py",
"--config_name ", "gpt2-large",
],
"console": "integratedTerminal",
"justMyCode": false
}
]
}
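For long commands like the Megatron one quoted above, typing every flag into "args" by hand is error-prone (a stray space inside a flag name is enough to break it). A minimal sketch of one way to do the conversion mechanically, assuming the command starts with accelerate launch and uses ordinary POSIX shell quoting; the helper name accelerate_command_to_args is made up for illustration and the command is shortened:

```python
import json
import shlex


def accelerate_command_to_args(command: str) -> list[str]:
    """Split an `accelerate launch ...` shell command into a launch.json "args" list."""
    # Backslash-newline continuations are joined first; shlex would otherwise
    # keep the newline as part of a token.
    tokens = shlex.split(command.replace("\\\n", " "))
    if tokens[:2] != ["accelerate", "launch"]:
        raise ValueError("expected a command starting with 'accelerate launch'")
    # Everything after `accelerate launch` goes into "args" unchanged, because
    # "module": "accelerate.commands.launch" already plays the role of the launcher.
    return tokens[2:]


# Shortened version of the command from the thread.
command = r"""accelerate launch --config_file megatron_gpt_config.yaml \
    examples/by_feature/megatron_lm_gpt_pretraining.py \
    --config_name "gpt2-large" \
    --with_tracking \
    --report_to "wandb" \
    --output_dir "awesome_model"
"""

args = accelerate_command_to_args(command)
print(json.dumps(args, indent=4))  # paste the printed list into "args"
```

Shell variables and globs are not expanded by this sketch, so anything like $VAR in the original command would need to be written out by hand.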
Hi, could you also explain how to specify arguments of accelerate launch, like --gpu_ids, please? (In other words, is it possible to configure a launch.json file representing CLI commands like accelerate launch --gpu_ids 1 main.py --batch_size 512 --epoch 1000?)
I think for those args they'd come before main.py -- perhaps you can try the following and see if it works?
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "python",
"request": "launch",
"module": "accelerate.commands.launch",
"args": [
"--gpu_ids", "1",
"main.py",
"--batch_size", "512",
"--epoch", "1000"
],
"console": "integratedTerminal",
"justMyCode": false
}
]
}
Yes, it works. Thank you!
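The ordering that worked here (launcher options first, then the script, then the script's own options) can be captured in a small generator. This is only a sketch: build_debug_config is a hypothetical helper, and the "type" value is copied from the configs in this thread (one later post uses "debugpy" instead).

```python
import json


def build_debug_config(script, launcher_args=(), script_args=(), name="accelerate launch"):
    """Build one launch.json configuration for debugging through accelerate."""
    return {
        "name": name,
        "type": "python",  # as in the thread; one later post uses "debugpy"
        "request": "launch",
        "module": "accelerate.commands.launch",
        # Order matters: options for `accelerate launch` come before the script,
        # options for the script itself come after it.
        "args": [*launcher_args, script, *script_args],
        "console": "integratedTerminal",
        "justMyCode": False,
    }


config = build_debug_config(
    "main.py",
    launcher_args=["--gpu_ids", "1"],
    script_args=["--batch_size", "512", "--epoch", "1000"],
)
launch_json = {"version": "0.2.0", "configurations": [config]}
print(json.dumps(launch_json, indent=4))
```

The printed JSON corresponds to accelerate launch --gpu_ids 1 main.py --batch_size 512 --epoch 1000 and could be saved as .vscode/launch.json.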
None of the above launch.json configurations works properly for me. It appears to run, but it hangs unresponsively without any terminal output or GPU usage. Is there any possible reason for this (e.g. a wrongly configured dataloader)?
Make sure to specify the GPUs, and be careful with the script path relative to the .vscode folder. Something like the following should hopefully work:
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "python",
"request": "launch",
"env": {"CUDA_VISIBLE_DEVICES":"0,1"},
"module": "accelerate.commands.launch",
"args": [
"--multi_gpu",
"--num_processes", "2",
"./PATH/main.py",
"--model_name_or_path", "HuggingFaceH4/zephyr-7b-beta",
"--seed", "42",
// ... further script arguments go here
],
"console": "integratedTerminal",
"justMyCode": false
}
]
}
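A wrong script path is easy to miss, so it can help to check it before starting the debugger. The sketch below assumes the debugger's working directory is the workspace folder (the folder that contains .vscode) and that the script is the first argument ending in .py; both are assumptions, and find_script and script_exists are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def find_script(args):
    """Return the first argument that looks like a Python script, or None."""
    return next((arg for arg in args if arg.endswith(".py")), None)


def script_exists(workspace_folder, args):
    """Check that the script named in "args" exists relative to the workspace folder."""
    script = find_script(args)
    return script is not None and (Path(workspace_folder) / script).is_file()


# Demonstration with a throwaway workspace that contains PATH/main.py.
with tempfile.TemporaryDirectory() as workspace:
    (Path(workspace) / "PATH").mkdir()
    (Path(workspace) / "PATH" / "main.py").touch()
    args = ["--multi_gpu", "--num_processes", "2", "./PATH/main.py", "--seed", "42"]
    found = script_exists(workspace, args)
    missing = script_exists(workspace, ["--multi_gpu", "./wrong/main.py"])

print(found, missing)
```

If a launcher option ever takes a value ending in .py, the first-match heuristic would pick the wrong argument, so treat it as a quick sanity check only.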
I still encounter a ModuleNotFoundError with the configuration below: /home/user/miniconda3/envs/ziplora/bin/python: Error while finding module specification for 'accelerate.command.launch' (ModuleNotFoundError: No module named 'accelerate.command')
{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "debugpy",
"request": "launch",
"env": {"CUDA_VISIBLE_DEVICES":"0"},
"module": "accelerate.command.launch",
"args": [
"--pretrained_model_name_or_path","CompVis/stable-diffusion-v1-4",
"--train_data_dir","assets/cat_statue",
"--placeholder_token","
This should be commands, not command
"module": "accelerate.commands.launch"
What if I have this in my original shell script: --dataset_name=$DATASET_PATH \ ? How should it be written in launch.json?
Check out my example above, where you can see for instance "--seed", "42",
so you can similarly add "--dataset_name", "DATASET_PATH", (with the actual path in place of DATASET_PATH).
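Since the entries in "args" are passed as literal strings, a shell variable such as $DATASET_PATH has to be resolved before it ends up there. One way to do that when converting a shell script is sketched below. The helper expand_shell_args and the example path are made up, and splitting before expanding treats the variable as if it were quoted in the shell.

```python
import os
import shlex


def expand_shell_args(fragment: str) -> list[str]:
    """Split a shell argument fragment and expand $VARS from the current environment."""
    # Split first, then expand each token, so a value containing spaces
    # stays a single argument.
    return [os.path.expandvars(token) for token in shlex.split(fragment)]


# Hypothetical value; in practice the variable would already be set in your shell.
os.environ["DATASET_PATH"] = "/data/my dataset"

expanded = expand_shell_args("--dataset_name=$DATASET_PATH --seed 42")
print(expanded)
```

VS Code also has its own ${env:DATASET_PATH} substitution syntax for launch.json, which may be a simpler route if the variable is set in the environment VS Code was started from.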
I have the same hanging issue: when I try to debug with accelerate, the call accelerator = Accelerator() hangs. Is there any solution to this?
I am a new user of accelerate. How should I configure VSCode in order to debug a program with accelerate? (E.g. accelerate launch train.py)