TencentAILabHealthcare / scBERT

GNU General Public License v3.0

finetune.py: problem with local_rank #53

Open martina811 opened 2 months ago

martina811 commented 2 months ago

Hello everyone, I am facing a problem with the script finetune.py. When I launch the command from the tutorial, `python -m torch.distributed.launch finetune.py --data_path DATA/Zheng68K.h5ad --model_path DATA/panglao_pretrain.pth`, I get the following error:

```
If your script expects `--local-rank` argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions
  warnings.warn(
usage: finetune.py [-h] [--local_rank LOCAL_RANK] [--bin_num BIN_NUM] [--gene_num GENE_NUM] [--epoch EPOCH] [--seed SEED] [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE] [--grad_acc GRAD_ACC] [--valid_every VALID_EVERY] [--pos_embed POS_EMBED] [--data_path DATA_PATH] [--model_path MODEL_PATH] [--ckpt_dir CKPT_DIR] [--model_name MODEL_NAME]
finetune.py: error: unrecognized arguments: --local-rank=0
```

Is there anybody else that faced this problem and fixed it?

Many thanks

Jiam1ng commented 1 month ago

Adding the `--use_env` argument when launching the script solved the local-rank error for me. For your command, it should be: `python -m torch.distributed.launch --use_env finetune.py --data_path DATA/Zheng68K.h5ad --model_path DATA/panglao_pretrain.pth`

In my situation, `--local_rank 0` also had to be appended to the end of the command to run the script.