How to train my model on multiple GPUs?

mit-han-lab / efficientvit

EfficientViT is a new family of vision models for efficient high-resolution vision.

Apache License 2.0

1.62k stars, 143 forks

How to train my model on multiple GPUs? #11

Closed yjtlab closed 9 months ago

yjtlab commented 11 months ago

I want to train this project on multiple GPUs. However, when I use this command:


```
python train_cls_model.py configs/cls/imagenet/b1.yaml \
    --data_provider.image_size "[128,160,192,224,256,288]" \
    --run_config.eval_image_size "[288]" \
    --path exp/cls/imagenet/b1_r288/
```
it fails with the following error:
![image](https://github.com/mit-han-lab/efficientvit/assets/129270219/cdeaa9d5-c295-4f01-87fa-d400a7a25637)
How can I resolve this problem? Thanks.

han-cai commented 11 months ago

Can you give me more details about the error?

I can successfully launch the training job using the following script:

torchpack dist-run -np 8 \
python train_cls_model.py configs/cls/imagenet/b1.yaml \
    --data_provider.image_size "[128,160,192,224,256,288]" \
    --run_config.eval_image_size "[288]" \
    --path exp/cls/imagenet/b1_r288/
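
For a machine with fewer than 8 GPUs, the same launch line can presumably be adapted by changing the value passed to `-np`. The sketch below only builds and prints that command for a chosen GPU count. It assumes `-np` is the number of worker processes and that the training script uses one process per GPU, which this thread does not confirm. The helper name `build_launch_command` and its parameters are hypothetical and not part of the repository.

```python
import shlex
from typing import List


def build_launch_command(
    num_gpus: int,
    config: str = "configs/cls/imagenet/b1.yaml",
    path: str = "exp/cls/imagenet/b1_r288/",
) -> List[str]:
    """Build the torchpack launch command from the thread for `num_gpus` processes.

    Hypothetical helper: assumes one training process per GPU.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchpack", "dist-run", "-np", str(num_gpus),
        "python", "train_cls_model.py", config,
        "--data_provider.image_size", "[128,160,192,224,256,288]",
        "--run_config.eval_image_size", "[288]",
        "--path", path,
    ]


if __name__ == "__main__":
    # Print a shell-ready command line for a 4-GPU machine (4 is an example value).
    print(shlex.join(build_launch_command(4)))
```

The printed line can be pasted into a shell from the repository root. Nothing is launched by the script itself.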