Media-Smart / vedastr

A scene text recognition toolbox based on PyTorch
Apache License 2.0

CSTR process group has not been initialized #60

Closed ELDIABLO4400 closed 3 years ago

ELDIABLO4400 commented 3 years ago

I tried to train the CSTR model with the command python tools/train.py configs/cstr.py and received this error:

Exception has occurred: RuntimeError
Default process group has not been initialized, please make sure to call init_process_group.
  File "/home/vedastr/vedastr/models/utils/conv_module.py", line 158, in forward
    x = self.norm(x)
  File "/home/vedastr/vedastr/models/bodies/feature_extractors/encoders/backbones/general_backbone.py", line 42, in forward
    x = layer(x)
  File "/home/vedastr/vedastr/models/bodies/component.py", line 20, in forward
    return self.component(x)
  File "/home/vedastr/vedastr/models/bodies/body.py", line 37, in forward
    out = component(inp)
  File "/home/vedastr/vedastr/models/model.py", line 21, in forward
    x = self.body(inputs[0])
  File "/home/vedastr/vedastr/runners/train_runner.py", line 115, in _train_batch
    pred = self.model((img,))
  File "/home/vedastr/vedastr/runners/train_runner.py", line 165, in __call__
    self._train_batch(img, label)
  File "/home/vedastr/tools/train.py", line 45, in main
    runner()
  File "/home/vedastr/tools/train.py", line 49, in <module>
    main()

Can you please tell me what I am doing wrong?

ChaseMonsterAway commented 3 years ago

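@ELDIABLO4400 Hi. We set the default batch normalization to SyncBN so that the model can be trained in distributed mode. SyncBN needs an initialized process group, so it raises this error when training runs in DataParallel mode, which is what tools/train.py does. You can use this command instead: tools/dist_train.py configs/cstr.py 4.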
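
If you would rather keep using tools/train.py on a single process, another option may be to switch the normalization type in the config from SyncBN to plain BN. The sketch below is only an illustration and not the project's own method: it assumes the config stores the norm layer as nested dicts of the form norm_cfg=dict(type='SyncBN') and that 'BN' is an accepted type name. The helper replace_sync_bn and the example_cfg fragment are hypothetical, so check both assumptions against your copy of configs/cstr.py.

```python
from copy import deepcopy


def replace_sync_bn(cfg, old="SyncBN", new="BN"):
    """Return a copy of a nested config with every dict whose 'type' is `old` set to `new`.

    Hypothetical helper: the 'type' key and the 'SyncBN'/'BN' names are
    assumptions about how the config describes its normalization layers.
    """
    cfg = deepcopy(cfg)  # leave the caller's config untouched

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == old:
                node["type"] = new
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Hypothetical config fragment, not copied from configs/cstr.py.
example_cfg = dict(
    type="Backbone",
    layers=[
        dict(type="ConvModule", in_channels=3, out_channels=32,
             norm_cfg=dict(type="SyncBN")),
        dict(type="ConvModule", in_channels=32, out_channels=64,
             norm_cfg=dict(type="SyncBN")),
    ],
)

single_process_cfg = replace_sync_bn(example_cfg)
```

Plain batch normalization computes its statistics per device instead of across all processes, so results can differ from a SyncBN run, especially with small per-GPU batch sizes.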