cuikf opened this issue 2 years ago
@cuikf This is most likely a bottleneck at the data-providing stage. The throughput of the data provider is usually limited by the CPU and memory of the training machine (that's why training with a large batch size often requires a high-performance machine). You can try reducing the batch size to alleviate this issue.
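To check whether data loading really is the bottleneck, one option is to time the loader in isolation. Below is a minimal sketch in plain PyTorch (not tied to this repo); the `TensorDataset` and its shape are placeholders standing in for the real SiamFC++ training data. If the samples/s measured here is far below what the GPU consumes per second, the pipeline is CPU-bound:

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; swap in the actual training dataset object.
dataset = TensorDataset(torch.randn(4096, 3, 127, 127))

for batch_size in (32, 64, 128):
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=8)
    start = time.time()
    n_batches = 0
    for _ in loader:  # iterate once to measure pure data-loading speed
        n_batches += 1
    elapsed = time.time() - start
    print(f"batch_size={batch_size}: "
          f"{n_batches / elapsed:.1f} batches/s, "
          f"{n_batches * batch_size / elapsed:.0f} samples/s")
```

If throughput stays roughly constant in samples/s regardless of batch size, the CPU-side pipeline is saturated, which matches the periodic GPU-Util drops described below.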
@MARMOTatZJU Thank u!I'll try it!
When I run `python ./main/train.py --config 'experiments/siamfcpp/train/lasot/siamfcpp_alexnet-trn.yaml'`, GPU-Util repeatedly climbs to about 89%, drops to 0% a few seconds later, then rises back to 89% again.
In the .yaml:

```yaml
num_processes: 2
minibatch: &MINIBATCH 128
num_workers: 64
```
I'm very confused.