vqdang / hover_net

Simultaneous Nuclear Instance Segmentation and Classification in H&E Histology Images.
MIT License

GPU detection and related issues with tensors #284

Closed ale2sia-g closed 1 month ago

ale2sia-g commented 2 months ago

Hi, I am new to deep learning and I am running into an error while trying to train my network. When I run python run_train.py --gpu='0', no GPU is detected (Detect #GPUS: 0), and I also get:

  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\utils\data\dataloader.py", line 230, in __init__
    batch_sampler = BatchSampler(sampler, batch_size, drop_last)
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\utils\data\sampler.py", line 198, in __init__
    "but got batch_size={}".format(batch_size))
ValueError: batch_size should be a positive integer value, but got batch_size=0

I think this is caused by no GPU being detected. I tried other values (gpu='1', etc.), but the error persists. I then ran python run_train.py without specifying a GPU; this time the GPU is detected, but a tensor-related error comes up:

(hovernet) C:\Users\aless\Desktop\tesi\hover_net_monuseg>python run_train.py
Detect #GPUS: 1
Using manual seed: 10
Dataset train: 637
Dataset valid: 147
Traceback (most recent call last):
  File "run_train.py", line 305, in <module>
    trainer.run()
  File "run_train.py", line 289, in run
    phase_info, engine_opt, save_path, prev_log_dir=prev_save_path
  File "run_train.py", line 184, in run_once
    net_desc = net_info["desc"]()
  File "C:\Users\aless\Desktop\tesi\hover_net_monuseg\models\hovernet\opt.py", line 35, in <lambda>
    freeze=True, mode=mode
  File "C:\Users\aless\Desktop\tesi\hover_net_monuseg\models\hovernet\net_desc.py", line 152, in create_model
    return HoVerNet(mode=mode, **kwargs)
  File "C:\Users\aless\Desktop\tesi\hover_net_monuseg\models\hovernet\net_desc.py", line 90, in __init__
    ("tp", create_decoder_branch(ksize=ksize, out_ch=nr_types)),
  File "C:\Users\aless\Desktop\tesi\hover_net_monuseg\models\hovernet\net_desc.py", line 67, in create_decoder_branch
    ("conv", nn.Conv2d(64, out_ch, 1, stride=1, padding=0, bias=True),),
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\modules\conv.py", line 408, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\modules\conv.py", line 83, in __init__
    self.reset_parameters()
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\modules\conv.py", line 86, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\init.py", line 381, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\init.py", line 350, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "C:\Users\aless\anaconda3\envs\hovernet\lib\site-packages\torch\nn\init.py", line 282, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0
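
A possible reading of the traceback above, offered as a sketch rather than a confirmed diagnosis: an nn.Conv2d weight has shape (out_channels, in_channels // groups, kH, kW), so tensor[0][0] can only fail with "size 0" on dimension 0 when out_channels is 0. The failing layer is built with out_ch=nr_types, which suggests nr_types reached the model as 0. The helper names below (conv2d_weight_shape, validate_nr_types) are hypothetical and not part of hover_net, and treating None as "no type branch" is an assumption about the repository.

```python
def conv2d_weight_shape(in_ch, out_ch, ksize, groups=1):
    """Shape of an nn.Conv2d weight: (out_ch, in_ch // groups, kH, kW)."""
    return (out_ch, in_ch // groups, ksize, ksize)


def validate_nr_types(nr_types):
    """Hypothetical guard to run before the model is built.

    Assumption: None means "no type-classification branch"; any other
    value becomes the channel count of the "tp" branch and must be >= 1.
    """
    if nr_types is None:
        return None
    if isinstance(nr_types, bool) or not isinstance(nr_types, int) or nr_types < 1:
        raise ValueError(
            "nr_types must be None or a positive integer, got {!r}".format(nr_types)
        )
    return nr_types


# The 1x1 convolution from the traceback, with out_ch = nr_types = 0:
shape = conv2d_weight_shape(in_ch=64, out_ch=0, ksize=1)
# The first dimension is 0, so indexing weight[0] has nothing to return.
assert shape[0] == 0
```

If this applies, the number of nuclear types set in the training configuration is the first thing to check, particularly for a dataset that carries segmentation labels only and no type labels.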

I really do not know how to handle this or what the cause could be. Could someone please help?
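
On the first error (batch_size=0), one hedged guess: the Windows cmd.exe shell does not strip single quotes, so --gpu='0' hands the script the literal text '0' including the quotes. If the script copies that value into CUDA_VISIBLE_DEVICES and multiplies a per-GPU batch size by the detected GPU count (both are assumptions about run_train.py, not verified here), an unparsable device id would give 0 GPUs and therefore batch_size=0. The sketch below uses only the standard library; parse_gpu_ids and total_batch_size are hypothetical helpers, not hover_net functions.

```python
def parse_gpu_ids(raw):
    """Turn a --gpu value such as "0", "0,1" or "'0'" into a list of ints.

    Stray quotes left in place by the shell are stripped first.
    """
    cleaned = raw.strip().strip("'\"")
    parts = [p.strip() for p in cleaned.split(",") if p.strip()]
    for part in parts:
        if not part.isdecimal():
            raise ValueError("invalid GPU id {!r} in {!r}".format(part, raw))
    return [int(p) for p in parts]


def total_batch_size(per_gpu, nr_gpus):
    """Fail early with a clear message instead of passing 0 to the DataLoader."""
    if nr_gpus < 1:
        raise RuntimeError("no GPU visible; check the --gpu value and the CUDA setup")
    return per_gpu * nr_gpus


# The quoted value as cmd.exe would pass it through:
gpu_ids = parse_gpu_ids("'0'")
batch = total_batch_size(per_gpu=8, nr_gpus=len(gpu_ids))
```

If this guess is right, running python run_train.py --gpu=0 without the quotes (or with double quotes) from the Windows prompt should report one detected GPU.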