jbohnslav / deepethogram

[GUI] GPU not found? pytorch_lightning.utilities.exceptions.MisconfigurationException #156

Open karinmcode opened 11 months ago

karinmcode commented 11 months ago

Hi, I have an Nvidia Quadro P4000 GPU, which Task Manager lists as "GPU 0", but pytorch_lightning does not seem to find it.

ERROR

  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\deepethogram\base.py", line 386, in get_trainer_from_cfg
    trainer = pl.Trainer(gpus=[cfg.compute.gpu_id],
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\connectors\env_vars_connector.py", line 38, in insert_env_defaults
    return fn(self, **kwargs)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 426, in __init__
    gpu_ids, tpu_cores = self._parse_devices(gpus, auto_select_gpus, tpu_cores)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1543, in _parse_devices
    gpu_ids = device_parser.parse_gpu_ids(gpus)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\utilities\device_parser.py", line 89, in parse_gpu_ids
    return _sanitize_gpu_ids(gpus)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\utilities\device_parser.py", line 151, in _sanitize_gpu_ids
    raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: You requested GPUs: [0]
 But your machine only has: []
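
The empty list in "But your machine only has: []" suggests that PyTorch itself found no CUDA devices, whatever Task Manager shows. A minimal sketch for checking this directly in the same conda environment follows; `cuda_report` is a hypothetical helper name, and it uses only standard `torch.cuda` calls:

```python
def cuda_report():
    """Collect what PyTorch can see, as a plain dict that is easy to paste into an issue."""
    report = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        # PyTorch is not importable in this environment; nothing more to check.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version this PyTorch build was compiled against (None for CPU-only builds).
    report["built_with_cuda"] = torch.version.cuda
    # False when the driver is missing, too old, or the build is CPU-only.
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    report["device_names"] = [
        torch.cuda.get_device_name(i) for i in range(report["device_count"])
    ]
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False and `device_count` is 0 here, the problem is probably in the driver or PyTorch install rather than in the `gpu_id: 0` setting.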

ALL COMMAND WINDOW MESSAGES

(deg38) C:\Users\labadmin>python -m deepethogram
[2023-10-22 13:22:57,674] INFO [deepethogram.gui.main.setup_gui_cfg:1268] CWD: C:\Users\labadmin
[2023-10-22 13:22:57,674] INFO [deepethogram.gui.main.setup_gui_cfg:1269] Configuration used: split:
  reload: true
  file: null
  train_val_test:
  - 0.8
  - 0.2
  - 0.0
compute:
  fp16: false
  num_workers: 8
  batch_size: auto
  min_batch_size: 8
  max_batch_size: 512
  distributed: false
  gpu_id: 0
  dali: false
  metrics_workers: 0
reload:
  overwrite_cfg: false
  latest: false
notes: null
log:
  level: info
run:
  type: gui
label_view_width: 31
control_arrow_jump: 31
vertical_arrow_jump: 3
cmap: deepethogram
unlabeled_alpha: 0.1
prediction_opacity: 0.2

[2023-10-22 13:23:14,047] INFO [deepethogram.gui.main.initialize_project:1017] cwd: C:\Users\labadmin
[2023-10-22 13:23:14,055] INFO [deepethogram.projects.convert_config_paths_to_absolute:1135] cwd in absolute: C:\Users\labadmin
[2023-10-22 13:23:14,063] INFO [deepethogram.projects.convert_config_paths_to_absolute:1178] after absolute: {'class_names': ['background', 'resting', 'adjusting', 'walking', 'running', 'grooming', 'sniffing', 'sound'], 'config_file': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\project_config.yaml', 'data_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\DATA', 'labeler': None, 'model_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models', 'name': 'test2p', 'path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram', 'pretrained_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models\\pretrained_models'}
[2023-10-22 13:23:14,063] INFO [deepethogram.gui.main.initialize_project:1019] cwd: C:\Users\labadmin
[2023-10-22 13:23:14,130] INFO [deepethogram.gui.main.initialize_project:1021] loaded project configuration: split:
  reload: true
  file: null
  train_val_test:
  - 0.8
  - 0.2
  - 0.0
compute:
  fp16: false
  num_workers: 8
  batch_size: 32
  min_batch_size: 8
  max_batch_size: 512
  distributed: false
  gpu_id: 0
  dali: false
  metrics_workers: 0
reload:
  overwrite_cfg: false
  latest: false
notes: null
log:
  level: info
run:
  type: gui
  model: null
  dir: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\gui_logs\231022_132314
label_view_width: 31
control_arrow_jump: 31
vertical_arrow_jump: 3
cmap: deepethogram
unlabeled_alpha: 0.1
prediction_opacity: 0.2
postprocessor:
  type: min_bout_per_behavior
  min_bout_length: 1
augs:
  LR: 0.5
  UD: 0.0
  brightness: 0.25
  contrast: 0.1
  crop_size: null
  degrees: 10
  grayscale: 0.5
  hue: 0.1
  normalization:
    'N': 65286144
    mean:
    - 0.38076755964346387
    - 0.38076755964346387
    - 0.38076755964346387
    std:
    - 0.2487480534020329
    - 0.2487480534020329
    - 0.2487480534020329
  pad: null
  random_resize: false
  resize:
  - 224
  - 224
  saturation: 0.1
project:
  class_names:
  - background
  - resting
  - adjusting
  - walking
  - running
  - grooming
  - sniffing
  - sound
  config_file: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\project_config.yaml
  data_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\DATA
  labeler: null
  model_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models
  name: test2p
  path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram
  pretrained_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models
sequence:
  filter_length: 15
train:
  loss_weight_exp: 1.0

[2023-10-22 13:23:14,138] INFO [deepethogram.gui.main.initialize_project:1022] cwd: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\gui_logs\231022_132314
[2023-10-22 13:23:14,509] INFO [deepethogram.gui.main.project_loaded_buttons:175] Number finalized labels: 4
[2023-10-22 13:23:20,501] INFO [deepethogram.gui.main.initialize_video:226] Record for loaded video: {'flow': None, 'label': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\DATA\\CAM1_m532_211115_001\\CAM1_m532_211115_001_labels.csv', 'output': None, 'rgb': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\DATA\\CAM1_m532_211115_001\\CAM1_m532_211115_001.mp4', 'keypoint': None, 'key': 'CAM1_m532_211115_001'}
[2023-10-22 13:23:40,828] INFO [deepethogram.gui.main.get_selected_models:1136] {'flow_generator': {'no pretrained weights': None, '200221_115158_TinyMotionNet': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models\\pretrained_models\\200221_115158_TinyMotionNet\\checkpoint.pt'}, 'feature_extractor': {'no pretrained weights': None, '200415_125824_hidden_two_stream_kinetics_degf': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models\\pretrained_models\\200415_125824_hidden_two_stream_kinetics_degf\\checkpoint.pt'}, 'sequence': {'': None}}
[2023-10-22 13:23:40,832] INFO [deepethogram.gui.main.flow_train:343] flow_train called with args: ['python', '-m', 'deepethogram.flow_generator.train', 'project.path=H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram', 'reload.weights=H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models\\pretrained_models\\200221_115158_TinyMotionNet\\checkpoint.pt']
[2023-10-22 13:23:50,134] INFO [deepethogram.projects.convert_config_paths_to_absolute:1135] cwd in absolute: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\gui_logs\231022_132314
[2023-10-22 13:23:50,142] INFO [deepethogram.projects.convert_config_paths_to_absolute:1178] after absolute: {'class_names': ['background', 'resting', 'adjusting', 'walking', 'running', 'grooming', 'sniffing', 'sound'], 'config_file': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\project_config.yaml', 'data_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\DATA', 'labeler': None, 'model_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models', 'name': 'test2p', 'path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram', 'pretrained_path': 'H:\\My Drive\\Research\\Schneider lab\\Paper\\Karin paper version 230911\\Reviewers requests\\Fig 5 Automated behavior classification\\DeepEthogram\\test2p_deepethogram\\models\\pretrained_models'}
[2023-10-22 13:23:50,203] INFO [__main__.flow_generator_train:54] args: C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\deepethogram\flow_generator\train.py project.path=H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram reload.weights=H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models\200221_115158_TinyMotionNet\checkpoint.pt
[2023-10-22 13:23:50,221] INFO [__main__.flow_generator_train:62] configuration used ~~~~~
[2023-10-22 13:23:50,237] INFO [__main__.flow_generator_train:63] split:
  reload: true
  file: null
  train_val_test:
  - 0.8
  - 0.2
  - 0.0
compute:
  fp16: false
  num_workers: 8
  batch_size: 32
  min_batch_size: 8
  max_batch_size: 512
  distributed: false
  gpu_id: 0
  dali: false
  metrics_workers: 0
reload:
  overwrite_cfg: false
  latest: false
  weights: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models\200221_115158_TinyMotionNet\checkpoint.pt
notes: null
log:
  level: info
augs:
  brightness: 0.25
  contrast: 0.1
  hue: 0.1
  saturation: 0.1
  color_p: 0.5
  grayscale: 0.5
  crop_size: null
  resize:
  - 224
  - 224
  dali: false
  random_resize: false
  pad: null
  LR: 0.5
  UD: 0.0
  degrees: 10
  normalization:
    'N': 65286144
    mean:
    - 0.38076755964346387
    - 0.38076755964346387
    - 0.38076755964346387
    std:
    - 0.2487480534020329
    - 0.2487480534020329
    - 0.2487480534020329
train:
  lr: 0.0001
  scheduler: plateau
  num_epochs: 10
  steps_per_epoch:
    train: 1000
    val: 200
    test: 20
  min_lr: 5.0e-07
  stopping_type: learning_rate
  milestones:
  - 50
  - 100
  - 150
  - 200
  - 250
  - 300
  weight_loss: true
  patience: 3
  early_stopping_begins: 0
  viz_metrics: true
  viz_examples: 10
  reduction_factor: 0.1
  loss_weight_exp: 1.0
  loss_gamma: 1.0
  label_smoothing: 0.05
  oversampling_exp: 0.0
  regularization:
    style: l2_sp
    alpha: 1.0e-05
    beta: 0.001
flow_generator:
  type: flow_generator
  flow_loss: MotionNet
  flow_max: 10
  input_images: 11
  flow_sparsity: false
  smooth_weight_multiplier: 1.0
  sparsity_weight: 0.0
  loss: MotionNet
  max: 5
  n_rgb: 11
  arch: TinyMotionNet
  weights: pretrained
cmap: deepethogram
control_arrow_jump: 31
label_view_width: 31
postprocessor:
  min_bout_length: 1
  type: min_bout_per_behavior
prediction_opacity: 0.2
project:
  class_names:
  - background
  - resting
  - adjusting
  - walking
  - running
  - grooming
  - sniffing
  - sound
  config_file: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\project_config.yaml
  data_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\DATA
  labeler: null
  model_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models
  name: test2p
  path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram
  pretrained_path: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models
run:
  type: train
  model: flow_generator
  dir: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers
    requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\231022_132350_flow_generator_train
sequence:
  filter_length: 15
unlabeled_alpha: 0.1
vertical_arrow_jump: 3

[2023-10-22 13:23:50,719] INFO [__main__.flow_generator_train:67] Total trainable params: 1,951,784
[2023-10-22 13:23:52,631] INFO [deepethogram.projects.get_weightfile_from_cfg:1068] loading pretrained weights: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models\200221_115158_TinyMotionNet\checkpoint.pt
reloading weights...
[2023-10-22 13:23:52,631] INFO [deepethogram.utils.load_state:341] loading from checkpoint file H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models\200221_115158_TinyMotionNet\checkpoint.pt...
[2023-10-22 13:23:52,767] INFO [__main__.get_metrics:364] key metric is SSIM
[2023-10-22 13:23:52,815] INFO [deepethogram.data.augs.get_gpu_transforms:246] GPU transforms: {'train': Sequential(
  (0): ToFloat()
  (1): VideoSequential(
    (RandomHorizontalFlip_0): RandomHorizontalFlip(p=0.5, p_batch=1.0, same_on_batch=False)
    (RandomRotation_1): RandomRotation(degrees=10, p=0.5, p_batch=1.0, same_on_batch=False, resample=bilinear, align_corners=True)
    (ColorJitter_2): ColorJitter(brightness=0.25, contrast=0.1, saturation=0.1, hue=0.1, p=0.5, p_batch=1.0, same_on_batch=False)
    (RandomGrayscale_3): RandomGrayscale(p=0.5, p_batch=1.0, same_on_batch=False)
  )
  (2): NormalizeVideo()
  (3): StackClipInChannels()
), 'val': Sequential(
  (0): ToFloat()
  (1): NormalizeVideo()
  (2): StackClipInChannels()
), 'test': Sequential(
  (0): ToFloat()
  (1): NormalizeVideo()
  (2): StackClipInChannels()
), 'denormalize': Sequential(
  (0): UnstackClip()
  (1): DenormalizeVideo()
)}
[2023-10-22 13:23:52,815] INFO [deepethogram.base.__init__:95] scheduler mode: min
C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\torch\cuda\__init__.py:138: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 9010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at C:\cb\pytorch_1000000000000\work\c10\cuda\CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
[2023-10-22 13:23:53,351] INFO [deepethogram.losses.get_regularization_loss:204] Regularization: L2_SP. Pretrained file: H:\My Drive\Research\Schneider lab\Paper\Karin paper version 230911\Reviewers requests\Fig 5 Automated behavior classification\DeepEthogram\test2p_deepethogram\models\pretrained_models\200221_115158_TinyMotionNet\checkpoint.pt alpha: 1e-05 beta: 0.001
[2023-10-22 13:23:53,469] INFO [deepethogram.flow_generator.losses.__init__:178] Using MotionNet Loss with settings: smooth_weights: [0.01, 0.02, 0.04, 0.08, 0.16] flow_sparsity: False sparsity_weight: 0.0
Traceback (most recent call last):
  File "C:\Users\labadmin\.conda\envs\deg38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\labadmin\.conda\envs\deg38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\deepethogram\flow_generator\train.py", line 374, in <module>
    flow_generator_train(cfg)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\deepethogram\flow_generator\train.py", line 78, in flow_generator_train
    trainer = get_trainer_from_cfg(cfg, lightning_module, stopper)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\deepethogram\base.py", line 386, in get_trainer_from_cfg
    trainer = pl.Trainer(gpus=[cfg.compute.gpu_id],
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\connectors\env_vars_connector.py", line 38, in insert_env_defaults
    return fn(self, **kwargs)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 426, in __init__
    gpu_ids, tpu_cores = self._parse_devices(gpus, auto_select_gpus, tpu_cores)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1543, in _parse_devices
    gpu_ids = device_parser.parse_gpu_ids(gpus)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\utilities\device_parser.py", line 89, in parse_gpu_ids
    return _sanitize_gpu_ids(gpus)
  File "C:\Users\labadmin\.conda\envs\deg38\lib\site-packages\pytorch_lightning\utilities\device_parser.py", line 151, in _sanitize_gpu_ids
    raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: You requested GPUs: [0]
 But your machine only has: []
[2023-10-22 13:23:54,958] INFO [deepethogram.gui.main.flow_train:353] Training finished. If you see error messages above, training did not complete successfully.
[2023-10-22 13:23:54,958] INFO [deepethogram.gui.main.flow_train:359] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2023-10-22 13:23:55,157] INFO [deepethogram.gui.main.project_loaded_buttons:175] Number finalized labels: 4
[2023-10-22 13:24:20,776] INFO [deepethogram.gui.main.log_idle:151] User has been idle for 60.0 seconds...
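
The `UserWarning` in the log above ("The NVIDIA driver on your system is too old (found version 9010)") looks like the likely cause of the empty GPU list. Assuming the number follows CUDA's usual integer encoding (1000 * major + 10 * minor), it can be decoded as sketched below; `decode_cuda_version` is a hypothetical helper, not part of PyTorch or deepethogram:

```python
def decode_cuda_version(encoded):
    """Split a CUDA-style integer version (1000 * major + 10 * minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


# Value taken from the UserWarning in the log above.
major, minor = decode_cuda_version(9010)
print(f"Installed driver supports CUDA up to {major}.{minor}")
```

Under that assumption the installed driver only supports CUDA up to 9.1. That is presumably older than the CUDA version the installed PyTorch build expects, so updating the NVIDIA driver, as the warning suggests, is the first thing to try.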