yiyulics / CSEC

:fire: [CVPR 2024] Color Shift Estimation-and-Correction for Image Enhancement
https://arxiv.org/abs/2405.17725

Some errors on Windows... #7

Closed zelenooki87 closed 1 month ago

zelenooki87 commented 1 month ago

```
python src/test.py checkpoint_path=pretrained/csec.ckpt
Seed set to 233
C:\Users\Miki\Pictures\csec\src\test.py:15: UserWarning: The version_base parameter is not specified. Please specify a compatability version level, or None. Will assume defaults for version 1.1
  @hydra.main(config_path="config", config_name="config")
C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\hydra\_internal\defaults_list.py:251: UserWarning: In 'config': Defaults list is missing `self`. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
  warnings.warn(msg, UserWarning)
C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\hydra\_internal\hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default. See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Check runtime config: use "C:\Users\Miki\Pictures\csec\src\config\runtime\csecnet.default.yaml" as template.
Running config: {'aug': {'crop': False, 'downsample': [512, 512], 'h-flip': True, 'v-flip': True}, 'train_ds': {'class': 'img_dataset', 'name': 'lcdp_data.train', 'input': ['/path/to/input/*'], 'GT': ['/path/to/gt/*']}, 'test_ds': {'class': 'img_dataset', 'name': 'lcdp_data.test', 'input': ['C:\\Users\\Miki\\Pictures\\csec\\test\\*'], 'GT': ['none']}, 'valid_ds': {'class': 'img_dataset', 'name': 'lcdp_data.valid', 'input': ['/path/to/valid-input/*'], 'GT': ['/path/to/valid-gt/*']}, 'runtime': {'bilateral_upsample_net': {'hist_unet': {'n_bins': 8, 'hist_as_guide': False, 'channel_nums': [16, 32, 64, 128, 256], 'encoder_use_hist': False, 'guide_feature_from_hist': True, 'region_num': 2, 'use_gray_hist': False, 'conv_type': 'drconv', 'down_ratio': 2, 'hist_conv_trainable': False, 'drconv_position': [0, 1]}, 'modelname': 'bilateral_upsample_net', 'predict_illumination': False, 'loss': {'mse': 1.0, 'cos': 0.1, 'ltv': 0.1}, 'luma_bins': 8, 'channel_multiplier': 1, 'spatial_bin': 16, 'batch_norm': True, 'low_resolution': 256, 'coeffs_type': 'matrix', 'conv_type': 'conv', 'backbone': 'hist-unet', 'illu_map_power': False}, 'hist_unet': {'n_bins': 8, 'hist_as_guide': False, 'channel_nums': False, 'encoder_use_hist': False, 'guide_feature_from_hist': False, 'region_num': 8, 'use_gray_hist': False, 'conv_type': 'drconv', 'down_ratio': 2, 'hist_conv_trainable': False, 'drconv_position': [1, 1]}, 'modelname': 'csecnet', 'use_wavelet': False, 'use_attn_map': False, 'use_non_local': False, 'how_to_fuse': 'cnn-weights', 'deform': True, 'backbone': 'bilateral_upsample_net', 'conv_type': 'conv', 'backbone_out_illu': True, 'illumap_channel': 1, 'share_weights': True, 'n_bins': 8, 'hist_as_guide': False, 'loss': {'ltv': 0, 'cos': 0, 'weighted_loss': 0, 'tvloss1': 0, 'tvloss2': 0, 'tvloss1_new': 0.01, 'tvloss2_new': 0.01, 'l1_loss': 1.0, 'ssim_loss': 1.0, 'psnr_loss': 0, 'illumap_loss': 0, 'hist_loss': 0, 'inter_hist_loss': 0, 'vgg_loss': 0.01, 'cos2': 0.5, 'normal_ex_loss': 0.1}}, 'project': 'default_proj', 'name': 'name', 'comment': False, 'debug': False, 'val_debug_step_nums': 2, 'gpu': -1, 'backend': 'ddp', 'runtime_precision': 16, 'amp_backend': 'native', 'amp_level': 'O1', 'dataloader_num_worker': 4, 'mode': 'train', 'logger': 'tb', 'num_epoch': 300, 'valid_every': 20, 'savemodel_every': 4, 'log_every': 2000, 'batchsize': 16, 'valid_batchsize': 1, 'lr': 0.0001, 'checkpoint_path': 'pretrained/csec.ckpt', 'checkpoint_monitor': 'loss', 'resume_training': True, 'monitor_mode': 'min', 'early_stop': False, 'valid_ratio': 0.1, 'flags': {}}
C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\kornia\feature\lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
Running initialization for BaseModel
C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Creating directory: "pretrained\test_result\csecnet_pretrained_csec.ckpt@lcdp_data.test"
TEST - Result save path: pretrained\test_result\csecnet_pretrained_csec.ckpt@lcdp_data.test
Loading model from: pretrained/csec.ckpt
Dataset augmentation: [ToPILImage(), Downsample([512, 512]), RandomHorizontalFlip(p=0.5), RandomVerticalFlip(p=0.5), ToTensor()]
Error executing job with overrides: ['checkpoint_path=pretrained/csec.ckpt']
Traceback (most recent call last):
  File "C:\Users\Miki\Pictures\csec\src\test.py", line 33, in main
    trainer = Trainer(
  File "C:\Users\Miki\anaconda3\envs\drct\Lib\site-packages\pytorch_lightning\utilities\argparse.py", line 70, in insert_env_defaults
    return fn(self, **kwargs)
TypeError: Trainer.__init__() got an unexpected keyword argument 'gpus'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```

I installed the latest dependencies in a Python 3.12 conda environment.

yiyulics commented 1 month ago

Please check your pytorch-lightning version.
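
(For context: the failing call is `Trainer(gpus=-1, ...)`, driven by the `'gpu': -1` entry in the running config above. The `gpus` argument was removed in pytorch-lightning 2.0, so it raises a `TypeError` under 2.4.0. A minimal sketch of the API difference, assuming a single-node GPU setup:)

```python
from pytorch_lightning import Trainer

# 1.x-era call, as the repo's Hydra config drives it (removed in pytorch-lightning 2.0):
#   trainer = Trainer(gpus=-1, ...)

# Rough pytorch-lightning 2.x equivalent, shown for reference only:
trainer = Trainer(accelerator="gpu", devices=-1)  # use all visible GPUs
```

So with pytorch-lightning 2.4.0 installed, the unmodified `src/test.py` cannot construct the `Trainer`; installing the version pinned in the README (or porting the `Trainer` call) is required.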

zelenooki87 commented 1 month ago

pytorch-lightning 2.4.0
lightning 2.4.0
torch 2.5.1+cu124

I have installed all the latest official PyPI packages... I was so excited to try it, but no success. Could you share your conda list (environment setup), please?

yiyulics commented 1 month ago

> pytorch-lightning 2.4.0
> lightning 2.4.0
> torch 2.5.1+cu124
>
> I have installed all the latest official PyPI packages... I was so excited to try it, but no success. Could you share your conda list (environment setup), please?

You can try to install the exact versions listed in the README.

zelenooki87 commented 1 month ago

I created a new environment as described. I had to downgrade pip to install the specific pytorch-lightning version; then torchmetrics was not compatible, so I downgraded it with `pip install torchmetrics==0.10.0`. Even after that, the code forced an older CUDA build that is not compatible with my GPU (RTX 3090).

Once again I created a new environment, this time with the latest PyTorch with CUDA 12.x support and the pytorch-lightning version you described...

New error, please advise?

```
Running initialization for BaseModel
C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
[ WARN ] Result directory "csecnet_pretrained_csec.ckpt@lcdp_data.test" exists. Press ENTER to overwrite or input suffix to create a new one:
New name: csecnet_pretrained_csec.ckpt@lcdp_data.test.
[ WARN ] Overwrite result_dir: csecnet_pretrained_csec.ckpt@lcdp_data.test
TEST - Result save path: pretrained\test_result\csecnet_pretrained_csec.ckpt@lcdp_data.test
Loading model from: pretrained/csec.ckpt
Dataset augmentation: [ToPILImage(), Downsample([512, 512]), RandomHorizontalFlip(p=0.5), RandomVerticalFlip(p=0.5), ToTensor()]
C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=-1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=-1)` instead.
  rank_zero_deprecation(
Using 16bit native Automatic Mixed Precision (AMP)
C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\plugins\precision\native_amp.py:53: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  scaler = torch.cuda.amp.GradScaler()
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Global seed set to 233
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
Error executing job with overrides: ['checkpoint_path=pretrained/csec.ckpt']
Traceback (most recent call last):
  File "C:\Users\Miki\Pictures\csec\src\test.py", line 38, in main
    trainer.test(model, datamodule)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 862, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 648, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 909, in _test_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1102, in _run
    self.strategy.setup_environment()
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\strategies\ddp.py", line 157, in setup_environment
    self.setup_distributed()
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\strategies\ddp.py", line 210, in setup_distributed
    init_dist_connection(self.cluster_environment, self._process_group_backend, timeout=self._timeout)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\pytorch_lightning\utilities\distributed.py", line 374, in init_dist_connection
    torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torch\distributed\c10d_logger.py", line 83, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torch\distributed\c10d_logger.py", line 97, in wrapper
    func_return = func(*args, **kwargs)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1520, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torch\distributed\rendezvous.py", line 269, in _env_rendezvous_handler
    store = _create_c10d_store(
  File "C:\Users\Miki\anaconda3\envs\csec\Lib\site-packages\torch\distributed\rendezvous.py", line 189, in _create_c10d_store
    return TCPStore(
RuntimeError: use_libuv was requested but PyTorch was build without libuv support

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```

yiyulics commented 1 month ago

It still seems like a version compatibility problem. BTW, I can run it on an RTX 4090 and an RTX 3080. You can configure your CUDA version to fit the environment.
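
(For context: the `use_libuv` RuntimeError above is raised by the torch build rather than by CUDA. Since PyTorch 2.4 the distributed TCPStore requests a libuv backend by default, and the error message says this Windows wheel was built without libuv support. A commonly reported workaround, untested here, is to disable libuv before the DDP rendezvous, e.g. near the top of `src/test.py`:)

```python
import os

# Hypothetical workaround (untested): ask torch.distributed not to use the libuv
# TCPStore backend that PyTorch >= 2.4 requests by default. This must run before
# the process group is initialized, i.e. before Trainer.test() starts DDP.
os.environ["USE_LIBUV"] = "0"
```

Alternatively, since the running config shows `'backend': 'ddp'`, avoiding the DDP strategy for a single-GPU test run would sidestep the rendezvous entirely.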

zelenooki87 commented 1 month ago

Could you please run `conda list` and paste all the packages here (or on pastebin)? Even on Google Colab I cannot run the code:

```
Check runtime config: use "/content/CSEC/src/config/runtime/csecnet.default.yaml" as template.
Running config: {'aug': {'crop': False, 'downsample': [512, 512], 'h-flip': True, 'v-flip': True}, 'train_ds': {'class': 'img_dataset', 'name': 'lcdp_data.train', 'input': ['/path/to/input/*'], 'GT': ['/path/to/gt/*']}, 'test_ds': {'class': 'img_dataset', 'name': 'lcdp_data.test', 'input': ['/content/drive/MyDrive/test/'], 'GT': ['/content/drive/MyDrive/test/']}, 'valid_ds': {'class': 'img_dataset', 'name': 'lcdp_data.valid', 'input': ['/path/to/valid-input/*'], 'GT': ['/path/to/valid-gt/*']}, 'runtime': {'bilateral_upsample_net': {'hist_unet': {'n_bins': 8, 'hist_as_guide': False, 'channel_nums': [16, 32, 64, 128, 256], 'encoder_use_hist': False, 'guide_feature_from_hist': True, 'region_num': 2, 'use_gray_hist': False, 'conv_type': 'drconv', 'down_ratio': 2, 'hist_conv_trainable': False, 'drconv_position': [0, 1]}, 'modelname': 'bilateral_upsample_net', 'predict_illumination': False, 'loss': {'mse': 1.0, 'cos': 0.1, 'ltv': 0.1}, 'luma_bins': 8, 'channel_multiplier': 1, 'spatial_bin': 16, 'batch_norm': True, 'low_resolution': 256, 'coeffs_type': 'matrix', 'conv_type': 'conv', 'backbone': 'hist-unet', 'illu_map_power': False}, 'hist_unet': {'n_bins': 8, 'hist_as_guide': False, 'channel_nums': False, 'encoder_use_hist': False, 'guide_feature_from_hist': False, 'region_num': 8, 'use_gray_hist': False, 'conv_type': 'drconv', 'down_ratio': 2, 'hist_conv_trainable': False, 'drconv_position': [1, 1]}, 'modelname': 'csecnet', 'use_wavelet': False, 'use_attn_map': False, 'use_non_local': False, 'how_to_fuse': 'cnn-weights', 'deform': True, 'backbone': 'bilateral_upsample_net', 'conv_type': 'conv', 'backbone_out_illu': True, 'illumap_channel': 1, 'share_weights': True, 'n_bins': 8, 'hist_as_guide': False, 'loss': {'ltv': 0, 'cos': 0, 'weighted_loss': 0, 'tvloss1': 0, 'tvloss2': 0, 'tvloss1_new': 0.01, 'tvloss2_new': 0.01, 'l1_loss': 1.0, 'ssim_loss': 1.0, 'psnr_loss': 0, 'illumap_loss': 0, 'hist_loss': 0, 'inter_hist_loss': 0, 'vgg_loss': 0.01, 'cos2': 0.5, 'normal_ex_loss': 0.1}}, 'project': 'default_proj', 'name': 'name', 'comment': False, 'debug': False, 'val_debug_step_nums': 2, 'gpu': -1, 'backend': 'ddp', 'runtime_precision': 16, 'amp_backend': 'native', 'amp_level': 'O1', 'dataloader_num_worker': 4, 'mode': 'train', 'logger': 'tb', 'num_epoch': 300, 'valid_every': 20, 'savemodel_every': 4, 'log_every': 2000, 'batchsize': 16, 'valid_batchsize': 1, 'lr': 0.0001, 'checkpoint_path': '/content/CSEC/CSEC/csec.ckpt', 'checkpoint_monitor': 'loss', 'resume_training': True, 'monitor_mode': 'min', 'early_stop': False, 'valid_ratio': 0.1, 'flags': {}}
/usr/local/lib/python3.10/dist-packages/kornia/feature/lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/cloud_io.py:47: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location)
Running initialization for BaseModel
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
[ WARN ] Result directory "csecnet_CSEC_csec.ckpt@lcdp_data.test" exists. Press ENTER to overwrite or input suffix to create a new one:
New name: csecnet_CSEC_csec.ckpt@lcdp_data.test.miki
Creating directory: "/content/CSEC/CSEC/test_result/csecnet_CSEC_csec.ckpt@lcdp_data.test.miki"
TEST - Result save path: /content/CSEC/CSEC/test_result/csecnet_CSEC_csec.ckpt@lcdp_data.test.miki
Loading model from: /content/CSEC/CSEC/csec.ckpt
Dataset augmentation: [ToPILImage(), Downsample([512, 512]), RandomHorizontalFlip(p=0.5), RandomVerticalFlip(p=0.5), ToTensor()]
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=-1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=-1)` instead.
  rank_zero_deprecation(
Using 16bit native Automatic Mixed Precision (AMP)
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/plugins/precision/native_amp.py:53: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  scaler = torch.cuda.amp.GradScaler()
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Global seed set to 233
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
Missing logger folder: /content/CSEC/lightning_logs
test_ds - GT Directory path: [yellow]['/content/drive/MyDrive/test/'][/yellow]
test_ds - Input Directory path: [yellow]['/content/drive/MyDrive/test/'][/yellow]
test_ds Dataset length: 1, batch num: 1
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:617: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
Testing: 0it [00:00, ?it/s]Error executing job with overrides: ['checkpoint_path=/content/CSEC/CSEC/csec.ckpt']
Traceback (most recent call last):
  File "/content/CSEC/src/test.py", line 38, in main
    trainer.test(model, datamodule)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 862, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 648, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 909, in _test_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
    results = self._run_stage()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1249, in _run_stage
    return self._run_evaluate()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1295, in _run_evaluate
    eval_loop_results = self._evaluation_loop.run()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
    self.advance(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
    self.advance(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 127, in advance
    batch = next(data_fetcher)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
    return self.fetching_function()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/fetching.py", line 263, in fetching_function
    self._fetch_next_batch(self.dataloader_iter)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/fetching.py", line 277, in _fetch_next_batch
    batch = next(iterator)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 701, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1465, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1491, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 715, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 351, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/CSEC/src/data/img_dataset.py", line 132, in __getitem__
    input_img = cv2.imread(self.input_list[idx])[:, :, [2, 1, 0]]
TypeError: 'NoneType' object is not subscriptable

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Testing: 0it [00:00, ?it/s]
[rank0]:[W1030 10:18:56.746044108 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
```

yiyulics commented 1 month ago

Have you changed the data path as described in the README? The traceback shows that `cv2.imread` returned `None` for the configured input path, which suggests the data path has not been set correctly yet: the running config shows `'/content/drive/MyDrive/test/'`, a bare directory rather than a glob pattern like `/path/to/test/*` as in the template config.
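
(A quick sanity check along these lines, a hypothetical snippet rather than part of the repo, would confirm whether the configured pattern expands to readable images; the path below is simply the one from the log above:)

```python
# Hypothetical check: make sure the test_ds 'input' entry is a glob that matches
# actual image files, and that OpenCV can read every match (cv2.imread returns
# None for directories, missing files, and unsupported formats).
import glob
import cv2

pattern = "/content/drive/MyDrive/test/*"  # note the trailing '*', as in the template config
files = sorted(glob.glob(pattern))
print(f"{len(files)} file(s) matched by {pattern!r}")
for path in files:
    if cv2.imread(path) is None:
        print(f"cv2.imread could not read: {path}")
```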

PLEASE read the error information and debug yourself.