Creating an issue for this: it looks like some aspects of sample filtering were broken by recent changes.
Originally posted by **mouryarahul** August 24, 2022
Hi,
I'm trying to run:

```bash
python train_unet_demo.py \
    --mode test \
    --test_split test \
    --challenge singlecoil \
    --data_path ../../../FastMRI_DATASET/knee_singlecoil_train/ \
    --resume_from_checkpoint unet/unet_demo/checkpoints/epoch=1-step=69484.ckpt
```
where `../../../FastMRI_DATASET/knee_singlecoil_train/` contains all three folders: `singlecoil_test`, `singlecoil_train`, and `singlecoil_val`.
However, I'm getting an error related to `raw_sample_filter` when loading the test dataset. Maybe I'm missing something or doing something silly; can someone please point out the mistake? Thanks!
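If I read the traceback below correctly, this is the usual Python pitfall where a local name is assigned only on some branches of a function and then read unconditionally afterwards. A minimal, self-contained sketch of the pattern (illustrative only, not the actual fastMRI code):

```python
def build_dataset(raw_sample_filter):
    """Stand-in for the real dataset constructor; filters integer 'samples'."""
    return [s for s in range(10) if raw_sample_filter is None or raw_sample_filter(s)]

def create_data_loader(split, filter_fn=None):
    # The name is bound only in the train/val branch...
    if split in ("train", "val"):
        raw_sample_filter = filter_fn
    # ...so for split == "test" it is never assigned, and the next
    # line raises UnboundLocalError, matching the traceback below.
    return build_dataset(raw_sample_filter=raw_sample_filter)

create_data_loader("train")  # works
create_data_loader("test")   # UnboundLocalError: local variable 'raw_sample_filter' ...
```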
**Info about my environment:**
```
PyTorch version: 1.12.0+cu116
Is debug build: False
CUDA used to build PyTorch: 11.6
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04 LTS (x86_64)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35
Python version: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1070
Nvidia driver version: 515.65.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] pytorch-lightning==1.7.2
[pip3] torch==1.12.0+cu116
[pip3] torchaudio==0.12.0+cu116
[pip3] torchmetrics==0.9.2
[pip3] torchvision==0.13.0+cu116
[conda] blas               1.0            mkl
[conda] mkl                2021.4.0       h06a4308_640
[conda] mkl-service        2.4.0          py310h7f8727e_0
[conda] mkl_fft            1.3.1          py310hd6ae3a3_0
[conda] mkl_random         1.2.2          py310h00e6091_0
[conda] numpy              1.22.3         py310hfa59a62_0
[conda] numpy-base         1.22.3         py310h9585f30_0
[conda] pytorch-lightning  1.7.2          pypi_0    pypi
[conda] torch              1.12.0+cu116   pypi_0    pypi
[conda] torchaudio         0.12.0+cu116   pypi_0    pypi
[conda] torchmetrics       0.9.2          pypi_0    pypi
[conda] torchvision        0.13.0+cu116   pypi_0    pypi
```
**The full error message:**
```
Global seed set to 42
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Torchmetrics v0.9 introduced a new argument class property called `full_state_update` that has not been set for this class (DistributedMetricSum). The property determines if `update` by default needs access to the full metric state. If this is not the case, significant speedups can be achieved and we recommend setting this to `False`. We provide an checking function `from torchmetrics.utilities import check_forward_no_full_state` that can be used to check if the `full_state_update=True` (old and potential slower behaviour, default for now) or if `full_state_update=False` can be used safely.
  warnings.warn(*args, **kwargs)
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:446: LightningDeprecationWarning: Setting `Trainer(gpus=1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=1)` instead.
  rank_zero_deprecation(
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:52: LightningDeprecationWarning: Setting `Trainer(resume_from_checkpoint=)` is deprecated in v1.5 and will be removed in v1.7. Please pass `Trainer.fit(ckpt_path=)` directly instead.
  rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 191, in <module>
    run_cli()
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 187, in run_cli
    cli_main(args)
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 75, in cli_main
    trainer.test(model, datamodule=data_module)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 864, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 648, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 911, in _test_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1168, in _run
    results = self._run_stage()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1251, in _run_stage
    return self._run_evaluate()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1291, in _run_evaluate
    self._evaluation_loop._reload_evaluation_dataloaders()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 234, in _reload_evaluation_dataloaders
    self.trainer.reset_test_dataloader()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1944, in reset_test_dataloader
    self.num_test_batches, self.test_dataloaders = self._data_connector._reset_eval_dataloader(
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 348, in _reset_eval_dataloader
    dataloaders = self._request_dataloader(mode)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 436, in _request_dataloader
    dataloader = source.dataloader()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 513, in dataloader
    return method()
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri/pl_modules/data_module.py", line 325, in test_dataloader
    return self._create_data_loader(
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri/pl_modules/data_module.py", line 262, in _create_data_loader
    raw_sample_filter=raw_sample_filter,
UnboundLocalError: local variable 'raw_sample_filter' referenced before assignment
```
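If that diagnosis is right, a defensive fix is to bind a permissive default before any split-specific branch can skip the assignment. A hypothetical, self-contained sketch of that approach (the names `make_loader` and `SliceDatasetStub` are placeholders, not fastMRI APIs):

```python
class SliceDatasetStub:
    """Placeholder for fastMRI's SliceDataset; keeps 'samples' that pass the filter."""
    def __init__(self, samples, raw_sample_filter):
        self.samples = [s for s in samples if raw_sample_filter(s)]

def make_loader(split, train_val_filter):
    # Bind a permissive default first, so the name exists on every code path.
    raw_sample_filter = lambda raw_sample: True
    if split in ("train", "val"):
        raw_sample_filter = train_val_filter
    return SliceDatasetStub(range(10), raw_sample_filter=raw_sample_filter)

print(len(make_loader("test", lambda s: s % 2 == 0).samples))   # 10: test keeps everything
print(len(make_loader("train", lambda s: s % 2 == 0).samples))  # 5: train applies the filter
```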
Discussed in https://github.com/facebookresearch/fastMRI/discussions/263