TypeError: cannot pickle 'dict_keys' object

qfwysw commented 2 years ago

Environment

fatal: not a git repository (or any parent up to mount point /opt/data)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
sys.platform: linux
Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
CuDNN 8.0.5
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu111
OpenCV: 4.5.5
MMCV: 1.4.8
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.22.0
MMSegmentation: 0.22.1
MMDetection3D: 1.0.0rc0+

Command

./tools/dist_train.sh ./projects/configs/detr3d/detr3d_res101_gridmask.py 1

Error

Traceback (most recent call last):
  File "./tools/train.py", line 248, in <module>
    main()
  File "./tools/train.py", line 237, in main
    train_model(
  File "/opt/data/private/glchen/projects/detr3d/mmdection3d/mmdet3d/apis/train.py", line 64, in train_model
    train_detector(
  File "/opt/data/private/glchen/projects/detr3d/mmdection/mmdet/apis/train.py", line 208, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/envs/dr/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/envs/dr/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/opt/conda/envs/dr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "/opt/conda/envs/dr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/opt/conda/envs/dr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 918, in __init__
    w.start()
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/conda/envs/dr/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'dict_keys' object

Results

./projects/configs/detr3d/detr3d_res101_gridmask.py --gpu-ids 7

I want to implement some ideas of my own on top of mmdet3d. When I run the command above (with --gpu-ids 7), the program runs fine. But when I use distributed training, even with only one GPU, I get this error.
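
For readers hitting the same error: the popen_spawn_posix frames in the traceback suggest the worker processes are started with the 'spawn' method, which pickles the object handed to each worker. Any dict_keys view reachable from that object would then fail to serialize. The thread does not identify which attribute holds the view, so the sketch below uses a hypothetical `class_names` entry in a plain dict purely for illustration. It shows that the view itself is what cannot be pickled, and that converting it to a list avoids the problem.

```python
import pickle


def can_pickle(obj):
    """Return True if obj can be serialized with pickle, False on TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


# Hypothetical state: `class_names` is an illustrative name, not taken from DETR3D.
mapping = {"car": 0, "pedestrian": 1}
state_with_view = {"class_names": mapping.keys()}        # dict_keys view
state_with_list = {"class_names": list(mapping.keys())}  # plain list

print(can_pickle(state_with_view))
print(can_pickle(state_with_list))
```

If the offending attribute can be located in the dataset or pipeline code, wrapping it in `list(...)` is one possible fix that does not depend on the start method.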

konyul commented 2 years ago

+1

konyul commented 2 years ago

Add torch.multiprocessing.set_start_method('fork') in train.py, like this:

if __name__ == '__main__':
    torch.multiprocessing.set_start_method('fork')
    main()

Tai-Wang commented 2 years ago

This looks like a bug in the DETR3D project. Please follow its instructions for using mmdet3d and open an issue there for discussion, which should lead you to the correct solution more directly.

Darkzj commented 2 years ago

I added torch.multiprocessing.set_start_method('fork'), but an error occurred: ValueError: cannot find context for 'fork'. How can I solve it?

I set workers_per_gpu=0, and it works.

Darkzj commented 2 years ago

> I added torch.multiprocessing.set_start_method('fork'), but an error occurred: ValueError: cannot find context for 'fork'. How can I solve it?
>
> I set workers_per_gpu=0, and it works.

My sys.platform is Windows, so 'fork' is not supported.
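
Since 'fork' is not available on every platform, a guarded variant of the workaround may be safer than calling set_start_method('fork') unconditionally. The following is a minimal sketch, not code from mmdet3d or DETR3D: `choose_start_method` is a hypothetical helper that uses the standard library's `multiprocessing.get_all_start_methods()` to check what the current platform supports.

```python
import multiprocessing as mp


def choose_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, otherwise None."""
    return preferred if preferred in mp.get_all_start_methods() else None


method = choose_start_method("fork")
if method is None:
    # For example on Windows, where only 'spawn' is offered.
    print("'fork' is unavailable; consider workers_per_gpu=0 instead")
else:
    print(f"'{method}' is available on this platform")
```

When the helper returns a method name, it could be passed to torch.multiprocessing.set_start_method in train.py as suggested earlier in the thread. When it returns None, setting workers_per_gpu=0 in the config, as reported above, keeps data loading in the main process so nothing has to be pickled for a worker.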

hzm-January commented 1 year ago

> I added torch.multiprocessing.set_start_method('fork'), but an error occurred: ValueError: cannot find context for 'fork'. How can I solve it?
>
> I set workers_per_gpu=0, and it works.

How can this be solved? Thanks.

open-mmlab / mmdetection3d

TypeError: cannot pickle 'dict_keys' object #1364