Closed — SMSajadi99 closed this issue 1 year ago
Hello everyone, and thank you very much for this excellent project. Following the instructions, I installed the libraries, created the folders, and put the `.pkl` files for the `v1.0-mini` version (which I asked you about in my previous question) into the `data/nuscenes` folder. Training, however, fails to start. First I will post my folder layout, and after that the error. Thank you for your help.

├── ckpts
├── data
│   └── nuscenes
│       ├── maps
│       │   ├── basemap
│       │   ├── expansion
│       │   └── prediction
│       ├── samples
│       │   ├── CAM_BACK
│       │   ├── CAM_BACK_LEFT
│       │   ├── CAM_BACK_RIGHT
│       │   ├── CAM_FRONT
│       │   ├── CAM_FRONT_LEFT
│       │   ├── CAM_FRONT_RIGHT
│       │   ├── LIDAR_TOP
│       │   ├── RADAR_BACK_LEFT
│       │   ├── RADAR_BACK_RIGHT
│       │   ├── RADAR_FRONT
│       │   ├── RADAR_FRONT_LEFT
│       │   └── RADAR_FRONT_RIGHT
│       ├── sweeps
│       │   ├── CAM_BACK
│       │   ├── CAM_BACK_LEFT
│       │   ├── CAM_BACK_RIGHT
│       │   ├── CAM_FRONT
│       │   ├── CAM_FRONT_LEFT
│       │   ├── CAM_FRONT_RIGHT
│       │   ├── LIDAR_TOP
│       │   ├── RADAR_BACK_LEFT
│       │   ├── RADAR_BACK_RIGHT
│       │   ├── RADAR_FRONT
│       │   ├── RADAR_FRONT_LEFT
│       │   └── RADAR_FRONT_RIGHT
│       └── v1.0-mini
├── docs
├── figs
├── mmdetection3d
│   ├── configs
│   │   ├── 3dssd
│   │   ├── _base_
│   │   │   ├── datasets
│   │   │   ├── models
│   │   │   └── schedules
│   │   ├── benchmark
│   │   ├── centerpoint
│   │   ├── dgcnn
│   │   ├── dynamic_voxelization
│   │   ├── fcaf3d
│   │   ├── fcos3d
│   │   ├── free_anchor
│   │   ├── groupfree3d
│   │   ├── h3dnet
│   │   ├── imvotenet
│   │   ├── imvoxelnet
│   │   ├── monoflex
│   │   ├── mvxnet
│   │   ├── nuimages
│   │   ├── paconv
│   │   ├── parta2
│   │   ├── pgd
│   │   ├── pointnet2
│   │   ├── pointpillars
│   │   ├── point_rcnn
│   │   ├── regnet
│   │   ├── sassd
│   │   ├── second
│   │   ├── smoke
│   │   ├── ssn
│   │   └── votenet
│   ├── data
│   │   ├── lyft
│   │   ├── nuscenes
│   │   │   ├── maps
│   │   │   │   ├── basemap
│   │   │   │   ├── expansion
│   │   │   │   └── prediction
│   │   │   ├── samples
│   │   │   │   ├── CAM_BACK
│   │   │   │   ├── CAM_BACK_LEFT
│   │   │   │   ├── CAM_BACK_RIGHT
│   │   │   │   ├── CAM_FRONT
│   │   │   │   ├── CAM_FRONT_LEFT
│   │   │   │   ├── CAM_FRONT_RIGHT
│   │   │   │   ├── LIDAR_TOP
│   │   │   │   ├── RADAR_BACK_LEFT
│   │   │   │   ├── RADAR_BACK_RIGHT
│   │   │   │   ├── RADAR_FRONT
│   │   │   │   ├── RADAR_FRONT_LEFT
│   │   │   │   └── RADAR_FRONT_RIGHT
│   │   │   ├── sweeps
│   │   │   │   ├── CAM_BACK
│   │   │   │   ├── CAM_BACK_LEFT
│   │   │   │   ├── CAM_BACK_RIGHT
│   │   │   │   ├── CAM_FRONT
│   │   │   │   ├── CAM_FRONT_LEFT
│   │   │   │   ├── CAM_FRONT_RIGHT
│   │   │   │   ├── LIDAR_TOP
│   │   │   │   ├── RADAR_BACK_LEFT
│   │   │   │   ├── RADAR_BACK_RIGHT
│   │   │   │   ├── RADAR_FRONT
│   │   │   │   ├── RADAR_FRONT_LEFT
│   │   │   │   └── RADAR_FRONT_RIGHT
│   │   │   └── v1.0-mini
│   │   ├── s3dis
│   │   │   └── meta_data
│   │   ├── scannet
│   │   │   └── meta_data
│   │   └── sunrgbd
│   │       └── matlab
│   ├── demo
│   │   └── data
│   │       ├── kitti
│   │       ├── nuscenes
│   │       ├── scannet
│   │       └── sunrgbd
│   ├── docker
│   │   └── serve
│   ├── docs
│   │   ├── en
│   │   │   ├── datasets
│   │   │   ├── _static
│   │   │   │   ├── css
│   │   │   │   └── image
│   │   │   ├── supported_tasks
│   │   │   └── tutorials
│   │   └── zh_cn
│   │       ├── datasets
│   │       ├── _static
│   │       │   ├── css
│   │       │   └── image
│   │       ├── supported_tasks
│   │       └── tutorials
│   ├── mmdet3d
│   │   ├── apis
│   │   ├── core
│   │   │   ├── anchor
│   │   │   ├── bbox
│   │   │   │   ├── assigners
│   │   │   │   ├── coders
│   │   │   │   ├── iou_calculators
│   │   │   │   ├── samplers
│   │   │   │   └── structures
│   │   │   ├── evaluation
│   │   │   │   ├── kitti_utils
│   │   │   │   ├── scannet_utils
│   │   │   │   └── waymo_utils
│   │   │   ├── points
│   │   │   ├── post_processing
│   │   │   ├── utils
│   │   │   ├── visualizer
│   │   │   └── voxel
│   │   ├── datasets
│   │   │   └── pipelines
│   │   ├── models
│   │   │   ├── backbones
│   │   │   ├── decode_heads
│   │   │   ├── dense_heads
│   │   │   ├── detectors
│   │   │   ├── fusion_layers
│   │   │   ├── losses
│   │   │   ├── middle_encoders
│   │   │   ├── model_utils
│   │   │   ├── necks
│   │   │   ├── roi_heads
│   │   │   │   ├── bbox_heads
│   │   │   │   ├── mask_heads
│   │   │   │   └── roi_extractors
│   │   │   ├── segmentors
│   │   │   ├── utils
│   │   │   └── voxel_encoders
│   │   ├── ops
│   │   │   ├── dgcnn_modules
│   │   │   ├── paconv
│   │   │   ├── pointnet_modules
│   │   │   └── spconv
│   │   │       └── overwrite_spconv
│   │   └── utils
│   ├── mmdet3d.egg-info
│   ├── projects
│   │   └── example_project
│   │       ├── configs
│   │       └── dummy
│   ├── requirements
│   ├── resources
│   ├── tests
│   │   ├── data
│   │   │   ├── kitti
│   │   │   │   ├── kitti_gt_database
│   │   │   │   └── training
│   │   │   │       ├── image_2
│   │   │   │       ├── velodyne
│   │   │   │       └── velodyne_reduced
│   │   │   ├── lyft
│   │   │   │   ├── lidar
│   │   │   │   └── v1.01-train
│   │   │   │       ├── maps
│   │   │   │       └── v1.01-train
│   │   │   ├── nuscenes
│   │   │   │   ├── samples
│   │   │   │   │   ├── CAM_BACK_LEFT
│   │   │   │   │   └── LIDAR_TOP
│   │   │   │   └── sweeps
│   │   │   │       └── LIDAR_TOP
│   │   │   ├── ops
│   │   │   ├── s3dis
│   │   │   │   ├── instance_mask
│   │   │   │   ├── points
│   │   │   │   └── semantic_mask
│   │   │   ├── scannet
│   │   │   │   ├── instance_mask
│   │   │   │   ├── points
│   │   │   │   └── semantic_mask
│   │   │   ├── semantickitti
│   │   │   │   └── sequences
│   │   │   │       └── 00
│   │   │   │           ├── labels
│   │   │   │           └── velodyne
│   │   │   ├── sunrgbd
│   │   │   │   ├── points
│   │   │   │   └── sunrgbd_trainval
│   │   │   │       └── image
│   │   │   └── waymo
│   │   │       ├── kitti_format
│   │   │       │   ├── training
│   │   │       │   │   ├── image_0
│   │   │       │   │   └── velodyne
│   │   │       │   └── waymo_gt_database
│   │   │       └── waymo_format
│   │   │           └── validation
│   │   ├── test_data
│   │   │   ├── test_datasets
│   │   │   └── test_pipelines
│   │   │       ├── test_augmentations
│   │   │       └── test_loadings
│   │   ├── test_metrics
│   │   ├── test_models
│   │   │   ├── test_common_modules
│   │   │   ├── test_fusion
│   │   │   ├── test_heads
│   │   │   ├── test_necks
│   │   │   └── test_voxel_encoder
│   │   ├── test_runtime
│   │   ├── test_samples
│   │   └── test_utils
│   └── tools
│       ├── analysis_tools
│       ├── data_converter
│       ├── deployment
│       ├── misc
│       └── model_converters
├── nusc_tracking
├── projects
│   ├── configs
│   │   ├── PETRv1
│   │   ├── StreamPETR
│   │   └── test_speed
│   └── mmdet3d_plugin
│       ├── core
│       │   ├── apis
│       │   ├── bbox
│       │   │   ├── assigners
│       │   │   ├── coders
│       │   │   └── match_costs
│       │   └── evaluation
│       ├── datasets
│       │   ├── pipelines
│       │   └── samplers
│       └── models
│           ├── backbones
│           │   └── __pycache__
│           ├── dense_heads
│           ├── detectors
│           ├── necks
│           └── utils
└── tools
    └── data_converter
        └── __pycache__
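As a quick sanity check on a layout like the one above, a small script such as the following can verify the expected `data/nuscenes` subfolders before launching training. This is only a sketch, not part of the StreamPETR tooling; the `EXPECTED` list is an assumption based on the tree above.

```python
import os

# Subfolders expected under the nuScenes dataset root (assumption based on
# the directory tree above; adjust for your own setup / dataset version).
EXPECTED = ["maps", "samples", "sweeps", "v1.0-mini"]

def check_nuscenes_root(root):
    """Return the list of expected subfolders that are missing under `root`."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    missing = check_nuscenes_root("data/nuscenes")
    if missing:
        print("Missing under data/nuscenes:", ", ".join(missing))
    else:
        print("data/nuscenes layout looks complete.")
```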
***
tools/train.py FAILED
Root Cause:
[0]: time: 2023-05-26_06:51:14 rank: 0 (local_rank: 0) exitcode: 1 (pid: 10264) error_file: <N/A> msg: "Process failed with exitcode 1"
Other Failures:
[1]: time: 2023-05-26_06:51:14 rank: 1 (local_rank: 1) exitcode: 1 (pid: 10265) error_file: <N/A> msg: "Process failed with exitcode 1"
[2]: time: 2023-05-26_06:51:14 rank: 2 (local_rank: 2) exitcode: 1 (pid: 10266) error_file: <N/A> msg: "Process failed with exitcode 1"
[3]: time: 2023-05-26_06:51:14 rank: 3 (local_rank: 3) exitcode: 1 (pid: 10267) error_file: <N/A> msg: "Process failed with exitcode 1"
[4]: time: 2023-05-26_06:51:14 rank: 4 (local_rank: 4) exitcode: 1 (pid: 10268) error_file: <N/A> msg: "Process failed with exitcode 1"
[5]: time: 2023-05-26_06:51:14 rank: 5 (local_rank: 5) exitcode: 1 (pid: 10269) error_file: <N/A> msg: "Process failed with exitcode 1"
[6]: time: 2023-05-26_06:51:14 rank: 6 (local_rank: 6) exitcode: 1 (pid: 10270) error_file: <N/A> msg: "Process failed with exitcode 1"
[7]: time: 2023-05-26_06:51:14 rank: 7 (local_rank: 7) exitcode: 1 (pid: 10271) error_file: <N/A> msg: "Process failed with exitcode 1"
***
Hi, thanks for your interest. I need more details: could you post the full error output that appears above the failure summary you provided? Also, how many GPU devices are you using?
@SMSajadi99 Hi, what's your numba version? My numba version is 0.53.0.
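To answer version questions like this quickly, a small stdlib-only helper can report the packages relevant to this thread. This is just a sketch; `installed_version` is a hypothetical helper name, not something from StreamPETR.

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the installed version string for `pkg`, or None if it is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

# Print the packages most relevant to this import failure.
for pkg in ("numba", "llvmlite", "numpy"):
    print(f"{pkg}: {installed_version(pkg)}")
```

`pip list | grep -E "numba|llvmlite|numpy"` gives the same information from the shell.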
(streampetr) sajadi@sajadi:~/anaconda3/envs/streampetr/StreamPETR$ tools/dist_train.sh projects/configs/StreamPETR/stream_petr_r50_flash_704_bs2_seq_24e.py 8 --work-dir work_dirs/stream_petr_r50_flash_704_bs2_seq_24e/
/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/torch/distributed/launch.py:163: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  logger.warn(
The module torch.distributed.launch is deprecated and going to be removed in future. Migrate to torch.distributed.run
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
  entrypoint       : tools/train.py
  min_nodes        : 1
  max_nodes        : 1
  nproc_per_node   : 8
  run_id           : none
  rdzv_backend     : static
  rdzv_endpoint    : 127.0.0.1:29500
  rdzv_configs     : {'rank': 0, 'timeout': 900}
  max_restarts     : 3
  monitor_interval : 5
  log_dir          : None
  metrics_cfg      : {}
INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/torch/distributed/elastic/utils/store.py:52: FutureWarning: This is an experimental API and will be changed in future.
  warnings.warn(
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
  restart_count=0
  master_addr=127.0.0.1
  master_port=29500
  group_rank=0
  group_world_size=1
  local_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  role_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  global_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  role_world_sizes=[8, 8, 8, 8, 8, 8, 8, 8]
  global_world_sizes=[8, 8, 8, 8, 8, 8, 8, 8]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/3/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker4 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/4/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker5 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/5/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker6 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/6/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker7 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_0/7/error.json
Traceback (most recent call last):
File "tools/train.py", line 23, in
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/3/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker4 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/4/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker5 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/5/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker6 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/6/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker7 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_1/7/error.json
Traceback (most recent call last):
File "tools/train.py", line 23, in
from mmdet3d.datasets import build_dataset
SystemError
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/datasets/__init__.py", line 4, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/datasets/__init__.py", line 4, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 10, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/core/__init__.py", line 4, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/core/evaluation/__init__.py", line 4, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/core/evaluation/kitti_utils/__init__.py", line 2, in
File "/home/sajadi/anaconda3/envs/streampetr/mmdetection3d/mmdet3d/core/evaluation/kitti_utils/eval.py", line 5, in
File "/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/numba/__init__.py", line 43, in
File "/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/numba/np/ufunc/decorators.py", line 3, in
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/3/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker4 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/4/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker5 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/5/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker6 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/6/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker7 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_2/7/error.json
Traceback (most recent call last):
File "tools/train.py", line 23, in
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/3/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker4 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/4/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker5 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/5/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker6 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/6/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker7 reply file to: /tmp/torchelastic_s3vw2v1o/none_6_0_ic2t/attempt_3/7/error.json
Traceback (most recent call last):
File "tools/train.py", line 23, in
CHILD PROCESS FAILED WITH NO ERROR_FILE
Child process 11373 (local_rank 0) FAILED (exitcode 1)
Error msg: Process failed with exitcode 1
Without writing an error file to <N/A>.
While this DOES NOT affect the correctness of your application, no trace information about the error will be available for inspection. Consider decorating your top level entrypoint function with torch.distributed.elastic.multiprocessing.errors.record. Example:

from torch.distributed.elastic.multiprocessing.errors import record

@record
def trainer_main(args):
    ...

warnings.warn(_no_error_file_warning_msg(rank, failure))
Traceback (most recent call last):
File "/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/torch/distributed/launch.py", line 173, in
tools/train.py FAILED
Other Failures:
[1]: time: 2023-05-26_06:59:35 rank: 1 (local_rank: 1) exitcode: 1 (pid: 11374) error_file: <N/A> msg: "Process failed with exitcode 1"
[2]: time: 2023-05-26_06:59:35 rank: 2 (local_rank: 2) exitcode: 1 (pid: 11375) error_file: <N/A> msg: "Process failed with exitcode 1"
[3]: time: 2023-05-26_06:59:35 rank: 3 (local_rank: 3) exitcode: 1 (pid: 11376) error_file: <N/A> msg: "Process failed with exitcode 1"
[4]: time: 2023-05-26_06:59:35 rank: 4 (local_rank: 4) exitcode: 1 (pid: 11377) error_file: <N/A> msg: "Process failed with exitcode 1"
[5]: time: 2023-05-26_06:59:35 rank: 5 (local_rank: 5) exitcode: 1 (pid: 11378) error_file: <N/A> msg: "Process failed with exitcode 1"
[6]: time: 2023-05-26_06:59:35 rank: 6 (local_rank: 6) exitcode: 1 (pid: 11379) error_file: <N/A> msg: "Process failed with exitcode 1"
[7]: time: 2023-05-26_06:59:35 rank: 7 (local_rank: 7) exitcode: 1 (pid: 11380) error_file: <N/A> msg: "Process failed with exitcode 1"
(streampetr) sajadi@sajadi:~/anaconda3/envs/streampetr/StreamPETR$
Package Version Editable project location
absl-py 1.4.0
addict 2.4.0
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.2.1
attrs 23.1.0
backcall 0.2.0
beautifulsoup4 4.12.2
black 23.3.0
bleach 6.0.0
cachetools 5.3.0
certifi 2023.5.7
cffi 1.15.1
charset-normalizer 3.1.0
click 8.1.3
comm 0.1.3
contourpy 1.0.7
cycler 0.11.0
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
descartes 1.1.0
einops 0.6.1
exceptiongroup 1.1.1
executing 1.2.0
fastjsonschema 2.17.1
fire 0.5.0
flake8 6.0.0
flash-attn 0.2.2
fonttools 4.39.4
fqdn 1.5.1
google-auth 2.18.1
google-auth-oauthlib 1.0.0
grpcio 1.54.2
idna 3.4
imageio 2.29.0
importlib-metadata 6.6.0
importlib-resources 5.12.0
iniconfig 2.0.0
ipykernel 6.23.1
ipython 8.12.2
ipython-genutils 0.2.0
ipywidgets 8.0.6
isoduration 20.11.0
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
jsonpointer 2.3
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 8.2.0
jupyter-console 6.6.3
jupyter_core 5.3.0
jupyter-events 0.6.3
jupyter_server 2.5.0
jupyter_server_terminals 0.4.4
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
kiwisolver 1.4.4
llvmlite 0.36.0
lyft-dataset-sdk 0.0.8
Markdown 3.4.3
MarkupSafe 2.1.2
matplotlib 3.5.2
matplotlib-inline 0.1.6
mccabe 0.7.0
mistune 2.0.5
mmcls 0.25.0
mmcv-full 1.6.0
mmdet 2.28.2
mmdet3d 1.0.0rc6 /home/sajadi/anaconda3/envs/streampetr/mmdetection3d
mmsegmentation 0.30.0
mypy-extensions 1.0.0
nbclassic 1.0.0
nbclient 0.8.0
nbconvert 7.4.0
nbformat 5.8.0
nest-asyncio 1.5.6
networkx 2.2
notebook 6.5.4
notebook_shim 0.2.3
numba 0.53.0
numpy 1.24.3
nuscenes-devkit 1.1.10
oauthlib 3.2.2
opencv-python 4.7.0.72
packaging 23.1
pandas 2.0.1
pandocfilters 1.5.0
parso 0.8.3
pathspec 0.11.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 23.0.1
pkgutil_resolve_name 1.3.10
platformdirs 3.5.1
plotly 5.14.1
pluggy 1.0.0
plyfile 0.9
prettytable 3.7.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
protobuf 4.23.1
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycocotools 2.0.6
pycodestyle 2.10.0
pycparser 2.21
pyflakes 3.0.1
Pygments 2.15.1
pyparsing 3.0.9
pyquaternion 0.9.9
pyrsistent 0.19.3
pytest 7.3.1
python-dateutil 2.8.2
python-json-logger 2.0.7
pytz 2023.3
PyWavelets 1.4.1
PyYAML 6.0
pyzmq 25.0.2
qtconsole 5.4.3
QtPy 2.3.1
requests 2.31.0
requests-oauthlib 1.3.1
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rsa 4.9
scikit-image 0.19.3
scikit-learn 1.2.2
scipy 1.10.1
Send2Trash 1.8.2
setuptools 66.0.0
Shapely 1.8.5
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
stack-data 0.6.2
tenacity 8.2.2
tensorboard 2.13.0
tensorboard-data-server 0.7.0
termcolor 2.3.0
terminado 0.17.1
terminaltables 3.1.10
threadpoolctl 3.1.0
tifffile 2023.4.12
tinycss2 1.2.1
tomli 2.0.1
torch 1.9.0+cu111
torchaudio 0.9.0
torchvision 0.10.0+cu111
tornado 6.3.2
tqdm 4.65.0
traitlets 5.9.0
trimesh 2.35.39
typing_extensions 4.5.0
tzdata 2023.3
uri-template 1.2.0
urllib3 1.26.16
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.5.2
Werkzeug 2.3.4
wheel 0.38.4
widgetsnbextension 4.0.7
yapf 0.33.0
zipp 3.15.0
@SMSajadi99 Try this command: `pip install "numpy<1.24.0"`. My numpy version is 1.23.3.
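For context: the environment above has numba 0.53.0 alongside numpy 1.24.3, and as this thread shows, numba 0.53 cannot be imported against NumPy 1.24.x, so the fix is to pin NumPy below 1.24. A minimal sketch of the version comparison behind the `"numpy<1.24.0"` constraint (hypothetical helpers, stdlib only; real tools use `packaging.version` instead):

```python
def parse_version(v):
    """Parse an 'X.Y.Z' version string into a tuple of ints.
    Pre-release suffixes (e.g. '1.0.0rc6') are ignored for this comparison."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def satisfies_upper_bound(installed, bound):
    """True if `installed` is strictly below `bound` (the 'numpy<1.24.0' pin)."""
    return parse_version(installed) < parse_version(bound)

print(satisfies_upper_bound("1.23.3", "1.24.0"))  # True: compatible with the pin
print(satisfies_upper_bound("1.24.3", "1.24.0"))  # False: too new for numba 0.53
```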
Yes, that fixed it. However, while running I then got an error saying that `nuscenes2d_temporal_infos_val.pkl` does not exist, so I changed every place in the code that referenced it to `nuscenes2d_temporal_infos_val_mini.pkl` instead. But then I ran into the following problem:
(streampetr) sajadi@sajadi:~/anaconda3/envs/streampetr/StreamPETR$ tools/dist_train.sh projects/configs/StreamPETR/stream_petr_r50_flash_704_bs2_seq_24e.py 1 --work-dir work_dirs/stream_petr_r50_flash_704_bs2_seq_24e/
/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/torch/distributed/launch.py:163: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  logger.warn(
The module torch.distributed.launch is deprecated and going to be removed in future. Migrate to torch.distributed.run
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
  entrypoint       : tools/train.py
  min_nodes        : 1
  max_nodes        : 1
  nproc_per_node   : 1
  run_id           : none
  rdzv_backend     : static
  rdzv_endpoint    : 127.0.0.1:29500
  rdzv_configs     : {'rank': 0, 'timeout': 900}
  max_restarts     : 3
  monitor_interval : 5
  log_dir          : None
  metrics_cfg      : {}
INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_3ywdtqmw/none_p36ahq8n
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/torch/distributed/elastic/utils/store.py:52: FutureWarning: This is an experimental API and will be changed in future.
  warnings.warn(
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
  restart_count=0
  master_addr=127.0.0.1
  master_port=29500
  group_rank=0
  group_world_size=1
  local_ranks=[0]
  role_ranks=[0]
  global_ranks=[0]
  role_world_sizes=[1]
  global_world_sizes=[1]
sys.platform: linux
Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce GT 1030
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.2, V11.2.67
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:
2023-05-26 08:10:48,852 - mmdet - INFO - Distributed training: True 2023-05-26 08:10:49,852 - mmdet - INFO - Config: point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] class_names = [ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ] dataset_type = 'CustomNuScenesDataset' data_root = './data/nuscenes/' input_modality = dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True) file_client_args = dict(backend='disk') train_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True, with_bbox=True, with_label=True, with_bbox_depth=True), dict( type='ObjectRangeFilter', point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]), dict( type='ObjectNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=True), dict( type='GlobalRotScaleTransImage', rot_range=[-0.3925, 0.3925], translation_std=[0, 0, 0], scale_ratio_range=[0.95, 1.05], reverse_angle=True, training=True), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='PETRFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv', 'prev_exists' ]), dict( type='Collect3D', keys=[ 'gt_bboxes_3d', 'gt_labels_3d', 'img', 'gt_bboxes', 'gt_labels', 'centers2d', 'depths', 'prev_exists', 'lidar2img', 'intrinsics', 'extrinsics', 
'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token', 'gt_bboxes_3d', 'gt_labels_3d')) ] test_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=False), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='MultiScaleFlipAug3D', img_scale=(1333, 800), pts_scale_ratio=1, flip=False, transforms=[ dict( type='PETRFormatBundle3D', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict( type='Collect3D', keys=[ 'img', 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token')) ]) ] eval_pipeline = [ dict( type='LoadPointsFromFile', coord_type='LIDAR', load_dim=5, use_dim=5, file_client_args=dict(backend='disk')), dict( type='LoadPointsFromMultiSweeps', sweeps_num=10, file_client_args=dict(backend='disk')), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier' ], with_label=False), dict(type='Collect3D', keys=['points']) ] data = dict( samples_per_gpu=2, workers_per_gpu=4, train=dict( type='CustomNuScenesDataset', data_root='./data/nuscenes/', 
ann_file='./data/nuscenes/nuscenes2d_temporal_infos_train_mini.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True, with_bbox=True, with_label=True, with_bbox_depth=True), dict( type='ObjectRangeFilter', point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]), dict( type='ObjectNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=True), dict( type='GlobalRotScaleTransImage', rot_range=[-0.3925, 0.3925], translation_std=[0, 0, 0], scale_ratio_range=[0.95, 1.05], reverse_angle=True, training=True), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='PETRFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv', 'prev_exists' ]), dict( type='Collect3D', keys=[ 'gt_bboxes_3d', 'gt_labels_3d', 'img', 'gt_bboxes', 'gt_labels', 'centers2d', 'depths', 'prev_exists', 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token', 'gt_bboxes_3d', 'gt_labels_3d')) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, 
use_external=True), test_mode=False, box_type_3d='LiDAR', num_frame_losses=1, seq_split_num=2, seq_mode=True, collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv', 'img', 'prev_exists', 'img_metas' ], queue_length=1, use_valid_flag=True, filter_empty_gt=False), val=dict( type='CustomNuScenesDataset', data_root='data/nuscenes/', ann_file='./data/nuscenes/nuscenes2d_temporal_infos_val_mini.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=False), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='MultiScaleFlipAug3D', img_scale=(1333, 800), pts_scale_ratio=1, flip=False, transforms=[ dict( type='PETRFormatBundle3D', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict( type='Collect3D', keys=[ 'img', 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token')) ]) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True), test_mode=True, box_type_3d='LiDAR', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv', 'img', 'img_metas' ], 
queue_length=1), test=dict( type='CustomNuScenesDataset', data_root='data/nuscenes/', ann_file='./data/nuscenes/nuscenes2d_temporal_infos_val_mini.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=False), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='MultiScaleFlipAug3D', img_scale=(1333, 800), pts_scale_ratio=1, flip=False, transforms=[ dict( type='PETRFormatBundle3D', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict( type='Collect3D', keys=[ 'img', 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token')) ]) ], classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=True), test_mode=True, box_type_3d='LiDAR', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv', 'img', 'img_metas' ], queue_length=1), shuffler_sampler=dict(type='InfiniteGroupEachSampleInBatchSampler'), nonshuffler_sampler=dict(type='DistributedSampler')) evaluation = dict( interval=42192, pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='ResizeCropFlipRotImage', data_aug_conf=dict( 
resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True), training=False), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='PadMultiViewImage', size_divisor=32), dict( type='MultiScaleFlipAug3D', img_scale=(1333, 800), pts_scale_ratio=1, flip=False, transforms=[ dict( type='PETRFormatBundle3D', collect_keys=[ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict( type='Collect3D', keys=[ 'img', 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ], meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'box_mode_3d', 'box_type_3d', 'img_norm_cfg', 'scene_token')) ]) ]) checkpoint_config = dict(interval=1758, max_keep_ckpts=3) log_config = dict( interval=50, hooks=[dict(type='TextLoggerHook'), dict(type='TensorboardLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' work_dir = 'work_dirs/stream_petr_r50_flash_704_bs2_seq_24e/' load_from = None resume_from = None workflow = [('train', 1)] opencv_num_threads = 0 mp_start_method = 'fork' backbone_norm_cfg = dict(type='LN', requires_grad=True) plugin = True plugin_dir = 'projects/mmdet3d_plugin/' voxel_size = [0.2, 0.2, 8] img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) num_gpus = 8 batch_size = 2 num_iters_per_epoch = 1758 num_epochs = 24 queue_length = 1 num_frame_losses = 1 collect_keys = [ 'lidar2img', 'intrinsics', 'extrinsics', 'timestamp', 'img_timestamp', 'ego_pose', 'ego_pose_inv' ] model = dict( type='Petr3D', num_frame_head_grads=1, num_frame_backbone_grads=1, num_frame_losses=1, use_grid_mask=True, img_backbone=dict( 
pretrained='torchvision://resnet50', type='ResNet', depth=50, num_stages=4, out_indices=(2, 3), frozen_stages=-1, norm_cfg=dict(type='BN2d', requires_grad=False), norm_eval=True, with_cp=True, style='pytorch'), img_neck=dict( type='CPFPN', in_channels=[1024, 2048], out_channels=256, num_outs=2), img_roi_head=dict( type='FocalHead', num_classes=10, in_channels=256, loss_cls2d=dict( type='QualityFocalLoss', use_sigmoid=True, beta=2.0, loss_weight=2.0), loss_centerness=dict( type='GaussianFocalLoss', reduction='mean', loss_weight=1.0), loss_bbox2d=dict(type='L1Loss', loss_weight=5.0), loss_iou2d=dict(type='GIoULoss', loss_weight=2.0), loss_centers2d=dict(type='L1Loss', loss_weight=10.0), train_cfg=dict( assigner2d=dict( type='HungarianAssigner2D', cls_cost=dict(type='FocalLossCost', weight=2.0), reg_cost=dict( type='BBoxL1Cost', weight=5.0, box_format='xywh'), iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0), centers2d_cost=dict(type='BBox3DL1Cost', weight=10.0)))), pts_bbox_head=dict( type='StreamPETRHead', num_classes=10, in_channels=256, num_query=644, memory_len=1024, topk_proposals=256, num_propagated=256, with_ego_pos=True, match_with_velo=False, scalar=10, noise_scale=1.0, dn_weight=1.0, split=0.75, LID=True, with_position=True, position_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0], code_weights=[2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], transformer=dict( type='PETRTemporalTransformer', decoder=dict( type='PETRTransformerDecoder', return_intermediate=True, num_layers=6, transformerlayers=dict( type='PETRTemporalDecoderLayer', attn_cfgs=[ dict( type='MultiheadAttention', embed_dims=256, num_heads=8, dropout=0.1), dict( type='PETRMultiheadFlashAttention', embed_dims=256, num_heads=8, dropout=0.1) ], feedforward_channels=2048, ffn_dropout=0.1, with_cp=True, operation_order=('self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm')))), bbox_coder=dict( type='NMSFreeCoder', post_center_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0], 
pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], max_num=300, voxel_size=[0.2, 0.2, 8], num_classes=10), loss_cls=dict( type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=2.0), loss_bbox=dict(type='L1Loss', loss_weight=0.25), loss_iou=dict(type='GIoULoss', loss_weight=0.0)), train_cfg=dict( pts=dict( grid_size=[512, 512, 1], voxel_size=[0.2, 0.2, 8], point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], out_size_factor=4, assigner=dict( type='HungarianAssigner3D', cls_cost=dict(type='FocalLossCost', weight=2.0), reg_cost=dict(type='BBox3DL1Cost', weight=0.25), iou_cost=dict(type='IoUCost', weight=0.0), pc_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0])))) ida_aug_conf = dict( resize_lim=(0.38, 0.55), final_dim=(256, 704), bot_pct_lim=(0.0, 0.0), rot_lim=(0.0, 0.0), H=900, W=1600, rand_flip=True) optimizer = dict( type='AdamW', lr=0.0004, paramwise_cfg=dict(custom_keys=dict(img_backbone=dict(lr_mult=0.25))), weight_decay=0.01) optimizer_config = dict( type='Fp16OptimizerHook', loss_scale='dynamic', grad_clip=dict(max_norm=35, norm_type=2)) lr_config = dict( policy='CosineAnnealing', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, min_lr_ratio=0.001) find_unused_parameters = False runner = dict(type='IterBasedRunner', max_iters=42192) gpu_ids = range(0, 1)
2023-05-26 08:10:49,853 - mmdet - INFO - Set random seed to 0, deterministic: False /home/sajadi/anaconda3/envs/streampetr/lib/python3.8/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead warnings.warn('DeprecationWarning: pretrained is deprecated, ' 2023-05-26 08:10:50,093 - mmdet - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'} 2023-05-26 08:10:50,093 - mmcv - INFO - load model from: torchvision://resnet50 2023-05-26 08:10:50,093 - mmcv - INFO - load checkpoint from torchvision path: torchvision://resnet50 2023-05-26 08:10:50,146 - mmcv - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
2023-05-26 08:10:50,160 - mmdet - INFO - initialize CPFPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-05-26 08:10:50,171 - mmdet - INFO - Model:
Petr3D(
(pts_bbox_head): StreamPETRHead(
(loss_cls): FocalLoss()
(loss_bbox): L1Loss()
(cls_branches): ModuleList(
(0): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(1): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(2): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(3): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(4): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(5): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=256, out_features=256, bias=True)
(4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
)
(reg_branches): ModuleList(
(0): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
(1): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
(2): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
(3): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
(4): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
(5): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=10, bias=True)
)
)
(position_encoder): Sequential(
(0): Linear(in_features=192, out_features=1024, bias=True)
(1): ReLU()
(2): Linear(in_features=1024, out_features=256, bias=True)
)
(memory_embed): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
)
(featurized_pe): SELayer_Linear(
(conv_reduce): Linear(in_features=256, out_features=256, bias=True)
(act1): ReLU()
(conv_expand): Linear(in_features=256, out_features=256, bias=True)
(gate): Sigmoid()
)
(reference_points): Embedding(644, 3)
(pseudo_reference_points): Embedding(256, 3)
(query_embedding): Sequential(
(0): Linear(in_features=384, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
)
(spatial_alignment): MLN(
(reduce): Sequential(
(0): Linear(in_features=8, out_features=256, bias=True)
(1): ReLU()
)
(gamma): Linear(in_features=256, out_features=256, bias=True)
(beta): Linear(in_features=256, out_features=256, bias=True)
(ln): LayerNorm((256,), eps=1e-05, elementwise_affine=False)
)
(time_embedding): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
(ego_pose_pe): MLN(
(reduce): Sequential(
(0): Linear(in_features=180, out_features=256, bias=True)
(1): ReLU()
)
(gamma): Linear(in_features=256, out_features=256, bias=True)
(beta): Linear(in_features=256, out_features=256, bias=True)
(ln): LayerNorm((256,), eps=1e-05, elementwise_affine=False)
)
(ego_pose_memory): MLN(
(reduce): Sequential(
(0): Linear(in_features=180, out_features=256, bias=True)
(1): ReLU()
)
(gamma): Linear(in_features=256, out_features=256, bias=True)
(beta): Linear(in_features=256, out_features=256, bias=True)
(ln): LayerNorm((256,), eps=1e-05, elementwise_affine=False)
)
(loss_iou): GIoULoss()
(transformer): PETRTemporalTransformer(
(decoder): PETRTransformerDecoder(
(layers): ModuleList(
(0): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(1): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(2): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(3): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(4): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(5): PETRTemporalDecoderLayer(
(attentions): ModuleList(
(0): MultiheadAttention(
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
(1): PETRMultiheadFlashAttention(
(attn): FlashMHA(
(inner_attn): FlashAttention()
(out_proj): Linear(in_features=256, out_features=256, bias=True)
)
(proj_drop): Dropout(p=0.0, inplace=False)
(dropout_layer): Dropout(p=0.1, inplace=False)
)
)
(ffns): ModuleList(
(0): FFN(
(activate): ReLU(inplace=True)
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=2048, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.1, inplace=False)
)
(1): Linear(in_features=2048, out_features=256, bias=True)
(2): Dropout(p=0.1, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
)
(post_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
)
(img_backbone): ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): ResLayer(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer2): ResLayer(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer3): ResLayer(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer4): ResLayer(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
)
init_cfg={'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'}
(img_neck): CPFPN(
(lateral_convs): ModuleList(
(0): ConvModule(
(conv): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
)
(1): ConvModule(
(conv): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
)
)
(fpn_convs): ModuleList(
(0): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
)
init_cfg={'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
(img_roi_head): FocalHead(
(loss_cls): FocalLoss()
(loss_bbox): IoULoss()
(cls): Conv2d(256, 10, kernel_size=(1, 1), stride=(1, 1))
(shared_reg): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): GroupNorm(32, 256, eps=1e-05, affine=True)
(2): ReLU()
)
(shared_cls): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): GroupNorm(32, 256, eps=1e-05, affine=True)
(2): ReLU()
)
(centerness): Conv2d(256, 1, kernel_size=(1, 1), stride=(1, 1))
(ltrb): Conv2d(256, 4, kernel_size=(1, 1), stride=(1, 1))
(center2d): Conv2d(256, 2, kernel_size=(1, 1), stride=(1, 1))
(loss_cls2d): QualityFocalLoss()
(loss_bbox2d): L1Loss()
(loss_iou2d): GIoULoss()
(loss_centers2d): L1Loss()
(loss_centerness): GaussianFocalLoss()
)
(grid_mask): GridMask()
)
2023-05-26 08:10:53,022 - mmdet - INFO - Start running, host: sajadi@sajadi, work_dir: /home/sajadi/anaconda3/envs/streampetr/StreamPETR/work_dirs/stream_petr_r50_flash_704_bs2_seq_24e
2023-05-26 08:10:53,022 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_train_epoch:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_train_iter:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
after_train_iter:
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_val_epoch:
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_val_iter: (LOW ) IterTimerHook
after_val_iter: (LOW ) IterTimerHook
after_val_epoch:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
after_run:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
2023-05-26 08:10:53,022 - mmdet - INFO - workflow: [('train', 1)], max: 42192 iters
2023-05-26 08:10:53,024 - mmdet - INFO - Checkpoints will be saved to /home/sajadi/anaconda3/envs/streampetr/StreamPETR/work_dirs/stream_petr_r50_flash_704_bs2_seq_24e by HardDiskBackend.
Traceback (most recent call last):
File "tools/train.py", line 263, in
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_3ywdtqmw/none_p36ahq8n/attempt_1/0/error.json projects.mmdet3d_plugin
Unfortunately, the process does not end
Unfortunately, the process does not end
setuptools version: 45.2.0
Is this version correct?
Because it gives an error related to the same issue:
ModuleNotFoundError: No module named '_distutils_hack'
I searched:
https://stackoverflow.com/questions/73496322/modulenotfounderror-no-module-named-distutils-hack
It seems that you should uninstall the old version of setuptools and then install version 45.2.0.
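For reference, here is a quick diagnostic sketch (not part of StreamPETR) to confirm which setuptools version is actually active in the environment after the reinstall:

```python
import importlib.metadata  # Python 3.8+

def setuptools_version() -> str:
    """Report the installed setuptools version, if any."""
    try:
        return importlib.metadata.version("setuptools")
    except importlib.metadata.PackageNotFoundError:
        return "not installed"

# After `pip uninstall setuptools` and `pip install setuptools==45.2.0`,
# this should report 45.2.0.
print(setuptools_version())
```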
Hi, I need to sleep now… I will answer you tomorrow if your issue is not solved.
Thank you for your help. Yes, that problem has been solved, but it seems the GPU memory is too low, so training does not run:
RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 1.95 GiB total capacity; 728.34 MiB already allocated; 27.56 MiB free; 810.00 MiB reserved in total by PyTorch)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 21265) of binary: /home/sajadi/anaconda3/envs/streampetr/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 3/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result: restart_count=1 master_addr=127.0.0.1 master_port=29500 group_rank=0 group_world_size=1 local_ranks=[0] role_ranks=[0] global_ranks=[0] role_world_sizes=[1] global_world_sizes=[1]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_f12peicb/none_ycf7las6/attempt_1/0/error.json
projects.mmdet3d_plugin
Hello again @exiawsh. I searched for this memory issue but found no definitive answer; maybe you can guide me.
What's your GPU device and its total memory? I think you should have at least 6 GB of GPU memory per device.
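As a hedged illustration, the capacity figure can be read straight out of the OOM message above and checked against that 6 GB minimum:

```python
import re

# The OOM message reported earlier in this thread.
log = ("RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB "
       "(GPU 0; 1.95 GiB total capacity; 728.34 MiB already allocated; "
       "27.56 MiB free; 810.00 MiB reserved in total by PyTorch)")

def total_capacity_gib(message: str) -> float:
    """Extract the 'total capacity' figure (in GiB) from a CUDA OOM error."""
    match = re.search(r"([\d.]+) GiB total capacity", message)
    return float(match.group(1)) if match else 0.0

print(total_capacity_gib(log))         # → 1.95
print(total_capacity_gib(log) >= 6.0)  # → False: well below the suggested minimum
```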
https://github.com/exiawsh/StreamPETR/issues/19#issuecomment-1564442516 Does that mean I can't change the batch size or something else so that I can run it?
Try setting the batch size to 1. Your GPU device is too old...
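In mmdet3d-style configs the per-GPU batch size is the samples_per_gpu field; a minimal sketch of the override (the field names are assumed from the standard mmdet3d data settings, not copied from the StreamPETR config):

```python
# Sketch of the data settings as a plain dict; in the real config this lives
# in the Python config file loaded by tools/train.py.
cfg = {
    "data": {
        "samples_per_gpu": 2,  # per-GPU batch size (the "bs2" in the config name)
        "workers_per_gpu": 4,
    }
}

def use_minimum_batch(cfg: dict) -> dict:
    """Lower the per-GPU batch size to 1 to reduce peak GPU memory."""
    cfg["data"]["samples_per_gpu"] = 1
    return cfg

print(use_minimum_batch(cfg)["data"]["samples_per_gpu"])  # → 1
```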
Exactly. I tried this just now, but it didn't make any difference and the memory problem remains. Is it possible for you to send me the work_dirs folder?
Your GPU memory is not enough… it only has 2 GB, and I have said that you should have at least 6 GB of GPU memory… Try Google Colab instead.
Yes, that's right. Unfortunately, I have to work on this system (I checked Colab just now, and changing its defaults is a bit of trouble). If possible, please send me the files in the work_dirs folder so that I can run the evaluation; if it succeeds, I will move the process to another system. Please accept my request.
work_dirs is not necessary. Check our provided model checkpoint https://github.com/exiawsh/storage/releases/download/v1.0/stream_petr_vov_flash_800_bs2_seq_24e.pth.
Hello again
I tried to run this on Colab, but I ran into a problem with the flash-attn module:
What should I do to solve this problem? Training gives an error because the module is not available.
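One hedged workaround sketch when flash-attn cannot be installed: probe for it at import time and fall back to standard attention (the flag shown here is illustrative, not StreamPETR's actual config mechanism):

```python
# Guarded import: flash-attn only builds on certain GPUs/toolchains, so
# probe for it rather than failing when training starts.
try:
    import flash_attn  # noqa: F401
    USE_FLASH_ATTN = True
except ImportError:
    USE_FLASH_ATTN = False

print(USE_FLASH_ATTN)  # False when the package is missing, e.g. on Colab
```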