open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

Duplicate Info #927

Closed. makecent closed this issue 3 years ago.

makecent commented 3 years ago
PYTHONPATH=$PWD:$PYTHONPATH mim train mmaction
Using port 20250 for synchronization.
Training command is python -m torch.distributed.launch --nproc_per_node=1 --master_port=20250 /home/louis/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmaction/tools/train.py configs/localization/apn/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb.py --launcher pytorch --validate. 
2021-06-14 12:23:05,426 - mmaction - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.10 (default, Jun  4 2021, 14:48:32) [GCC 7.5.0]
CUDA available: True
GPU 0,1: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 8.4.0-3ubuntu2) 8.4.0
PyTorch: 1.8.1
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.9.1
OpenCV: 4.5.2
MMCV: 1.3.6
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMAction2: 0.15.0+1bd0c72
------------------------------------------------------------

INFO:mmaction:Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.10 (default, Jun  4 2021, 14:48:32) [GCC 7.5.0]
CUDA available: True
GPU 0,1: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 8.4.0-3ubuntu2) 8.4.0
PyTorch: 1.8.1
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.9.1
OpenCV: 4.5.2
MMCV: 1.3.6
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMAction2: 0.15.0+1bd0c72
------------------------------------------------------------

2021-06-14 12:23:05,426 - mmaction - INFO - Distributed training: True
INFO:mmaction:Distributed training: True
2021-06-14 12:23:05,564 - mmaction - INFO - Config: custom_imports = dict(
    imports=['models', 'dataloader'], allow_failed_imports=False)
clip_len = 32
frame_interval = 4
model = dict(
    type='APN',
    backbone=dict(
        type='ResNet3d_sony',
        pretrained='checkpoints/r3d_sony/model_rgb.pth',
        modality='rgb'),
    cls_head=dict(
        type='APNHead',
        num_classes=1,
        in_channels=1024,
        output_type='coral',
        loss=dict(type='ApnCORALLoss', uncorrelated_progs='ignore'),
        spatial_type='avg3d',
        clip_len=32))
train_cfg = dict(untrimmed=False)
test_cfg = dict(untrimmed=True)
dataset_type = 'APNDataset'
data_root_train = ('data/thumos14/rawframes/train',
                   'data/thumos14/rawframes/val')
data_root_val = 'data/thumos14/rawframes/test'
ann_file_train = ('data/thumos14/ann_train.csv', 'data/thumos14/ann_val.csv')
ann_file_val = 'data/thumos14/ann_test.csv'
img_norm_cfg = dict(mean=[128, 128, 128], std=[128, 128, 128], to_bgr=False)
train_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='LabelToOrdinal'),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(
        type='Collect',
        keys=['imgs', 'progression_label', 'class_label'],
        meta_keys=()),
    dict(type='ToTensor', keys=['imgs', 'progression_label', 'class_label'])
]
val_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='LabelToOrdinal'),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(
        type='Collect',
        keys=['imgs', 'progression_label', 'class_label'],
        meta_keys=()),
    dict(type='ToTensor', keys=['imgs', 'progression_label', 'class_label'])
]
test_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs'], meta_keys=()),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=10,
    workers_per_gpu=8,
    train=dict(
        type='APNDataset',
        ann_files=('data/thumos14/ann_train.csv', 'data/thumos14/ann_val.csv'),
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='LabelToOrdinal'),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(type='Flip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(
                type='Collect',
                keys=['imgs', 'progression_label', 'class_label'],
                meta_keys=()),
            dict(
                type='ToTensor',
                keys=['imgs', 'progression_label', 'class_label'])
        ],
        data_prefixes=('data/thumos14/rawframes/train',
                       'data/thumos14/rawframes/val'),
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        unittest=False),
    val=dict(
        type='APNDataset',
        ann_files='data/thumos14/ann_test.csv',
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='LabelToOrdinal'),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(
                type='Collect',
                keys=['imgs', 'progression_label', 'class_label'],
                meta_keys=()),
            dict(
                type='ToTensor',
                keys=['imgs', 'progression_label', 'class_label'])
        ],
        data_prefixes='data/thumos14/rawframes/test',
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        unittest=False),
    test=dict(
        type='APNDataset',
        ann_files='data/thumos14/ann_test.csv',
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(type='Collect', keys=['imgs'], meta_keys=()),
            dict(type='ToTensor', keys=['imgs'])
        ],
        data_prefixes='data/thumos14/rawframes/test',
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        untrimmed=True))
optimizer = dict(type='Adam', lr=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(policy='fixed')
total_epochs = 10
evaluation = dict(
    interval=1,
    key_indicator='mae',
    metrics=['loss', 'mae'],
    results_component=('losses', 'progressions'),
    dataset_name='Val')
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=500,
    hooks=[dict(type='TensorboardLoggerHook'),
           dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb/'
load_from = None
resume_from = None
workflow = [('train', 1)]
output_config = dict(
    out='./work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb//results.json'
)
gpu_ids = range(0, 1)
omnisource = False
module_hooks = []

INFO:mmaction:Config: custom_imports = dict(
    imports=['models', 'dataloader'], allow_failed_imports=False)
clip_len = 32
frame_interval = 4
model = dict(
    type='APN',
    backbone=dict(
        type='ResNet3d_sony',
        pretrained='checkpoints/r3d_sony/model_rgb.pth',
        modality='rgb'),
    cls_head=dict(
        type='APNHead',
        num_classes=1,
        in_channels=1024,
        output_type='coral',
        loss=dict(type='ApnCORALLoss', uncorrelated_progs='ignore'),
        spatial_type='avg3d',
        clip_len=32))
train_cfg = dict(untrimmed=False)
test_cfg = dict(untrimmed=True)
dataset_type = 'APNDataset'
data_root_train = ('data/thumos14/rawframes/train',
                   'data/thumos14/rawframes/val')
data_root_val = 'data/thumos14/rawframes/test'
ann_file_train = ('data/thumos14/ann_train.csv', 'data/thumos14/ann_val.csv')
ann_file_val = 'data/thumos14/ann_test.csv'
img_norm_cfg = dict(mean=[128, 128, 128], std=[128, 128, 128], to_bgr=False)
train_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='LabelToOrdinal'),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(
        type='Collect',
        keys=['imgs', 'progression_label', 'class_label'],
        meta_keys=()),
    dict(type='ToTensor', keys=['imgs', 'progression_label', 'class_label'])
]
val_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='LabelToOrdinal'),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(
        type='Collect',
        keys=['imgs', 'progression_label', 'class_label'],
        meta_keys=()),
    dict(type='ToTensor', keys=['imgs', 'progression_label', 'class_label'])
]
test_pipeline = [
    dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(
        type='Normalize',
        mean=[128, 128, 128],
        std=[128, 128, 128],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs'], meta_keys=()),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=10,
    workers_per_gpu=8,
    train=dict(
        type='APNDataset',
        ann_files=('data/thumos14/ann_train.csv', 'data/thumos14/ann_val.csv'),
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='LabelToOrdinal'),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(type='Flip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(
                type='Collect',
                keys=['imgs', 'progression_label', 'class_label'],
                meta_keys=()),
            dict(
                type='ToTensor',
                keys=['imgs', 'progression_label', 'class_label'])
        ],
        data_prefixes=('data/thumos14/rawframes/train',
                       'data/thumos14/rawframes/val'),
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        unittest=False),
    val=dict(
        type='APNDataset',
        ann_files='data/thumos14/ann_test.csv',
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='LabelToOrdinal'),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(
                type='Collect',
                keys=['imgs', 'progression_label', 'class_label'],
                meta_keys=()),
            dict(
                type='ToTensor',
                keys=['imgs', 'progression_label', 'class_label'])
        ],
        data_prefixes='data/thumos14/rawframes/test',
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        unittest=False),
    test=dict(
        type='APNDataset',
        ann_files='data/thumos14/ann_test.csv',
        pipeline=[
            dict(type='FetchStackedFrames', clip_len=32, frame_interval=4),
            dict(type='RawFrameDecode'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(
                type='Normalize',
                mean=[128, 128, 128],
                std=[128, 128, 128],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(type='Collect', keys=['imgs'], meta_keys=()),
            dict(type='ToTensor', keys=['imgs'])
        ],
        data_prefixes='data/thumos14/rawframes/test',
        filename_tmpl='img_{:05}.jpg',
        modality='RGB',
        untrimmed=True))
optimizer = dict(type='Adam', lr=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(policy='fixed')
total_epochs = 10
evaluation = dict(
    interval=1,
    key_indicator='mae',
    metrics=['loss', 'mae'],
    results_component=('losses', 'progressions'),
    dataset_name='Val')
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=500,
    hooks=[dict(type='TensorboardLoggerHook'),
           dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb/'
load_from = None
resume_from = None
workflow = [('train', 1)]
output_config = dict(
    out='./work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb//results.json'
)
gpu_ids = range(0, 1)
omnisource = False
module_hooks = []

2021-06-14 12:23:05,624 - mmaction - INFO - load model from: checkpoints/r3d_sony/model_rgb.pth
INFO:mmaction:load model from: checkpoints/r3d_sony/model_rgb.pth
2021-06-14 12:23:05,624 - mmaction - INFO - Use load_from_local loader
INFO:mmaction:Use load_from_local loader
2021-06-14 12:23:05,657 - mmaction - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: conv3d_0c_1x1.conv3d.weight, conv3d_0c_1x1.conv3d.bias

WARNING:mmaction:The model and loaded state dict do not match exactly

unexpected key in source state_dict: conv3d_0c_1x1.conv3d.weight, conv3d_0c_1x1.conv3d.bias

2021-06-14 12:23:07,384 - mmaction - INFO - Start running, host: louis@louis-4, work_dir: /home/louis/PycharmProjects/APN/work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb
INFO:mmaction:Start running, host: louis@louis-4, work_dir: /home/louis/PycharmProjects/APN/work_dirs/apn_prop_coral_r3dsony_32x4_10e_thumos14_rgb
2021-06-14 12:23:07,385 - mmaction - INFO - workflow: [('train', 1)], max: 10 epochs
INFO:mmaction:workflow: [('train', 1)], max: 10 epochs
2021-06-14 12:23:31,919 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
INFO:mmcv:Reducer buckets have been rebuilt in this iteration.
2021-06-14 12:28:20,210 - mmaction - INFO - Epoch [1][500/72096]    lr: 1.000e-04, eta: 5 days, 5:11:45, time: 0.626, data_time: 0.047, memory: 8277, loss: 3.5669
INFO:mmaction:Epoch [1][500/72096]  lr: 1.000e-04, eta: 5 days, 5:11:45, time: 0.626, data_time: 0.047, memory: 8277, loss: 3.5669
2021-06-14 12:33:13,243 - mmaction - INFO - Epoch [1][1000/72096]   lr: 1.000e-04, eta: 5 days, 1:09:26, time: 0.586, data_time: 0.000, memory: 8277, loss: 1.3251
INFO:mmaction:Epoch [1][1000/72096] lr: 1.000e-04, eta: 5 days, 1:09:26, time: 0.586, data_time: 0.000, memory: 8277, loss: 1.3251

In the training logs, every INFO line appears twice; even the config and the environment info were printed two times. One copy of each line starts with a timestamp, while the other does not. Any suggestions?

makecent commented 3 years ago

I found that the logger <RootLogger root (WARNING)>, which is the parent of the logger <Logger mmaction (INFO)>, contains a level-0 handler <StreamHandler <stderr> (NOTSET)>. Is this a bug?

innerlee commented 3 years ago

Thanks for the report. It is tracked here: https://github.com/open-mmlab/mmcv/issues/1000