MCG-NJU / DEQDet

[ICCV 2023] Deep Equilibrium Object Detection
https://arxiv.org/abs/2308.09564

Code related #1

Open jacksonsc007 opened 1 year ago

jacksonsc007 commented 1 year ago

Thanks for the excellent work. I wonder when the source code will be released. Looking forward to your reply. :)

WANGSSSSSSS commented 1 year ago

NOW!

jacksonsc007 commented 1 year ago

Terrific! Thanks a lot. Could you specify the version of mmdetection btw?

jacksonsc007 commented 1 year ago

And the PyTorch version? It seems like you require pytorch>=2.0.0.

WANGSSSSSSS commented 1 year ago

```
System environment:
    sys.platform: linux
    Python: 3.8.16 (default, Jun 12 2023, 18:09:05) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 1123624972
    GPU 0: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.8, V11.8.89
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
    PyTorch: 2.0.1+cu117
    PyTorch compiling details: PyTorch built with:
```

WANGSSSSSSS commented 1 year ago

> And the PyTorch version? It seems like you require pytorch>=2.0.0.

Yes, I try to keep up with the fast-moving world.

jacksonsc007 commented 1 year ago

Hi, while going through your code I ran into some issues, hope you could help me out :)

  1. what does "self.grad_accumulation" mean here? And the meaning of "stash gradient"?
  2. for the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and PyTorch's autograd to handle the backward gradient propagation. Am I right?

WANGSSSSSSS commented 1 year ago

> Hi, while going through your code I ran into some issues, hope you could help me out :)
>
> 1. what does "self.grad_accumulation" mean [here](https://github.com/MCG-NJU/DEQDet/blob/fa72a62b2340a04300424041e9ebd0087a700eba/projects/deqdet/deq_det_roi_head.py#L219C12-L219C35)? And the meaning of "stash gradient"?
>
> 2. for the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and **PyTorch's autograd** to handle the backward gradient propagation. Am I right?

The refinement-aware gradient is, to some extent, equivalent to truncated BPTT: it cuts off the higher-order terms of the RNN-style iterations. Because each supervision then becomes independent, we can use gradient accumulation between supervisions to avoid the extra memory consumption. However, PyTorch's autograd would push the gradient computed for a single supervision back to every parameter, resulting in several backward passes through the backbone. So I use this hook to stash the gradient at the mlvl features; the last backward pass of the supervision restores the stashed gradient and carries it to the backbone weights.
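To make the stash-gradient idea above concrete, here is a minimal, self-contained sketch. It is not the DEQDet code; `backbone`, `head`, `compute_loss`, and `num_supervisions` are hypothetical names for illustration. Each intermediate supervision backpropagates only up to detached features and its feature gradient is stashed; the final backward restores the stash via a hook, so the backbone sees a single accumulated backward pass.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the detector components (assumption, not DEQDet code).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
head = nn.Linear(32, 4)

def compute_loss(pred):
    # Hypothetical per-supervision loss.
    return pred.pow(2).mean()

x = torch.randn(8, 16)
feats = backbone(x)                      # stand-in for the mlvl features
num_supervisions = 4
stashed_grad = torch.zeros_like(feats)   # buffer that stashes feature gradients

for step in range(num_supervisions):
    last = step == num_supervisions - 1
    if not last:
        # Detach so backward stops at the features: no pass through the backbone.
        feats_in = feats.detach().requires_grad_(True)
        loss = compute_loss(head(feats_in))
        loss.backward()
        stashed_grad += feats_in.grad    # stash this supervision's feature gradient
    else:
        # Restore the stashed gradient on the final backward, so the backbone
        # receives the sum of all supervisions' gradients in one pass.
        feats.register_hook(lambda g: g + stashed_grad)
        loss = compute_loss(head(feats))
        loss.backward()
```

With this pattern the head still accumulates gradients from every supervision, while the expensive backbone backward runs only once.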

WANGSSSSSSS commented 1 year ago

> Hi, while going through your code I ran into some issues, hope you could help me out :)
>
> 1. what does "self.grad_accumulation" mean [here](https://github.com/MCG-NJU/DEQDet/blob/fa72a62b2340a04300424041e9ebd0087a700eba/projects/deqdet/deq_det_roi_head.py#L219C12-L219C35)? And the meaning of "stash gradient"?
>
> 2. for the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and **PyTorch's autograd** to handle the backward gradient propagation. Am I right?

For question 2: yes, the RAG formulation is derived from the 2-step unrolled fixed-point formulation in the paper, and the implementation in the codebase is that 2-step unrolled fixed-point. The equation mainly helps to analyze why two steps work better than the simple gradient estimation used in DEQ-Flow. You can find the pseudocode in the appendix.
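For readers who want to see what a 2-step unrolled fixed-point looks like in code, here is a minimal sketch under assumed names (`Refiner` and the shapes are made up; this is not the DEQDet implementation): the refiner is iterated without gradients toward the equilibrium, and only the last two applications are differentiated, which is what the refinement-aware gradient truncates the backward pass to.

```python
import torch
import torch.nn as nn

class Refiner(nn.Module):
    """Hypothetical stand-in for the query-refinement layer f(z, x)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Linear(2 * dim, dim)

    def forward(self, z, x):
        return torch.tanh(self.net(torch.cat([z, x], dim=-1)))

dim = 32
f = Refiner(dim)
x = torch.randn(8, dim)          # fixed input features
z = torch.zeros(8, dim)          # initial queries

# Gradient-free iterations toward the equilibrium z* = f(z*, x).
with torch.no_grad():
    for _ in range(10):
        z = f(z, x)

# 2-step unroll with gradients: only these two applications of f are
# differentiated; earlier iterations are treated as a constant estimate.
z = z.detach()
z1 = f(z, x)
z2 = f(z1, x)
loss = z2.pow(2).mean()          # hypothetical supervision on the refined output
loss.backward()
```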