cuiaiyu / dressing-in-order

(ICCV'21) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing" by Aiyu Cui, Daniel McKee and Svetlana Lazebnik
https://cuiaiyu.github.io/dressing-in-order

Suspicious flownet warmup logs #8

Closed LeftAttention closed 2 years ago

LeftAttention commented 2 years ago

During the flownet warm-up I am getting the following logs, and I don't understand why all the values are zero.

[flownet_warmup][iter-50]G_GAN_pose: 0.0000, G_GAN_content: 0.0000, D_real_pose: 0.0000, D_fake_pose: 0.0000, D_real_content: 0.0000, D_fake_content: 0.0000, rec: 0.0000, per: 0.0000, sty: 0.0000, seg: 0.0000, flow_reg: 0.0000, flow_cor: 2.5285, 
[flownet_warmup][iter-100]G_GAN_pose: 0.0000, G_GAN_content: 0.0000, D_real_pose: 0.0000, D_fake_pose: 0.0000, D_real_content: 0.0000, D_fake_content: 0.0000, rec: 0.0000, per: 0.0000, sty: 0.0000, seg: 0.0000, flow_reg: 0.0000, flow_cor: 2.5285, 
(... identical lines for iter-150 through iter-1950 omitted: every loss stays at 0.0000 and flow_cor stays at 2.5285 ...)
[flownet_warmup][iter-2000]G_GAN_pose: 0.0000, G_GAN_content: 0.0000, D_real_pose: 0.0000, D_fake_pose: 0.0000, D_real_content: 0.0000, D_fake_content: 0.0000, rec: 0.0000, per: 0.0000, sty: 0.0000, seg: 0.0000, flow_reg: 0.0000, flow_cor: 2.5285, 
saving the latest model (total_iters 2000)
End of iter 2000 / 120000        Time Taken: 1220 sec
at 2000, compute visuals

These are my training parameters.

----------------- Options ---------------
               batch_size: 32                             
                    beta1: 0.5                           
                    beta2: 0.999                         
          checkpoints_dir: checkpoints                   
           continue_train: True                             [default: False]
                crop_size: 256                           
                 dataroot: data/fashion/                    [default: data]
             display_freq: 2000                          
                    epoch: latest                        
              epoch_count: 1                             
             flownet_path:                               
               frozen_enc: False                         
           frozen_flownet: False                         
            frozen_models:                               
                g2d_ratio: 0.1                           
                 gan_mode: lsgan                         
                  gpu_ids: 0,1,2,3                          [default: 0]
                  img_dir: img_highres                   
                init_gain: 0.02                          
                init_type: orthogonal                       [default: kaiming]
                  isTrain: True                             [default: None]
                load_iter: 0                                [default: 0]
             loss_coe_GAN: 1                             
        loss_coe_flow_cor: 2.0                           
        loss_coe_flow_reg: 0.001                         
             loss_coe_per: 0.0                              [default: 0.2]
             loss_coe_rec: 0.0                              [default: 2]
             loss_coe_seg: 0.1                           
             loss_coe_sty: 0.0                              [default: 200]
                       lr: 0.0001                           [default: 0.001]
           lr_decay_iters: 50                            
                lr_policy: linear                        
           lr_update_unit: 10000                         
           max_batch_size: 16                            
                    model: flow                             [default: adgan]
                   n_cpus: 8                                [default: 4]
             n_downsample: 2                             
                 n_epochs: 60000                         
           n_epochs_decay: 60000                         
            n_human_parts: 8                             
                   n_kpts: 18                            
               n_layers_D: 3                             
           n_style_blocks: 4                             
                     name: flownet_warmup                   [default: experiment_name]
                      ndf: 64                            
                     netD: resnet                        
                     netE: adgan                         
                     netG: adgan                         
                      ngf: 64                            
               no_dropout: False                         
            no_trial_test: True                             [default: False]
                norm_type: instance                      
                  perturb: False                         
                    phase: train                         
                pool_size: 50                            
               print_freq: 50                               [default: 100]
              progressive: False                         
              random_rate: 1                             
                relu_type: leakyrelu                     
             save_by_iter: False                         
          save_epoch_freq: 20000                         
         save_latest_freq: 2000                          
             segm_dataset:                               
                   square: False                         
                 style_nc: 64                            
                   suffix:                               
                  tex_dir: dtd/images                    
                  verbose: False                         
                   warmup: False                         
----------------- End -------------------
cuiaiyu commented 2 years ago

It seems the loss is not changing either. The losses should all be zero except for flow_reg and flow_cor during this stage. Did you install the CUDA functions from GFLA correctly? Could you try running it on a single GPU to see if it works?

LeftAttention commented 2 years ago

These are my system specs:

NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4

I have four Tesla K80 GPUs, so I changed Global-Flow-Local-Attention/model/networks/block_extractor/setup.py, Global-Flow-Local-Attention/model/networks/local_attn_reshape/setup.py, and Global-Flow-Local-Attention/model/networks/resample2d_package/setup.py.

I modified nvcc_args as follows:

nvcc_args = [
    '-gencode', 'arch=compute_86,code=sm_86'
    #'-gencode', 'arch=compute_50,code=sm_50',
    #'-gencode', 'arch=compute_52,code=sm_52',
    # '-gencode', 'arch=compute_60,code=sm_60',
    # '-gencode', 'arch=compute_61,code=sm_61',
    # '-gencode', 'arch=compute_70,code=sm_70',
    # '-gencode', 'arch=compute_70,code=compute_70'
]

Please guide me on how to set this up for a K80 GPU with CUDA 11.2. Thanks in advance.
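(For reference: the Tesla K80 is a Kepler GPU with compute capability 3.7, while compute_86/sm_86 targets Ampere, so a kernel built only for sm_86 cannot launch on a K80. A sketch of nvcc_args targeting the K80, assuming the installed CUDA 11.x toolkit still ships sm_37 support, which is deprecated but present through CUDA 11:)

```python
# Sketch: nvcc_args targeting the Tesla K80 (compute capability 3.7).
# sm_86 is an Ampere-only target and will not run on Kepler hardware.
nvcc_args = [
    '-gencode', 'arch=compute_37,code=sm_37',
]
```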

LeftAttention commented 2 years ago

I followed this for the environment setup. These are my installation logs for the Global-Flow-Local-Attention custom layers.

running clean
'build/lib.linux-x86_64-3.6' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.6' does not exist -- can't clean it
running install
running bdist_egg
running egg_info
creating block_extractor_cuda.egg-info
writing block_extractor_cuda.egg-info/PKG-INFO
writing dependency_links to block_extractor_cuda.egg-info/dependency_links.txt
writing top-level names to block_extractor_cuda.egg-info/top_level.txt
writing manifest file 'block_extractor_cuda.egg-info/SOURCES.txt'
reading manifest file 'block_extractor_cuda.egg-info/SOURCES.txt'
writing manifest file 'block_extractor_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'block_extractor_cuda' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c block_extractor_cuda.cc -o build/temp.linux-x86_64-3.6/block_extractor_cuda.o -std=c++11 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=block_extractor_cuda -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from block_extractor_cuda.cc:2:0:
/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]
 #warning \
  ^~~~~~~
/usr/local/cuda/bin/nvcc -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c block_extractor_kernel.cu -o build/temp.linux-x86_64-3.6/block_extractor_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=block_extractor_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
creating build/lib.linux-x86_64-3.6
g++ -pthread -shared -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -L/home/bigthinx/anaconda3/envs/gfla/lib -Wl,-rpath=/home/bigthinx/anaconda3/envs/gfla/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/block_extractor_cuda.o build/temp.linux-x86_64-3.6/block_extractor_kernel.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.6/block_extractor_cuda.cpython-36m-x86_64-linux-gnu.so
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.6/block_extractor_cuda.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating stub loader for block_extractor_cuda.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/block_extractor_cuda.py to block_extractor_cuda.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying block_extractor_cuda.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying block_extractor_cuda.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying block_extractor_cuda.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying block_extractor_cuda.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.block_extractor_cuda.cpython-36: module references __file__
creating dist
creating 'dist/block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg
removing '/home/bigthinx/.local/lib/python3.6/site-packages/block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/bigthinx/.local/lib/python3.6/site-packages/block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg
Extracting block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg to /home/bigthinx/.local/lib/python3.6/site-packages
block-extractor-cuda 0.0.0 is already the active version in easy-install.pth

Installed /home/bigthinx/.local/lib/python3.6/site-packages/block_extractor_cuda-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for block-extractor-cuda==0.0.0
Finished processing dependencies for block-extractor-cuda==0.0.0
running clean
'build/lib.linux-x86_64-3.6' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.6' does not exist -- can't clean it
running install
running bdist_egg
running egg_info
creating local_attn_reshape_cuda.egg-info
writing local_attn_reshape_cuda.egg-info/PKG-INFO
writing dependency_links to local_attn_reshape_cuda.egg-info/dependency_links.txt
writing top-level names to local_attn_reshape_cuda.egg-info/top_level.txt
writing manifest file 'local_attn_reshape_cuda.egg-info/SOURCES.txt'
reading manifest file 'local_attn_reshape_cuda.egg-info/SOURCES.txt'
writing manifest file 'local_attn_reshape_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'local_attn_reshape_cuda' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c local_attn_reshape_cuda.cc -o build/temp.linux-x86_64-3.6/local_attn_reshape_cuda.o -std=c++11 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=local_attn_reshape_cuda -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from local_attn_reshape_cuda.cc:2:0:
/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]
 #warning \
  ^~~~~~~
/usr/local/cuda/bin/nvcc -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c local_attn_reshape_kernel.cu -o build/temp.linux-x86_64-3.6/local_attn_reshape_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=local_attn_reshape_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
creating build/lib.linux-x86_64-3.6
g++ -pthread -shared -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -L/home/bigthinx/anaconda3/envs/gfla/lib -Wl,-rpath=/home/bigthinx/anaconda3/envs/gfla/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/local_attn_reshape_cuda.o build/temp.linux-x86_64-3.6/local_attn_reshape_kernel.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.6/local_attn_reshape_cuda.cpython-36m-x86_64-linux-gnu.so
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.6/local_attn_reshape_cuda.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating stub loader for local_attn_reshape_cuda.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/local_attn_reshape_cuda.py to local_attn_reshape_cuda.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying local_attn_reshape_cuda.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying local_attn_reshape_cuda.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying local_attn_reshape_cuda.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying local_attn_reshape_cuda.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.local_attn_reshape_cuda.cpython-36: module references __file__
creating dist
creating 'dist/local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg
removing '/home/bigthinx/.local/lib/python3.6/site-packages/local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/bigthinx/.local/lib/python3.6/site-packages/local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg
Extracting local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg to /home/bigthinx/.local/lib/python3.6/site-packages
local-attn-reshape-cuda 0.0.0 is already the active version in easy-install.pth

Installed /home/bigthinx/.local/lib/python3.6/site-packages/local_attn_reshape_cuda-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for local-attn-reshape-cuda==0.0.0
Finished processing dependencies for local-attn-reshape-cuda==0.0.0
running clean
'build/lib.linux-x86_64-3.6' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.6' does not exist -- can't clean it
running install
running bdist_egg
running egg_info
creating resample2d_cuda.egg-info
writing resample2d_cuda.egg-info/PKG-INFO
writing dependency_links to resample2d_cuda.egg-info/dependency_links.txt
writing top-level names to resample2d_cuda.egg-info/top_level.txt
writing manifest file 'resample2d_cuda.egg-info/SOURCES.txt'
reading manifest file 'resample2d_cuda.egg-info/SOURCES.txt'
writing manifest file 'resample2d_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'resample2d_cuda' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c resample2d_cuda.cc -o build/temp.linux-x86_64-3.6/resample2d_cuda.o -std=c++11 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from resample2d_cuda.cc:2:0:
/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]
 #warning \
  ^~~~~~~
/usr/local/cuda/bin/nvcc -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/TH -I/home/bigthinx/anaconda3/envs/gfla/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/bigthinx/anaconda3/envs/gfla/include/python3.6m -c resample2d_kernel.cu -o build/temp.linux-x86_64-3.6/resample2d_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
creating build/lib.linux-x86_64-3.6
g++ -pthread -shared -B /home/bigthinx/anaconda3/envs/gfla/compiler_compat -L/home/bigthinx/anaconda3/envs/gfla/lib -Wl,-rpath=/home/bigthinx/anaconda3/envs/gfla/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/resample2d_cuda.o build/temp.linux-x86_64-3.6/resample2d_kernel.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.6/resample2d_cuda.cpython-36m-x86_64-linux-gnu.so
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.6/resample2d_cuda.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating stub loader for resample2d_cuda.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/resample2d_cuda.py to resample2d_cuda.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying resample2d_cuda.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying resample2d_cuda.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying resample2d_cuda.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying resample2d_cuda.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.resample2d_cuda.cpython-36: module references __file__
creating dist
creating 'dist/resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg
removing '/home/bigthinx/.local/lib/python3.6/site-packages/resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/bigthinx/.local/lib/python3.6/site-packages/resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg
Extracting resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg to /home/bigthinx/.local/lib/python3.6/site-packages
resample2d-cuda 0.0.0 is already the active version in easy-install.pth

Installed /home/bigthinx/.local/lib/python3.6/site-packages/resample2d_cuda-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for resample2d-cuda==0.0.0
Finished processing dependencies for resample2d-cuda==0.0.0
cuiaiyu commented 2 years ago

If no device-side bug is triggered, then the GFLA functions are probably installed correctly, although I haven't tried this on a K80 myself. Can you try it on a single GPU with a smaller batch size as a sanity check?

LeftAttention commented 2 years ago

I tried that, but I get the same issue. If you don't mind, would you please share the trained flownet weights?

There are five different weights here, and I am confused about which one I should use. Thanks in advance.

cuiaiyu commented 2 years ago

Please use GFLA's weights for Pose-Guided Person Image Generation (the weights trained on the fashion dataset). Then extract the flownet weights from latest_net_G.pth (the weights of the full GFLA model) with something like:

import torch

# Load the full GFLA checkpoint, keep only the flow_net.* entries,
# and strip the "flow_net." prefix from each key.
weights = torch.load("latest_net_G.pth", map_location="cpu")
flownet_weights = {k.replace("flow_net.", "", 1): weights[k]
                   for k in weights if k.startswith("flow_net.")}
torch.save(flownet_weights, "flownet.pth")

Note GFLA is trained on 256x256 images.
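A quick toy sanity check of the prefix-stripping logic above (the key names here are made up for illustration; the real checkpoint's keys will differ):

```python
# Fake state dict shaped like a full checkpoint: two flow_net entries
# plus one entry from another submodule that should be dropped.
full_weights = {
    "flow_net.conv1.weight": [0.1, 0.2],
    "flow_net.conv1.bias": [0.0],
    "generator.block1.weight": [0.3],  # not part of the flownet
}

# Same filtering/renaming as in the extraction snippet above.
flownet_weights = {
    k.replace("flow_net.", "", 1): v
    for k, v in full_weights.items()
    if k.startswith("flow_net.")
}

print(sorted(flownet_weights))  # ['conv1.bias', 'conv1.weight']
```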

-- Marked as solved since there are no more updates. Will reactivate if there is any follow-up.