InnerPeace-Wu / densecap-tensorflow

Re-implement CVPR2017 paper: "dense captioning with joint inference and visual context" and minor changes in Tensorflow. (mAP 8.296 after 500k iters of training)
MIT License

Training my own model fails with AttributeError: 'NoneType' object has no attribute 'astype' #12

Closed QianyuanLiu closed 6 years ago

QianyuanLiu commented 6 years ago

Hello, I am new to TensorFlow and need some help :) Your tutorial says:

Training the model: add or modify configurations in root/scripts/dense_cap_config.yml; refer to 'lib/config.py' for more configuration details.

$ cd $ROOT
$ bash scripts/dense_cap_train.sh [dataset] [net] [ckpt_to_init] [data_dir] [step]

When I try to train my model by typing:

sudo CUDA_VISIBLE_DEVICES="0" bash scripts/dense_cap_train.sh visual_genome_1.2 res50 resnet_v1_50/res50.ckpt git/visual_genome 1

I get this error:

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ DATASET=visual_genome_1.2
+ NET=res50
+ ckpt_path=resnet_v1_50/res50.ckpt
+ data_dir=git/visual_genome
+ step=0
+ '[' -d /home/joe ']'
+ case $DATASET in
+ TRAIN_IMDB=vg_1.2_train
+ TEST_IMDB=vg_1.2_val
+ PT_DIR=dense_cap
+ FINETUNE_AFTER1=200000
+ FINETUNE_AFTER2=100000
+ ITERS1=400000
+ ITERS2=300000
+ '[' -d /valohai/outputs ']'
++ date +%Y-%m-%d_%H-%M-%S
+ LOG=logs/s0_res50_vg_1.2_train.txt.2018-04-07_22-09-27
+ exec
++ tee -a logs/s0_res50_vg_1.2_train.txt.2018-04-07_22-09-27
+ echo Logging output to logs/s0_res50_vg_1.2_train.txt.2018-04-07_22-09-27
Logging output to logs/s0_res50_vg_1.2_train.txt.2018-04-07_22-09-27
+ '[' 0 -lt 2 ']'
+ python ./tools/train_net.py --weights resnet_v1_50/res50.ckpt --imdb vg_1.2_train --imdbval vg_1.2_val --iters 200000 --cfg scripts/dense_cap_config.yml --data_dir git/visual_genome --net res50 --set EXP_DIR dc_conv_fixed CONTEXT_FUSION False RESNET.FIXED_BLOCKS 3
------ called with args: -------
Namespace(cfg_file='scripts/dense_cap_config.yml', context_fusion=False, data_dir='git/visual_genome', device='gpu', device_id=0, embed_dim=512, imdb_name='vg_1.2_train', imdbval_name='vg_1.2_val', max_iters=200000, net='res50', randomize=False, set_cfgs=['EXP_DIR', 'dc_conv_fixed', 'CONTEXT_FUSION', 'False', 'RESNET.FIXED_BLOCKS', '3'], tag=None, weights='resnet_v1_50/res50.ckpt')
runing with LIMIT_RAM: True
Using config:
{'ALL_TEST': False,
 'ALL_TEST_NUM_TEST': 1000,
 'ALL_TEST_NUM_TRAIN': 100,
 'ALL_TEST_NUM_VAL': 100,
 'ANCHOR_RATIOS': [0.5, 1, 2],
 'ANCHOR_SCALES': [4, 8, 16, 32],
 'CACHE_DIR': '/home/joe/git/visual_genome/1.2',
 'CONTEXT_FUSION': False,
 'CONTEXT_FUSION_MODE': 'sum',
 'CONTEXT_MODE': 'concat',
 'DATA_DIR': 'git/visual_genome',
 'DEBUG_ALL': False,
 'EMBED_DIM': 512,
 'END_INDEX': 2,
 'EXP_DIR': 'dc_conv_fixed',
 'FILTER_SMALL_BOX': False,
 'GLOVE_DIM': 300,
 'GPU_ID': 0,
 'INIT_BY_GLOVE': False,
 'KEEP_AS_GLOVE_DIM': False,
 'LIMIT_RAM': True,
 'LOG_DIR': '/home/XXX/docment/densecap-tensorflow/logs',
 'LOSS': {'BBOX_W': 0.01,
          'CAP_W': 1.0,
          'CLS_W': 0.1,
          'RPN_BBOX_W': 0.05,
          'RPN_CLS_W': 0.1},
 'MAX_WORDS': 10,
 'PIXEL_MEANS': array([[[ 102.9801,  115.9465,  122.7717]]]),
 'POOLING_MODE': 'crop',
 'POOLING_SIZE': 7,
 'RESNET': {'FIXED_BLOCKS': 3, 'MAX_POOL': False},
 'RNG_SEED': 3,
 'ROOT_DIR': '/home/XXX/docment/densecap-tensorflow',
 'RPN_CHANNELS': 512,
 'SAMPLE_NUM_FIXED_REGIONS': False,
 'SPLIT_DIR': '/home/XXX/docment/densecap-tensorflow/info',
 'TEST': {'BBOX_REG': True,
          'BEAM_SIZE': 3,
          'HAS_RPN': True,
          'LN_FACTOR': 0.0,
          'MAX_SIZE': 720,
          'MODE': 'nms',
          'NMS': 0.5,
          'PROPOSAL_METHOD': 'gt',
          'RPN_MIN_SIZE': 16,
          'RPN_NMS_THRESH': 0.6,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'RPN_TOP_N': 5000,
          'SCALES': [600],
          'SVM': False,
          'USE_BEAM_SEARCH': False},
 'TIME_STEPS': 12,
 'TRAIN': {'ASPECT_GROUPING': True,
           'BATCH_SIZE': 256,
           'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
           'BBOX_NORMALIZE_TARGETS': True,
           'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
           'BBOX_REG': True,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'BIAS_DECAY': False,
           'CLIP_NORM': 40.0,
           'DISPLAY': 10,
           'DOUBLE_BIAS': False,
           'EXP_DECAY_RATE': 0.9,
           'EXP_DECAY_STEPS': 5000,
           'FG_FRACTION': 0.5,
           'FG_THRESH': 0.5,
           'GAMMA': 0.5,
           'HAS_RPN': True,
           'IMS_PER_BATCH': 1,
           'LEARNING_RATE': 0.001,
           'LR_DIY_DECAY': True,
           'MAX_SIZE': 720,
           'MOMENTUM': 0.98,
           'OPTIMIZER': 'sgd_m',
           'PROPOSAL_METHOD': 'gt',
           'RPN_BATCHSIZE': 256,
           'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 16,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 12000,
           'SCALES': [600],
           'SNAPSHOT_ITERS': 5000,
           'SNAPSHOT_KEPT': 3,
           'SNAPSHOT_PREFIX': 'res50_densecap',
           'STEPSIZE': [100000],
           'SUMMARY_INTERVAL': 10,
           'USE_FLIPPED': True,
           'USE_GT': False,
           'WEIGHT_DECAY': 0.0001,
           'WEIGHT_INITIALIZER': 'normal'},
 'TRAIN_GLOVE': False,
 'USE_GPU_NMS': True,
 'VOCAB_END_ID': 2,
 'VOCAB_SIZE': 10000,
 'VOCAB_START_ID': 1}
data_path: git/visual_genome/1.2
loading splits from /home/XXX/docment/densecap-tensorflow/info/densecap_splits.json
Number of examples: 77398
train gt roidb could be loaded from git/visual_genome/1.2_cache/train_gt_roidb
Getting gt roidb and number of examples is:154796
output will be saved to `/home/XXX/docment/densecap-tensorflow/output/dc_conv_fixed/vg_1.2_train`
TensorFlow summaries will be saved to `/home/zxf/docment/densecap-tensorflow/output/dc_conv_fixed/tb/vg_1.2_train/default`
data_path: git/visual_genome/1.2
loading splits from /home/XXX/docment/densecap-tensorflow/info/densecap_splits.json
Number of examples: 5000
val gt roidb could be loaded from git/visual_genome/1.2_cache/val_gt_roidb
Getting gt roidb and number of examples is:5000
2018-04-07 22:09:29.615617: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-04-07 22:09:29.765218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX TITAN major: 3 minor: 5 memoryClockRate(GHz): 0.8755
pciBusID: 0000:04:00.0
totalMemory: 5.94GiB freeMemory: 5.86GiB
2018-04-07 22:09:29.765255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX TITAN, pci bus id: 0000:04:00.0, compute capability: 3.5)
Solving...
LIMIT_RAM version and load index from git/visual_genome/1.2_cache/train_gt_roidb/image_index.json
LIMIT_RAM version and load index from git/visual_genome/1.2_cache/val_gt_roidb/image_index.json
Fixing 3 blocks.
Initialize embedding vectors with default initializer.
Shape of embedding is (10003, 512)
learning rate 0.001
/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py:96: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Loading initial model weights from resnet_v1_50/res50.ckpt
Variables restored: resnet_v1_50/conv1/BatchNorm/gamma:0
.....
.....
.....
Variables restored: resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/beta:0
Variables restored: resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/moving_mean:0
Variables restored: resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/moving_variance:0
Loaded.
Fix Resnet V1 layers..
Fixed.
Ckpt path: resnet_v1_50/res50.ckpt
Traceback (most recent call last):
  File "./tools/train_net.py", line 214, in <module>
    main()
  File "./tools/train_net.py", line 210, in main
    max_iters=args.max_iters)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/dense_cap/train.py", line 485, in train_net
    sw.train_model(sess, max_iters)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/dense_cap/train.py", line 356, in train_model
    blobs = self.data_layer.forward()
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/layer.py", line 99, in forward
    blobs = self._get_next_minibatch()
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/layer.py", line 95, in _get_next_minibatch
    return get_minibatch(minibatch_db)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/minibatch.py", line 41, in get_minibatch
    im_blob, im_scales, roidb = _get_image_blob(roidb, random_scale_inds)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/minibatch.py", line 101, in _get_image_blob
    cfg.TRAIN.MAX_SIZE)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/utils/blob.py", line 37, in prep_im_for_blob
    im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'

Here is my system info:

Ubuntu 16.04, GeForce GTX TITAN, RAM: 8 GB, CUDA 8.0, cuDNN 6.0, TF 1.4

I reviewed the code and think the net was not fed any data. I traced the issue to this file: /lib/utils/blob.py

import cv2
import numpy as np


def prep_im_for_blob(im, pixel_means, target_size, max_size):
    """Mean subtract and scale an image for use in a blob."""
    # im is None here when cv2.imread could not read the file.
    im = im.astype(np.float32, copy=False)
    im -= pixel_means
    im_shape = im.shape
    im_size_min = np.min(im_shape[0:2])
    im_size_max = np.max(im_shape[0:2])
    im_scale = float(target_size) / float(im_size_min)
    # Prevent the biggest axis from being more than MAX_SIZE
    if np.round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                    interpolation=cv2.INTER_LINEAR)
    return im, im_scale

So I guess the net read nothing from the roidb. How do I fix it?

PS: the tree of my workspace:

├── data
│   ├── ckpt_ori
│   ├── imagenet_weights
│   ├── resnet_v1_50_2016_08_28.tar.gz
│   ├── resnet_v2_101_2017_04_14.tar.gz
│   └── resnet_v2_50_2017_04_14.tar.gz
├── git
│   └── visual_genome
├── git_0
│   └── visual_genome
├── info
│   ├── densecap_splits.json
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── read_regions.py
│   ├── read_splits.py
│   ├── read_splits.pyc
│   ├── test.txt
│   ├── train.txt
│   └── val.txt
├── __init__.py
├── lib
│   ├── config.py
│   ├── config.pyc
│   ├── datasets
│   ├── dense_cap
│   ├── download_data_vh.sh
│   ├── fast_rcnn
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── layers
│   ├── limit_ram
│   ├── Makefile
│   ├── nets
│   ├── nms
│   ├── pre_glove.py
│   ├── preprocess.py
│   ├── preprocess.sh
│   ├── pycocoevalcap
│   ├── setup.py
│   └── utils
├── LICENSE
├── logs
│   ├── densecap.png
│   ├── funny.png
│   ├── s0_res50_vg_1.2_train.txt.2018-04-07_16-02-00
├── Note.md
├── output
│   ├── dc_context
│   ├── dc_conv_fixed
│   ├── dc_tune_context
│   └── dc_tune_conv
├── README.md
├── requirements.txt
├── res50
│   └── res50.ckpt
├── scripts
│   ├── dense_cap_config.yml
│   ├── dense_cap_demo.sh
│   ├── dense_cap_test.sh
│   ├── dense_cap_train.sh
│   └── old_dense_cap_train.sh
├── tests
│   ├── architecture_test.py
│   ├── bash_log_test
│   ├── ckpt_restore_test.py
│   ├── dencap_oa_test.sh
│   ├── __init__.py
│   ├── logs
│   ├── pickle_read_test.py
│   ├── README.md
│   ├── read_regions_json
│   ├── roidata_test.py
│   ├── sentence_data_layer_test.py
│   └── vh_train_command.sh
├── tools
│   ├── demo.py
│   ├── _init_paths.py
│   ├── _init_paths.pyc
│   ├── __init__.py
│   ├── test_net.py
│   └── train_net.py
├── valohai.yaml
├── VG
│   ├── 1.2
│   └── images
└── vis
    ├── d3.min.js
    ├── jquery-1.8.3.min.js
    ├── README.md
    ├── style.css
    ├── utils.js
    └── view_results.html

Any help will be appreciated. Thanks :)

InnerPeace-Wu commented 6 years ago

Hi. First, make sure that you finished the data preparation steps properly. There may be something wrong with your roidb. To debug further, print the value of roidb[i]['image'] before the line that reads the image and check whether the path is valid. One simple solution may be to delete the 1.2_cache directory and regenerate the roidb. Hope it helps.
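The path check suggested above could be sketched roughly as follows. This is only an illustration: `find_missing_images` is a hypothetical helper, not part of the repository, and it assumes the roidb is an iterable of dict-like entries with an 'image' key, as the roidb[i]['image'] access suggests.

```python
import os


def find_missing_images(roidb, limit=5):
    """Return up to `limit` (index, path) pairs whose image file does not exist.

    Hypothetical helper; assumes each roidb entry is a dict with an 'image' key.
    """
    missing = []
    for i, entry in enumerate(roidb):
        path = entry['image']
        # Relative paths are checked against the current working directory.
        if not os.path.isfile(path):
            missing.append((i, path))
            if len(missing) >= limit:
                break
    return missing
```

Because relative paths are resolved against the current working directory, a check like this is only meaningful when run from the same directory the training script is launched from.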

QianyuanLiu commented 6 years ago

@InnerPeace-Wu Thank you very much for your reply!

1) I rechecked the folder /home/XXX/docment/densecap-tensorflow/git/visual_genome/1.2_cache. There are 154797 files in train_gt_roidb and 5001 files in val_gt_roidb:

└── visual_genome
    ├── 1.2
    │   ├── test_gt_regions
    │   ├── train_gt_regions
    │   ├── val_gt_regions
    │   └── vocabulary.txt
    └── 1.2_cache
        ├── train_gt_roidb
        └── val_gt_roidb

2) I printed the image loaded from roidb[0]['image'] and got this:

*************************************************************************
im = cv2.imread(roidb[0]['image'])
None

3) I deleted the old cache and regenerated the roidb:

XXX@XXX-PC:~/docment/densecap-tensorflow/info$ python read_regions.py --version 1.2 --vg_path ../VG
XXX@XXX-PC:~/docment/densecap-tensorflow/info$ cd ../lib/
XXX@XXX-PC:~/docment/densecap-tensorflow/lib$ python preprocess.py --version 1.2 --path ../VG  --output_dir ../git/visual_genome --max_words 10 --limit_ram 
split image number: 77398 for split name: train
start loading json files...
0.590941 seconds for loading
train: 100%|███████████████████████████| 108077/108077 [03:08<00:00, 573.80it/s]
processing train set with time: 188.39 seconds
there are 272 invalid bboxes out of 3684063
there are 3 empty phrases after triming
Found 56945 unique word tokens.
Using vocabulary size 10000.
The least frequent word in our vocabulary is 'ruff' and appeared 14 times.
Dumping vocabulary to file: ../git/visual_genome/1.2/vocabulary.txt
Done.
split image number: 5000 for split name: val
start loading json files...
0.452126 seconds for loading
val: 100%|████████████████████████████| 108077/108077 [00:18<00:00, 5692.37it/s]
processing val set with time: 18.99 seconds
there are 14 invalid bboxes out of 237362
there are 0 empty phrases after triming
split image number: 5000 for split name: test
start loading json files...
0.303303 seconds for loading
test: 100%|███████████████████████████| 108077/108077 [00:22<00:00, 4841.91it/s]
processing test set with time: 22.32 seconds
there are 17 invalid bboxes out of 238069
there are 0 empty phrases after triming

When I try to train by typing:

sudo CUDA_VISIBLE_DEVICES="0" bash scripts/dense_cap_train.sh visual_genome_1.2 res50 resnet_v1_50/res50.ckpt git/visual_genome 1

the same error and the same messages are output:

Ckpt path: resnet_v1_50/res50.ckpt
Traceback (most recent call last):
  File "./tools/train_net.py", line 214, in <module>
    main()
  File "./tools/train_net.py", line 210, in main
    max_iters=args.max_iters)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/dense_cap/train.py", line 485, in train_net
    sw.train_model(sess, max_iters)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/dense_cap/train.py", line 356, in train_model
    blobs = self.data_layer.forward()
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/layer.py", line 99, in forward
    blobs = self._get_next_minibatch()
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/layer.py", line 95, in _get_next_minibatch
    return get_minibatch(minibatch_db)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/minibatch.py", line 41, in get_minibatch
    im_blob, im_scales, roidb = _get_image_blob(roidb, random_scale_inds)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/fast_rcnn/minibatch.py", line 101, in _get_image_blob
    cfg.TRAIN.MAX_SIZE)
  File "/home/XXX/docment/densecap-tensorflow/tools/../lib/utils/blob.py", line 37, in prep_im_for_blob
    im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'

Is there anything wrong with my command? Another question: should I add or modify configurations in root/scripts/dense_cap_config.yml? Waiting for any help :( Thanks!

InnerPeace-Wu commented 6 years ago

Please add the code

print(roidb[0]['image'])

before

im = cv2.imread(roidb[0]['image'])

to see whether the path is valid.
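For reference, cv2.imread does not raise when a file cannot be read; it returns None, which is why the failure only surfaces later at im.astype. A minimal sketch of a loader that fails early with the offending path might look like this. `read_image_checked` is a hypothetical name, and the `imread` parameter exists only so the sketch can be exercised without OpenCV installed.

```python
import os


def read_image_checked(path, imread=None):
    """Read an image and raise a descriptive error instead of returning None.

    Hypothetical helper. `imread` defaults to cv2.imread.
    """
    if imread is None:
        import cv2  # imported lazily; only needed for the default reader
        imread = cv2.imread
    im = imread(path)
    if im is None:
        # Report both the stored path and what it resolves to from here.
        raise IOError('could not read image %r (resolved to %r)'
                      % (path, os.path.abspath(path)))
    return im
```

Replacing the plain cv2.imread call with something like this would put the stored path and its resolved absolute form directly in the error message.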

QianyuanLiu commented 6 years ago

@InnerPeace-Wu Hello, I added the code print(roidb[0]['image']) here:

    for i in xrange(num_images):
        print(roidb[0]['image'])  # added
        im = cv2.imread(roidb[i]['image'])
        if roidb[i]['flipped']:
            im = im[:, ::-1, :]
        target_size = cfg.TRAIN.SCALES[scale_inds[i]]
        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                                        cfg.TRAIN.MAX_SIZE)
        im_scales.append(im_scale)
        processed_ims.append(im)

and got output like this: ../VG/images/2409215.jpg

and the same error: AttributeError: 'NoneType' object has no attribute 'astype'

PS: I also added print(im) right after im = cv2.imread(roidb[0]['image']). It prints this:

../VG/images/2409215.jpg
<type 'NoneType'>
None

So, did cv2.imread fail?

InnerPeace-Wu commented 6 years ago

The problem seems to be your image path: the roidb stores a relative path, ../VG/images/2409215.jpg (I bet the image id 2409215 is valid, isn't it?). A relative path is resolved against the directory you launch training from, not the one you ran preprocess.py from, so cv2.imread cannot find the file and returns None. You should use an absolute path like /home/user/git/VG/images/id.jpg instead. One way to solve this is to preprocess the data again: first, delete the 1.2_cache directory; second, preprocess the data with:

$ cd $ROOT/lib
$ python preprocess.py --version [version] --path [raw_data_path] \
        --output_dir [dir] --max_words [max_len] --limit_ram

Note: [raw_data_path] should be the absolute path to VG, e.g. /home/user/git/VG. You can also refer to my example code.

In your case, you can quickly work around the issue by converting the relative path to an absolute one, changing the line to:

path = '/home/XXX/docment/densecap-tensorflow' + roidb[i]['image'][2:]
im = cv2.imread(path)
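A slightly more general variant of this workaround, as a sketch only: rather than slicing off the leading "..", the stored path can be resolved against the directory preprocess.py was run from. `to_absolute` and its `base_dir` argument are hypothetical names, not part of the repository.

```python
import os


def to_absolute(image_path, base_dir):
    """Resolve `image_path` against `base_dir` unless it is already absolute.

    Hypothetical helper; `base_dir` is the directory preprocessing ran from.
    """
    if os.path.isabs(image_path):
        return image_path
    # normpath collapses the '..' component after joining.
    return os.path.normpath(os.path.join(base_dir, image_path))
```

With base_dir set to /home/XXX/docment/densecap-tensorflow/lib, the path ../VG/images/2409215.jpg resolves to /home/XXX/docment/densecap-tensorflow/VG/images/2409215.jpg, the same result as the one-line fix above.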

Best,

QianyuanLiu commented 6 years ago

Congratulations! It works! Thank you! 👍 👍 👍