pierluigiferrari / ssd_keras

A Keras port of Single Shot MultiBox Detector
Apache License 2.0

"list index out of range" loss function issue when running with tf.keras #348

Closed: opened by jessicametzger, closed 4 years ago

jessicametzger commented 4 years ago

First of all, thank you @pierluigiferrari for the great tutorials. I am attempting to train a model with ssd_keras using the tf.keras version of Keras (the machine I'm using doesn't have standalone Keras installed). So far I have succeeded in subsampling the weights for one class (following the weight sampling tutorial), creating training and validation data generator objects, and building an SSD300 model (following the ssd300_training tutorial). The fitting routine worked with standalone Keras (not tf.keras) on a different machine. However, when I attempt to run the model fitting with tf.keras (after making a few changes, see below), I get the following error message:

Traceback (most recent call last):
  File "weight_subsampling_routine_sample.py", line 376, in <module>
    initial_epoch=initial_epoch)
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1297, in fit_generator
    steps_name='steps_per_epoch')
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 973, in train_on_batch
    class_weight=class_weight, reset_metrics=reset_metrics)
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 264, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 311, in train_on_batch
    output_loss_metrics=output_loss_metrics))
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 252, in _process_single_batch
    training=training))
  File "/software/Anaconda3-2019.03-el7-x86_64/envs/msca_gpu_tf2_env/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 166, in _model_loss
    per_sample_losses = loss_fn.call(targets[i], outs[i])
IndexError: list index out of range
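
Reading the traceback, the IndexError is raised inside tf.keras's eager training path (training_eager.py, _model_loss), which looks up targets[i] for every model output, so the targets list tf.keras built from the generator batch is evidently shorter than the list of model outputs. One thing sometimes tried with custom loss objects under tf.keras (a sketch of my own, not confirmed to resolve this particular issue) is to route compute_loss through a tf.keras.losses.Loss subclass instead of passing the bound method to compile():

from tensorflow.keras.losses import Loss

# Hypothetical wrapper (a sketch, not from the ssd_keras tutorials): exposes
# SSDLoss.compute_loss through tf.keras's Loss base class so the loss goes
# through tf.keras's standard loss machinery.
class SSDLossWrapper(Loss):
    def __init__(self, neg_pos_ratio=3, alpha=1.0, **kwargs):
        super().__init__(**kwargs)
        self.ssd_loss = SSDLoss(neg_pos_ratio=neg_pos_ratio, alpha=alpha)

    def call(self, y_true, y_pred):
        return self.ssd_loss.compute_loss(y_true, y_pred)

# Usage: model.compile(optimizer=adam, loss=SSDLossWrapper())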

I have:

There were a few other warnings and the full error message can be found here: full_error_message.txt.

Code

Here is the code I am running, after subsampling the weights, importing all packages, etc., as done in the tutorials.

K.clear_session() # Clear previous models from memory.

model = ssd_300(image_size=(img_height, img_width, img_channels),
                n_classes=n_classes,
                mode='training',
                l2_regularization=0.0005,
                scales=scales,
                aspect_ratios_per_layer=aspect_ratios,
                two_boxes_for_ar1=two_boxes_for_ar1,
                steps=steps,
                offsets=offsets,
                clip_boxes=clip_boxes,
                variances=variances,
                normalize_coords=normalize_coords,
                subtract_mean=mean_color,
                swap_channels=swap_channels)

# 2: Load the sub-sampled weights into the model.
weights_path = weights_destination_path
model.load_weights(weights_path, by_name=True)

# compile model
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)
model.compile(optimizer=adam, loss=ssd_loss.compute_loss)
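# Note (added): under tf.keras 2.x, `lr` and `decay` above are deprecated in
# favor of `learning_rate` and a learning-rate schedule; they still work here
# but emit deprecation warnings.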

# CREATE DATA GENERATORS

train_dataset = DataGenerator(load_images_into_memory=False, hdf5_dataset_path=None)
val_dataset = DataGenerator(load_images_into_memory=False, hdf5_dataset_path=None)

# The directories that contain the images/annotations
data_dir = parent_dir+'/data/ssd_keras_data/'
train_image_dir = data_dir+'train_data/'
val_image_dir = data_dir+'val_data/'
train_annotations_file = train_image_dir+'train_annotations.csv'
val_annotations_file = val_image_dir+'val_annotations.csv'

# I only have 1 positive class, so I re-index the class label for the data generator and input encoder,
#  which is what I assume we're supposed to do when subsampling.
classes = ['background', 'sports_ball']
class_ids = [0,1]

train_dataset.parse_csv(train_image_dir, train_annotations_file,
                        ['image_name','xmin','xmax','ymin','ymax','class_id'],
                        include_classes=class_ids, ret=False, verbose=True)
val_dataset.parse_csv(val_image_dir, val_annotations_file,
                      ['image_name','xmin','xmax','ymin','ymax','class_id'],
                      include_classes=class_ids, ret=False, verbose=True)

train_dataset_size = train_dataset.get_dataset_size()
val_dataset_size   = val_dataset.get_dataset_size()

ssd_data_augmentation = SSDDataAugmentation(img_height=img_height,
                                            img_width=img_width,
                                            background=mean_color)
convert_to_3_channels = ConvertTo3Channels()
resize = Resize(height=img_height, width=img_width)
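
# The spatial dimensions (height, width) of the predictor layers' outputs;
# the SSDInputEncoder needs these to construct the matching anchor box grid.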
predictor_sizes = [model.get_layer('conv4_3_norm_mbox_conf').output_shape[1:3],
                   model.get_layer('fc7_mbox_conf').output_shape[1:3],
                   model.get_layer('conv6_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv7_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv8_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv9_2_mbox_conf').output_shape[1:3]]

ssd_input_encoder = SSDInputEncoder(img_height=img_height,
                                    img_width=img_width,
                                    n_classes=n_classes,
                                    predictor_sizes=predictor_sizes,
                                    scales=scales,
                                    aspect_ratios_per_layer=aspect_ratios,
                                    two_boxes_for_ar1=two_boxes_for_ar1,
                                    steps=steps,
                                    offsets=offsets,
                                    clip_boxes=clip_boxes,
                                    variances=variances,
                                    matching_type='multi',
                                    pos_iou_threshold=0.5,
                                    neg_iou_limit=0.5,
                                    normalize_coords=normalize_coords)

# 6: Create the generator handles that will be passed to Keras' `fit_generator()` function.
train_generator = train_dataset.generate(batch_size=batch_size,
                                         shuffle=True,
                                         transformations=[ssd_data_augmentation],
                                         label_encoder=ssd_input_encoder,
                                         returns={'processed_images',
                                                  'encoded_labels'},
                                         keep_images_without_gt=False)
val_generator = val_dataset.generate(batch_size=batch_size,
                                     shuffle=False,
                                     transformations=[convert_to_3_channels,
                                                      resize],
                                     label_encoder=ssd_input_encoder,
                                     returns={'processed_images',
                                              'encoded_labels'},
                                     keep_images_without_gt=False)

history = model.fit_generator(generator=train_generator,
                              steps_per_epoch=steps_per_epoch,
                              epochs=final_epoch,
                              validation_data=val_generator,
                              validation_steps=np.ceil(val_dataset_size/batch_size),
                              initial_epoch=initial_epoch)
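
Since the error happens when tf.keras lines up the generator's output with the model's targets, a quick sanity check (my own debugging sketch, not part of the tutorials) is to draw one batch from the generator and confirm it unpacks into a (batch_X, batch_y) pair with the expected shapes before calling fit_generator():

# Debug-only check (consumes one batch): the generator should yield a
# 2-element batch that unpacks into images and encoded labels.
batch_X, batch_y = next(train_generator)
print(type(batch_X), np.asarray(batch_X).shape)  # e.g. (batch_size, 300, 300, 3)
print(type(batch_y), np.asarray(batch_y).shape)  # e.g. (batch_size, n_boxes, n_classes + 1 + 12)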

Changes made to the ssd_keras package

Before getting this error, I had made the following minor changes:

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
