MaybeShewill-CV / bisenetv2-tensorflow

Unofficial tensorflow implementation of real-time scene image segmentation model "BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation"
https://maybeshewill-cv.github.io/bisenetv2-tensorflow/
MIT License

Error while training of proprietary data #22

Closed hona-p closed 4 years ago

hona-p commented 4 years ago

I changed the dataset to my own proprietary data (1920×1080 pixels) and trained with the command

CUDA_VISIBLE_DEVICES="0, 1, 2, 3" python tools/cityscapes/train_bisenetv2_cityscapes.py

The program worked fine with the Cityscapes dataset (2048×1024 pixels). However, when I used my own data (1920×1080 pixels), I got the following error. The TRAIN_CROP_SIZE and EVAL_CROP_SIZE in cityscapes_bisenetv2.yaml were also changed accordingly. Could you please tell me where the problem is?

Error: InvalidArgumentError (see above for traceback): slice index 262144 of dimension 0 out of bounds.

A similar question was raised in issue #19, but I could not work out what exactly to do from the explanation there.

[screenshot of the error]
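For context on what this error usually means here: OHEM-style losses sort the per-pixel losses and index into the sorted vector at `min_sample_nums`, so the slice fails when the label tensor has fewer pixels than that index. A toy numpy reproduction of the failure mode (the function and variable names are illustrative, not the repo's actual code):

```python
import numpy as np

def ohem_top_k_threshold(pixel_losses, min_sample_nums):
    # Sort per-pixel losses from hardest to easiest, then read the loss at
    # position min_sample_nums -- this indexing is where an out-of-bounds
    # slice occurs when min_sample_nums >= pixel_losses.size.
    sorted_losses = np.sort(pixel_losses.ravel())[::-1]
    return sorted_losses[min_sample_nums]

# Works: a 1024x2048 label map has 2097152 pixels, well above 262144.
ohem_top_k_threshold(np.random.rand(1024, 2048), 262144)

# Fails analogously to the error above (IndexError in numpy,
# InvalidArgumentError in TensorFlow) when the pixel count is too small.
try:
    ohem_top_k_threshold(np.random.rand(256, 256), 262144)
except IndexError:
    print("slice index 262144 out of bounds")
```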

MaybeShewill-CV commented 4 years ago

@yakubota Reducing ohem min_sample_nums to 259200 will fix this.

hona-p commented 4 years ago

Thank you for the reply. I changed min_sample_nums to 259200, but I got the following error, and training will not run. Where should I fix this?

[screenshot of the error]

MaybeShewill-CV commented 4 years ago

@yakubota I do not know the exact input tensor size of your dataset. If your input label's shape is (height, width), you may set min_sample_nums to height × width × 0.25.

hona-p commented 4 years ago

The images used for training are 1080 pixels high and 1920 pixels wide. For Cityscapes, with a resolution of 2097152 pixels (height 1024 × width 2048), MIN_SAMPLE_NUM was 262144, so I interpreted MIN_SAMPLE_NUM as the pixel count divided by 8. Since the resolution of my images is 2073600 pixels, I entered 259200 for MIN_SAMPLE_NUM, but I still get the above error.
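The arithmetic described above can be checked directly. Note that the concrete numbers quoted in this thread (262144 for 2048×1024, 259200 for 1920×1080) correspond to dividing the pixel count by 8, whereas the earlier comment suggested multiplying by 0.25; the sketch below uses the divide-by-8 reading since that matches both quoted values:

```python
# Sanity check of the MIN_SAMPLE_NUM arithmetic discussed in this thread.
def min_sample_num(height, width, divisor=8):
    # Both values quoted in the thread equal the pixel count divided by 8.
    return (height * width) // divisor

print(min_sample_num(1024, 2048))  # 262144, the Cityscapes default
print(min_sample_num(1080, 1920))  # 259200, the value suggested for 1920x1080
```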

MaybeShewill-CV commented 4 years ago

@yakubota
1. The input tensor will be downsampled: https://github.com/MaybeShewill-CV/bisenetv2-tensorflow/blob/fb795a29a72d00c71c341d97a7cacebcc990b390/local_utils/augment_utils/cityscapes/augmentation_tf_utils.py#L464-L478
2. Please check whether your tfrecords were generated correctly. The tfrecords you generated may have a wrong tensor shape.

hona-p commented 4 years ago

Where can I see the shape of the generated tfrecords tensor?

MaybeShewill-CV commented 4 years ago

@yakubota Decode the tfrecords and print the image shape. As for the decode method, you may google how to read images from a TensorFlow records file :)
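One hedged sketch of such a decode step (the feature keys in the records depend on the repo's tfrecords writer script, so this just prints whatever keys each record contains rather than assuming names; decode the image feature accordingly to verify its height and width):

```python
import tensorflow as tf

# TF2 removed tf.python_io; fall back to the compat alias so this sketch
# runs under either major version.
try:
    record_iterator = tf.python_io.tf_record_iterator
except AttributeError:
    record_iterator = tf.compat.v1.python_io.tf_record_iterator

def inspect_tfrecords(path, max_records=3):
    """Print the feature keys of the first few records in a tfrecords file."""
    all_keys = []
    for i, serialized in enumerate(record_iterator(path)):
        if i >= max_records:
            break
        # Parse the serialized protobuf and list its feature keys.
        example = tf.train.Example.FromString(serialized)
        keys = sorted(example.features.feature.keys())
        all_keys.append(keys)
        print("record {}: feature keys = {}".format(i, keys))
    return all_keys
```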

hona-p commented 4 years ago

When I ran the following script, it output the expected number of records. I will check the "... _gtFine_labelTrainIds.png" label files to investigate whether there is a problem with how they were created.


OUTPUT_TFRECORD_NAME = "cityscapes_val.tfrecords"
cnt = len(list(tf.python_io.tf_record_iterator(OUTPUT_TFRECORD_NAME)))
print("number of data: {}".format(cnt))

OUTPUT_TFRECORD_NAME = "cityscapes_train.tfrecords"
cnt = len(list(tf.python_io.tf_record_iterator(OUTPUT_TFRECORD_NAME)))
print("number of data: {}".format(cnt))