Closed: jmsteitz closed this issue 4 years ago
Hi @jmsteitz , thanks for the heads up. Let me look into this. On which day did you clone the repo? I merged a few PRs since initial release.
I cloned it two days ago, so I should have the latest version.
OK, that's good. Let me run the script on my local machine and see if I can reproduce it. Can you give me 1-2 days to get back to you?
Did you save the output log (stdout) as the download scripts executed? Were there any earlier errors?
Could you also paste the exact commands you ran from the download_scripts readme? Did you upload the Cityscapes, BDD, and WildDash files to your machine after a Chrome download?
I wasn't able to reproduce this error. My guess is that the directory structure may not be correct on your machine. Could you show me a graphical representation of the directories you've configured (as done here), and let me know what you've named MSEG_DST_DIR?
I did notice that cityscapes-34-relabeled wasn't being written out to disk, which halted verification on my machine, so I added a fix for that. But we don't use that for training, so the download, extraction, remapping, and relabeling scripts worked for me.
MSEG_DST_DIR is set to '/fastdata/jmsteitz' on my machine. The folder structure looks like this right now:
mseg_dataset/ADE20K
mseg_dataset/ADE20K/ADE20K_2016_07_26
mseg_dataset/ADE20K/ADE20K_2016_07_26/images
mseg_dataset/ADE20K/ADEChallengeData2016
mseg_dataset/ADE20K/ADEChallengeData2016/annotations
mseg_dataset/ADE20K/ADEChallengeData2016/annotations_semseg150
mseg_dataset/ADE20K/ADEChallengeData2016/annotations_semseg150_relabeled
mseg_dataset/ADE20K/ADEChallengeData2016/images
mseg_dataset/BDD
mseg_dataset/BDD/bdd100k
mseg_dataset/BDD/bdd100k/seg
mseg_dataset/BDD/bdd100k/seg_relabeled
mseg_dataset/Camvid
mseg_dataset/Camvid/701_StillsRaw_full
mseg_dataset/Camvid/Labels32-RGB
mseg_dataset/Camvid/semseg11
mseg_dataset/Cityscapes
mseg_dataset/Cityscapes/gtFine
mseg_dataset/Cityscapes/gtFine_19cls
mseg_dataset/Cityscapes/gtFine_19cls_relabeled
mseg_dataset/Cityscapes/gtFine_19cls_relabeled/train
mseg_dataset/Cityscapes/gtFine_19cls_relabeled/val
mseg_dataset/Cityscapes/gtFine_19cls/train
mseg_dataset/Cityscapes/gtFine_19cls/val
mseg_dataset/Cityscapes/gtFine/test
mseg_dataset/Cityscapes/gtFine/train
mseg_dataset/Cityscapes/gtFine/val
mseg_dataset/Cityscapes/leftImg8bit
mseg_dataset/Cityscapes/leftImg8bit/test
mseg_dataset/Cityscapes/leftImg8bit/train
mseg_dataset/Cityscapes/leftImg8bit/val
mseg_dataset/COCOPanoptic
mseg_dataset/COCOPanoptic/annotations
mseg_dataset/COCOPanoptic/annotations/__MACOSX
mseg_dataset/COCOPanoptic/annotations/panoptic_train2017
mseg_dataset/COCOPanoptic/annotations/panoptic_val2017
mseg_dataset/COCOPanoptic/images
mseg_dataset/COCOPanoptic/images/train2017
mseg_dataset/COCOPanoptic/images/val2017
mseg_dataset/COCOPanoptic/semantic_annotations133
mseg_dataset/COCOPanoptic/semantic_annotations133/train2017
mseg_dataset/COCOPanoptic/semantic_annotations133/val2017
mseg_dataset/COCOPanoptic/semantic_annotations201
mseg_dataset/COCOPanoptic/semantic_annotations201/train2017
mseg_dataset/COCOPanoptic/semantic_annotations201/val2017
mseg_dataset/COCOPanoptic/semantic_relabeled133
mseg_dataset/COCOPanoptic/semantic_relabeled133/train2017
mseg_dataset/COCOPanoptic/semantic_relabeled133/val2017
mseg_dataset/IDD
mseg_dataset/IDD/IDD_Segmentation
mseg_dataset/IDD/IDD_Segmentation/gtFine
mseg_dataset/IDD/IDD_Segmentation/gtFine39
mseg_dataset/IDD/IDD_Segmentation/gtFine39_relabeled
mseg_dataset/IDD/IDD_Segmentation/leftImg8bit
mseg_dataset/KITTI
mseg_dataset/KITTI/testing
mseg_dataset/KITTI/testing/image_2
mseg_dataset/KITTI/training
mseg_dataset/KITTI/training/image_2
mseg_dataset/KITTI/training/instance
mseg_dataset/KITTI/training/label19
mseg_dataset/KITTI/training/semantic
mseg_dataset/KITTI/training/semantic_rgb
mseg_dataset/MapillaryVistasPublic
mseg_dataset/MapillaryVistasPublic/testing
mseg_dataset/MapillaryVistasPublic/testing/images
mseg_dataset/MapillaryVistasPublic/testing/instances
mseg_dataset/MapillaryVistasPublic/testing/labels
mseg_dataset/MapillaryVistasPublic/testing/panoptic
mseg_dataset/MapillaryVistasPublic/training
mseg_dataset/MapillaryVistasPublic/training/images
mseg_dataset/MapillaryVistasPublic/training/instances
mseg_dataset/MapillaryVistasPublic/training/labels
mseg_dataset/MapillaryVistasPublic/training/panoptic
mseg_dataset/MapillaryVistasPublic/training_semseg65
mseg_dataset/MapillaryVistasPublic/training_semseg65/labels
mseg_dataset/MapillaryVistasPublic/training_semseg65_relabeled
mseg_dataset/MapillaryVistasPublic/training_semseg65_relabeled/labels
mseg_dataset/MapillaryVistasPublic/validation
mseg_dataset/MapillaryVistasPublic/validation/images
mseg_dataset/MapillaryVistasPublic/validation/instances
mseg_dataset/MapillaryVistasPublic/validation/labels
mseg_dataset/MapillaryVistasPublic/validation/panoptic
mseg_dataset/MapillaryVistasPublic/validation_semseg65
mseg_dataset/MapillaryVistasPublic/validation_semseg65/labels
mseg_dataset/MapillaryVistasPublic/validation_semseg65_relabeled
mseg_dataset/MapillaryVistasPublic/validation_semseg65_relabeled/labels
mseg_dataset/PASCAL_Context
mseg_dataset/PASCAL_Context/JPEGImages
mseg_dataset/PASCAL_Context/Segmentation_GT_460cls
mseg_dataset/PASCAL_Context/Segmentation_GT_60cls
mseg_dataset/PASCAL_VOC_2012
mseg_dataset/PASCAL_VOC_2012/JPEGImages
mseg_dataset/PASCAL_VOC_2012/SegmentationClassAug
mseg_dataset/ScanNet
mseg_dataset/ScanNet/scannet_frames_25k
mseg_dataset/SUNRGBD
mseg_dataset/SUNRGBD/image
mseg_dataset/SUNRGBD/image/test
mseg_dataset/SUNRGBD/image/train
mseg_dataset/SUNRGBD/label13
mseg_dataset/SUNRGBD/label13/test
mseg_dataset/SUNRGBD/label13/train
mseg_dataset/SUNRGBD/label38
mseg_dataset/SUNRGBD/label38/test
mseg_dataset/SUNRGBD/label38/train
mseg_dataset/SUNRGBD/semseg-label37
mseg_dataset/SUNRGBD/semseg-label37/test
mseg_dataset/SUNRGBD/semseg-label37/train
mseg_dataset/SUNRGBD/semseg-relabeled37
mseg_dataset/SUNRGBD/semseg-relabeled37/test
mseg_dataset/SUNRGBD/semseg-relabeled37/train
mseg_dataset/WildDash
mseg_dataset/WildDash/anonymized
mseg_dataset/WildDash/wd_both_01
mseg_dataset/WildDash/wd_val_01
mseg_dataset/WildDash/wd_val_19class
Cityscapes was already available on our file server, but I indeed downloaded WildDash and BDD with Chrome on macOS and then uploaded them to my machine. But they still seemed to be handled fine by their scripts.
I'm afraid I don't have any output logs, but I don't remember any errors either.
Hi @jmsteitz, those directories all look correct, which is good. I may add output logs directly to the instructions for future users.
Since ade20k-151 seems to be throwing issues for you, can we check some paths in there? for example, do these files exist in your directory?
mseg_dataset/ADE20K/ADEChallengeData2016/images/training/ADE_train_00000001.jpg
mseg_dataset/ADE20K/ADEChallengeData2016/annotations/training/ADE_train_00000001.png
Could you let me know the exact filepath it's breaking on for you? This part of the script ran fine for me.
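Not the project's code, just a quick hypothetical snippet you could paste into a Python shell to check both paths at once (substitute your own MSEG_DST_DIR value for the placeholder):

```python
from pathlib import Path

MSEG_DST_DIR = Path("/path/to/your/MSEG_DST_DIR")  # placeholder; use your own setting

for rel in [
    "mseg_dataset/ADE20K/ADEChallengeData2016/images/training/ADE_train_00000001.jpg",
    "mseg_dataset/ADE20K/ADEChallengeData2016/annotations/training/ADE_train_00000001.png",
]:
    fpath = MSEG_DST_DIR / rel
    # Report each expected file as present or missing
    print(fpath, "->", "exists" if fpath.exists() else "MISSING")
```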
Yes, both paths exist. But I guess it's not specifically ade20k-151 but a more general problem, because when I skip the ADE20K dataset in the verification script, it still throws the error (with BDD instead).
Got it, thanks for your patience while we try to figure this out.
That's good those exist. This part of the script just looks up paths, so we need to find out which line it breaks on for you, so we can see if a path is really missing. Do you mind setting a conditional breakpoint and stepping through it, or adding a print statement in the loop? Also, did you set the variable in mseg/utils/dataset_config.py?
Hoping we can avoid the situation where you would have to rerun everything and check output logs for an error along the way.
It's breaking directly on the first pair, which is
<MSEG_DIR>/mseg_dataset/ADE20K/ADEChallengeData2016/images/training/ADE_train_00000001.jpg
<MSEG_DIR>/mseg_dataset/ADE20K/ADEChallengeData2016/annotations/training/ADE_train_00000001.png
which we already made sure exists.
It seems that we somehow end up with an all-zero segment_mask in overlay_instances, which leads to the error. Any idea where I should look to track that down further?
Got it, ok that's helpful. What does
import imageio
import numpy as np
label_fpath = '/fastdata/jmsteitz/mseg_dataset/ADE20K/ADEChallengeData2016/annotations/training/ADE_train_00000001.png'
print(np.unique(imageio.imread(label_fpath)))
return to you? I get
[ 0 1 4 5 6 13 18 32 33 43 44 88 97 105 126 139 150]
Is your
>>> import cv2
>>> cv2.__version__
either '4.1.0' or '4.2.0'?
Can you also run pytest tests/ to make sure unit tests are passing locally for you?
The unit tests all pass, my cv2 version is 4.2.0 and I got the exact same output when printing the labels as you do.
Got it, that's good. Do you mind setting a breakpoint before it crashes using pdb.set_trace() and stepping through it to see why it's crashing on your machine?
Well, segment_mask in mask_utils.get_most_populous_class() is all zero, so segment_mask.nonzero() produces an empty set, which leads to class_indices being empty. The numpy error is expected then.
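This failure mode is easy to reproduce in isolation. The snippet below is only a paraphrase of the mode computation (the exact mseg code may differ): taking the mode of label values at the mask's nonzero positions via np.bincount(...).argmax() raises a ValueError when the mask is all zero.

```python
import numpy as np

# All-zero mask, as produced by the degenerate polygon
segment_mask = np.zeros((8, 8), dtype=np.uint8)
label_map = np.arange(64).reshape(8, 8) % 5  # stand-in label map

# Indexing with two empty index arrays yields an empty array: no pixels to vote
class_values = label_map[segment_mask.nonzero()]
try:
    mode = np.bincount(class_values).argmax()  # argmax of an empty sequence
except ValueError as err:
    print("numpy error:", err)
```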
So I had a look into mask_utils.get_mask_from_polygon() and found an odd behaviour of ImageDraw. When the vertices all have the same x-value, we get a line as the mask. But when the vertices all have the same y-value, we end up with an empty mask instead. Please have a look at the following example code:
>>> import numpy as np
>>> from PIL import Image, ImageDraw
>>> img = Image.new("L", size=(8,8), color=0)
>>> ImageDraw.Draw(img).polygon([(2,2),(2,8)], outline=1, fill=1)
>>> np.array(img)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0]], dtype=uint8)
>>> img = Image.new("L", size=(8,8), color=0)
>>> ImageDraw.Draw(img).polygon([(2,2),(8,2)], outline=1, fill=1)
>>> np.array(img)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
The first list of vertices used to build segment_mask is
polygon: [(320, 499), (321, 499), (322, 499), (323, 499), (324, 499), (325, 499), (326, 499), (327, 499), (328, 499), (329, 499), (330, 499), (331, 499), (332, 499), (333, 499), (334, 499), (335, 499), (336, 499), (337, 499), (338, 499), (339, 499), (340, 499), (341, 499), (342, 499), (343, 499), (344, 499), (345, 499), (346, 499), (347, 499), (348, 499), (349, 499), (350, 499), (351, 499), (352, 499), (353, 499), (354, 499), (355, 499), (356, 499), (357, 499), (358, 499), (359, 499), (360, 499), (361, 499), (362, 499), (363, 499), (364, 499), (365, 499), (364, 499), (363, 499), (362, 499), (361, 499), (360, 499), (359, 499), (358, 499), (357, 499), (356, 499), (355, 499), (354, 499), (353, 499), (352, 499), (351, 499), (350, 499), (349, 499), (348, 499), (347, 499), (346, 499), (345, 499), (344, 499), (343, 499), (342, 499), (341, 499), (340, 499), (339, 499), (338, 499), (337, 499), (336, 499), (335, 499), (334, 499), (333, 499), (332, 499), (331, 499), (330, 499), (329, 499), (328, 499), (327, 499), (326, 499), (325, 499), (324, 499), (323, 499), (322, 499), (321, 499)]
(array([], dtype=int64), array([], dtype=int64))
so it generates an empty mask, which leads to the numpy error in the end. My Pillow version is 7.1.1, by the way.
Does your first segment_mask or list of vertices look different?
If I append two print statements after that line in mask_utils_detectron2.py:
class_mode_idx = get_most_populous_class(segment_mask, label_map); print('segment_mask sum: ', segment_mask.sum() ); print('class_mode_idx ', class_mode_idx)
I get the following output. Could you check the diff? Thanks again for your patience here.
As mentioned in my earlier post, segment_mask is all zero. As a result, get_most_populous_class crashes.
When I insert print('segment_mask sum: ', segment_mask.sum()) before the line you mentioned, I get segment_mask sum: 0, of course.
Could you please uncomment print('Polygon verts: ', polygon_verts) in mask_utils_detectron2.overlay_instances and share your first list of vertices?
Perhaps it is a PIL version issue? My PIL version is far behind yours:
>>> import PIL
>>> PIL.__version__
'6.1.0'
>>> import cv2
>>> cv2.__version__
'4.1.0'
>>> import matplotlib
>>> matplotlib.__version__
'3.1.0'
>>> import imageio
>>> imageio.__version__
'2.5.0'
Sure, for 0 of ade20k-151, I get the following for polygon_verts:
array([[320, 499],[321, 499],[322, 499],[323, 499],[324, 499],[325, 499],[326, 499],[327, 499],[328, 499],[329, 499],[330, 499],[331, 499],[332, 499],[333, 499],[334, 499],[335, 499],[336, 499],[337, 499],[338, 499],[339, 499],[340, 499],[341, 499],[342, 499],[343, 499],[344, 499],[345, 499],[346, 499],[347, 499],[348, 499],[349, 499],[350, 499],[351, 499],[352, 499],[353, 499],[354, 499],[355, 499],[356, 499],[357, 499],[358, 499],[359, 499],[360, 499],[361, 499],[362, 499],[363, 499],[364, 499],[365, 499],[364, 499],[363, 499],[362, 499],[361, 499],[360, 499],[359, 499],[358, 499],[357, 499],[356, 499],[355, 499],[354, 499],[353, 499],[352, 499],[351, 499],[350, 499],[349, 499],[348, 499],[347, 499],[346, 499],[345, 499],[344, 499],[343, 499],[342, 499],[341, 499],[340, 499],[339, 499],[338, 499],[337, 499],[336, 499],[335, 499],[334, 499],[333, 499],[332, 499],[331, 499],[330, 499],[329, 499],[328, 499],[327, 499],[326, 499],[325, 499],[324, 499],[323, 499],[322, 499],[321, 499]], dtype=int32)
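For reference, a quick check confirms that every vertex in that list lies on the same horizontal line (the array is rebuilt here by hand from the values above):

```python
import numpy as np

# Rebuild the vertex list printed above: x runs 320..365 and back down to 321,
# while y is always 499
xs = list(range(320, 366)) + list(range(364, 320, -1))
polygon_verts = np.array([[x, 499] for x in xs], dtype=np.int32)

print(np.unique(polygon_verts[:, 1]))  # -> [499]: all vertices share one y-value
```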
If the Visualizer() crashes, there should be an image saved beforehand from an OpenCV-based function, form_mask_triple_embedded_classnames. Here is the output of that function:
I get the following with the Visualizer:
Thanks for adding the minimal code example -- a line segment shows up in my array, unlike on your machine:
>>> import numpy as np
>>> from PIL import Image, ImageDraw
>>> img = Image.new("L", size=(8,8), color=0)
>>> ImageDraw.Draw(img).polygon([(2,2),(2,8)], outline=1, fill=1)
>>> np.array(img)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0]], dtype=uint8)
>>> img = Image.new("L", size=(8,8), color=0)
>>> ImageDraw.Draw(img).polygon([(2,2),(8,2)], outline=1, fill=1)
>>> np.array(img)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
Yes, it seems to be a Pillow bug, see here. I've created a PR (#10) with a workaround that uses Draw.line instead of Draw.polygon when encountering polygons whose vertices all lie on a horizontal line.
My colors in the overlay produced by the Visualizer are different than yours, though. Is that a problem?
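For readers hitting the same issue, the idea behind the workaround can be sketched as follows. This is a hypothetical draw_polygon_mask helper, not the exact code from PR #10:

```python
import numpy as np
from PIL import Image, ImageDraw

def draw_polygon_mask(verts, size):
    """Rasterize polygon vertices into a binary mask. Degenerate polygons whose
    vertices all share one y-value are drawn with Draw.line instead, since
    Draw.polygon yields an empty mask for them in affected Pillow versions."""
    img = Image.new("L", size=size, color=0)
    draw = ImageDraw.Draw(img)
    if len({y for _, y in verts}) == 1:
        draw.line(verts, fill=1, width=1)   # horizontal degenerate case
    else:
        draw.polygon(verts, outline=1, fill=1)
    return np.array(img)

# The horizontal case from above now produces a non-empty mask:
mask = draw_polygon_mask([(2, 2), (8, 2)], size=(8, 8))
print(mask.sum())
```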
Now the verification script runs until pascal-context-460, where it crashes because of an assertion error in get_np_mode(). For me the data type is np.uint16, which is not in the list. Is it safe to just add it?
Hi @jmsteitz, thanks for all this feedback. I've pushed several changes that incorporate your polygon vs. line fix for horizontal lines.
My rendering colors now match yours. I'm looking into the pascal-context-460 crash; I was able to reproduce it on a new machine, which suggests another versioning issue.
@jmsteitz I merged in the np.uint16 support in get_np_mode(): https://github.com/mseg-dataset/mseg-api/pull/13.
I don't want to relax all of the other type constraints to generic integer, though.
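Just to illustrate the shape of such a check, here is a hypothetical sketch of a mode function with an explicit dtype whitelist (the actual whitelist and implementation in get_np_mode() may differ):

```python
import numpy as np

def get_np_mode(arr: np.ndarray) -> int:
    """Return the most frequent value in an integer array, accepting only an
    explicit whitelist of dtypes (np.uint16 now included) rather than
    relaxing the check to any generic integer type."""
    assert arr.dtype in (np.uint8, np.uint16), arr.dtype
    return int(np.bincount(arr.flatten()).argmax())

print(get_np_mode(np.array([[3, 3], [3, 7]], dtype=np.uint16)))  # -> 3
```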
Hi @jmsteitz , let me know if you run into any further issues. I'm closing this for now since I merged fixes in two PRs, but feel free to re-open it if your issue is not resolved.
When I try to run the verification script, I get the following error:
Additionally adding 'ade20k-151', 'ade20k-150', and 'ade20k-150-relabeled' in line 58 to skip them leads to the same error being thrown for the BDD dataset. Thus, the error doesn't seem to be dataset-specific.
I'm running Python 3.7.6 and numpy 1.18.2.