Doodleverse / segmentation_gym

A neural gym for training deep learning models to carry out geoscientific image segmentation. Works best with labels generated using
MIT License

Issue with make_nd_datasets #92

Closed sbosse12 closed 2 years ago

sbosse12 commented 2 years ago

Gym is having an issue reshaping an array. It seems to be a problem with the target size: I have tried many different target sizes with no luck. The error is below, along with the config file.

(gym) C:\Users\sbosse\segmentation_gym>python C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json Using GPU C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images Found 500 image and 500 label files joblib.externals.loky.process_executor._RemoteTraceback: """ Traceback (most recent call last): File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\", line 428, in _process_worker r = call_item() File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\", line 275, in call return self.fn(*self.args, self.kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\", line 620, in call return self.func(*args, *kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 288, in call return [func(args, kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 288, in return [func(*args, **kwargs) File "C:\Users\sbosse\segmentation_gym\", line 75, in do_resize_label result = scale(lab,TARGET_SIZE[0],TARGET_SIZE[1]) File "C:\Users\sbosse\segmentation_gym\", line 53, in scale return np.array(tmp).reshape((nR,nC)) ValueError: cannot reshape array of size 13284432 into shape (1719,2576) """

The above exception was the direct cause of the following exception: "TARGET_SIZE": [1719,2576], "MODEL": "resunet", "NCLASSES": 2, "BATCH_SIZE": 6, "N_DATA_BANDS": 3, "DO_TRAIN": true, "PATIENCE": 25, "MAX_EPOCHS": 200, "VALIDATION_SPLIT": 0.75, "FILTERS":6, "KERNEL":7, "STRIDE":2, "LOSS": "dice", "DROPOUT":0.1, "DROPOUT_CHANGE_PER_LAYER":0.0, "DROPOUT_TYPE":"standard", "USE_DROPOUT_ON_UPSAMPLING":false, "ROOT_STRING": "DE_Coast_water_mask", "FILTER_VALUE": 3, "DOPLOT": true, "USEMASK": false, "RAMPUP_EPOCHS": 20, "SUSTAIN_EPOCHS": 0.0, "EXP_DECAY": 0.9, "START_LR": 1e-7, "MIN_LR": 1e-7, "MAX_LR": 1e-5, "AUG_ROT": 0, "AUG_ZOOM": 0.0, "AUG_WIDTHSHIFT": 0.05, "AUG_HEIGHTSHIFT": 0.05, "AUG_HFLIP": true, "AUG_VFLIP": true, "AUG_LOOPS": 3, "AUG_COPIES": 2, "TESTTIMEAUG": false, "SET_GPU": "0", "do_crf": true, "SET_PCI_BUS_ID": true

Traceback (most recent call last): File "C:\Users\sbosse\segmentation_gym\", line 251, in w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_label)(os.path.normpath(lfile), TARGET_SIZE) for lfile in label_files) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 1098, in call self.retrieve() File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 975, in retrieve self._output.extend(job.get(timeout=self.timeout)) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\", line 567, in wrap_future_result return future.result(timeout=timeout) File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\", line 446, in result return self.get_result() File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\", line 391, in get_result raise self._exception ValueError: cannot reshape array of size 13284432 into shape (1719,2576)

dbuscombe-usgs commented 2 years ago

Hi Stephen, I don't think a "TARGET_SIZE" of [1719,2576] is supported. Could you try [768, 1024]?

Also, did you update to the latest doodleverse-utils? If not, from within an activated gym conda env, pip install doodleverse-utils -U (U stands for upgrade, and it should upgrade to the latest version 0.0.10)

Also note that with the new changes (should be the last major changes for a while!), NCLASSES=2 means 'class and no class' (replaces the former NCLASSES=1)

sbosse12 commented 2 years ago

Hi Dan,

I've tried [768, 1024] and [768, 768]. No luck. I'll run that update, though, and see if that helps.

dbuscombe-usgs commented 2 years ago

Looks like NCLASSES should be 3? The error is

"ValueError: cannot reshape array of size 13284432 into shape (1719,2576)"

from the code that resizes the labels.

1719 x 2576 x 3 = 13284432
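The arithmetic above can be reproduced with NumPy. This is a minimal sketch, not Gym's code: `label_rgb` is a synthetic stand-in that assumes the array being reshaped holds three values per pixel. The arithmetic alone does not say what those three values are.

```python
import numpy as np

n_rows, n_cols = 1719, 2576  # the TARGET_SIZE from the config

# Synthetic stand-in (assumption): an array with 3 values per pixel.
label_rgb = np.zeros((n_rows, n_cols, 3), dtype=np.uint8)
print(label_rgb.size)  # 13284432 = 1719 * 2576 * 3

# Reshaping to (rows, cols) fails because the element counts differ.
try:
    label_rgb.reshape((n_rows, n_cols))
    reshape_error = None
except ValueError as err:
    reshape_error = str(err)
    print(reshape_error)

# An array with exactly rows * cols elements reshapes without error.
label_flat = np.zeros(n_rows * n_cols, dtype=np.uint8)
reshaped = label_flat.reshape((n_rows, n_cols))
```

The reshape only succeeds when the array has exactly one value per pixel, so the size in the error message is a quick way to see how many values per pixel the input carried.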

dbuscombe-usgs commented 2 years ago

I may be wrong that "TARGET_SIZE" of [1719,2576] is not supported (I have not tried), but I was under the impression odd dimensions such as 1719 would not work

sbosse12 commented 2 years ago

Is that correct if I'm only doing water masking?

sbosse12 commented 2 years ago

I had a [768, 1024] target size originally, but it gave me that error, so I thought I would try the dimensions of the imagery I'm training on. I still got the error, so I tried [768, 768], and that was no good either.

dbuscombe-usgs commented 2 years ago

This is puzzling. You are correct that NCLASSES=2 for water/nowater. Have you verified you are using the latest doodleverse-utils version?

If so, can you zip up 10 pairs of images and labels and post them here? Perhaps the target size is a red herring and the true problem is something else ....

sbosse12 commented 2 years ago

Here ya go! I updated seg gym last night, and when I ran that doodleverse utils update it said:

Requirement already satisfied: doodleverse-utils in c:\programdata\anaconda3\envs\gym\lib\site-packages (0.0.10)
Requirement already satisfied: versioneer in c:\programdata\anaconda3\envs\gym\lib\site-packages (from doodleverse-utils) (0.26)

By that, I took it the updates were already made.

"TARGET_SIZE": [768, 1024], "MODEL": "resunet", "NCLASSES": 2, "BATCH_SIZE": 6, "N_DATA_BANDS": 3, "DO_TRAIN": true, "PATIENCE": 25, "MAX_EPOCHS": 200, "VALIDATION_SPLIT": 0.75, "FILTERS":6, "KERNEL":7, "STRIDE":2, "LOSS": "dice", "DROPOUT":0.1, "DROPOUT_CHANGE_PER_LAYER":0.0, "DROPOUT_TYPE":"standard", "USE_DROPOUT_ON_UPSAMPLING":false, "ROOT_STRING": "DE_Coast_water_mask", "FILTER_VALUE": 3, "DOPLOT": true, "USEMASK": false, "RAMPUP_EPOCHS": 20, "SUSTAIN_EPOCHS": 0.0, "EXP_DECAY": 0.9, "START_LR": 1e-7, "MIN_LR": 1e-7, "MAX_LR": 1e-5, "AUG_ROT": 0, "AUG_ZOOM": 0.0, "AUG_WIDTHSHIFT": 0.05, "AUG_HEIGHTSHIFT": 0.05, "AUG_HFLIP": true, "AUG_VFLIP": true, "AUG_LOOPS": 3, "AUG_COPIES": 2, "TESTTIMEAUG": false, "SET_GPU": "0", "do_crf": true, "SET_PCI_BUS_ID": true

ebgoldstein commented 2 years ago

Hi @sbosse12 - I see the images in the zip, but not the actual labels. Instead of labels (which should be greyscale images, super dark), I see the colorized labels. Do you have the greyscale masks?

ebgoldstein commented 2 years ago

If these are the labels you are using, I would guess that make_nd_datasets is failing because it expects a label to be a 1-band (greyscale) image, not this 3-band image.
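One way to check the 1-band versus 3-band guess is to look at the shape of a label once it is loaded as an array. The sketch below is illustrative only: `count_bands` is a hypothetical helper, not part of Gym, and the two arrays are synthetic stand-ins for a greyscale and a colorized label.

```python
import numpy as np

def count_bands(label):
    """Return the number of bands in an image array (1 for a 2-D array)."""
    return 1 if label.ndim == 2 else label.shape[-1]

# Synthetic stand-ins (assumption): real labels would be read from disk.
greyscale_label = np.zeros((768, 1024), dtype=np.uint8)
colorized_label = np.zeros((768, 1024, 3), dtype=np.uint8)

print(count_bands(greyscale_label))  # 1
print(count_bands(colorized_label))  # 3
```

Running a check like this over a folder of labels before building the dataset would flag colorized labels early, before any resizing step is attempted.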

sbosse12 commented 2 years ago

After converting to greyscale (B&W), I am getting the message below (attached images with B&W labels).

It's trying to resize the images and labels and put them in the resized label/image folders. It populates the image folder with images at the new size [768, 1024], but not the label folder, which then causes it to crash.

(gym) C:\Users\sbosse\segmentation_gym>python C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json Using GPU C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images Found 500 image and 500 label files C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step 0 label files 1 sets of 500 image files Creating non-augmented subset Version: 2.6.0 Eager mode: True 2022-10-18 09:02:28.129971: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 
2022-10-18 09:02:28.694129: I tensorflow/core/common_runtime/gpu/] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 8964 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5 Traceback (most recent call last): File "C:\Users\sbosse\segmentation_gym\", line 406, in dataset =, shuffle=False) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\data\ops\", line 1229, in list_files assert_not_empty = control_flow_ops.Assert( File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\", line 206, in wrapper return target(*args, *kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\", line 247, in wrapped return _add_should_use_warning(fn(args, **kwargs), File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\ops\", line 160, in Assert raise errors.InvalidArgumentError( tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern: '

ebgoldstein commented 2 years ago

Hi @sbosse12 - the issue here is that the label names do not correspond to the image names (hence the "No files matched pattern" error message). I see this in the folder you sent:

the image is: 2022-0610-152545-DSC02130-N7251F.jpg
and the label is: 2022-0610-152545-DSC02130-N7251F_label2022-09-06-10-48_Enter-user-ID.jpg

The labels seem to have duplicated part of the string in the naming. One idea is to strip the second part of the name from all the labels (here it would be 2022-09-06-10-48_Enter-user-ID), but I am not sure how you converted the RGBs to grayscale and whether that is going to lead to problems. Specifically, it's not just that the image is grayscale, it's that each label is 'label encoded': class 1 is all 0 values, class 2 is all 1 values, etc. If you converted the RGB to grayscale with, say, ImageMagick, then the labels might not have the specific pixel-value mapping that is needed.
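The naming mismatch can be illustrated by comparing base names. This is only a sketch: `image_stem_from_label` is a hypothetical helper, and how Gym actually pairs images with labels is not shown in this thread. The two file names are the ones quoted above.

```python
from pathlib import Path

def image_stem_from_label(label_name):
    """Drop the extension and everything from '_label' onward."""
    return Path(label_name).stem.split("_label")[0]

image_name = "2022-0610-152545-DSC02130-N7251F.jpg"
label_name = "2022-0610-152545-DSC02130-N7251F_label2022-09-06-10-48_Enter-user-ID.jpg"

label_stem = image_stem_from_label(label_name)
names_match = label_stem == Path(image_name).stem

print(label_stem)   # 2022-0610-152545-DSC02130-N7251F
print(names_match)  # True
```

Matching names only addresses the file-pairing half of the problem; the pixel values inside each label still need the label encoding described above.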

SO - I think we should just back up for a moment. I am guessing you are working from Doodled output, right? If so, can I ask that you follow this workflow and see how it goes:

1) activate the doodler conda environment
2) navigate back into the doodler directory
3) navigate into /utils/
4) run `', and select the folder of doodler results you want to process (you may need to collect all your doodler output into a single folder)
5) once the script is done, navigate to the results folder you selected
6) the output of this script is several new folders in the doodler results folder. Two will be 'images' and 'labels', and they will be named correctly
7) copy/paste those two folders into your Gym directory, and try using those with Make_datasets

this is an 'official' way to get images and labels from doodler -> gym, and should work..

sbosse12 commented 2 years ago

Hi Evan,

I did a small test run where I removed the extra portion of the file name, including the date and time of the doodle, so that each label image has the name filename_label.jpg. This also did not work. I used IrfanView to convert the imagery. In the advanced batch conversion options, I selected "change color depth" and then 2 colors (B/W).

I'll give the Doodler utilities a try though and let you know!

ebgoldstein commented 2 years ago

yep, that is what i would expect - that it would not work.. The labels are not just normal greyscale images, but have that special 'label' encoding..

Yes, please use the doodler pipeline and report back!

dbuscombe-usgs commented 2 years ago

@sbosse12 any update?

If you have B/W images, that means you probably have pixels that are 0 and 255. This can be dealt with using "REMAP_CLASSES": {"0":0, "255":1} which will reclassify the 255 as 1 ...

But the Doodler workflow (utils/`') is probably best overall - simpler, well tested, and no extra steps involving 3rd party software
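The effect of the `REMAP_CLASSES` mapping above can be sketched in NumPy. This is an illustration of the idea, not Gym's implementation: `remap_label` is a hypothetical helper, and the 2x2 array stands in for a B/W label with pixel values 0 and 255.

```python
import numpy as np

# Same mapping as in the config snippet above (keys are strings, as in JSON).
remap_classes = {"0": 0, "255": 1}

def remap_label(label, mapping):
    """Return a copy of `label` with each old pixel value replaced by its new class."""
    out = label.copy()
    for old, new in mapping.items():
        # Index with the original array so earlier replacements are not remapped again.
        out[label == int(old)] = new
    return out

bw_label = np.array([[0, 255], [255, 0]], dtype=np.uint8)
remapped = remap_label(bw_label, remap_classes)

print(np.unique(bw_label))  # [  0 255]
print(np.unique(remapped))  # [0 1]
```

After remapping, the label holds consecutive class integers starting at 0, which is the label encoding described earlier in the thread.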

sbosse12 commented 2 years ago

Hi Dan and Evan,

I generated a new set of image/label files (a few attached in zip)

I am getting an error now where the resized folders are not getting populated, so the files are not being recognized when making the training dataset.

(gym) C:\Users\sbosse\segmentation_gym>python C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json Using GPU C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images Found 500 image and 500 label files C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step 0 label files 1 sets of 0 image files Creating non-augmented subset Version: 2.6.0 Eager mode: True 2022-10-20 11:02:15.364899: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 
2022-10-20 11:02:15.923135: I tensorflow/core/common_runtime/gpu/] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 8964 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5 Traceback (most recent call last): File "C:\Users\sbosse\segmentation_gym\", line 406, in dataset =, shuffle=False) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\data\ops\", line 1229, in list_files assert_not_empty = control_flow_ops.Assert( File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\", line 206, in wrapper return target(*args, *kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\", line 247, in wrapped return _add_should_use_warning(fn(args, **kwargs), File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\ops\", line 160, in Assert raise errors.InvalidArgumentError( tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern: '

ebgoldstein commented 2 years ago

Hi @sbosse12 ,

First, please upgrade gym (or just download the new version of make_datasets) - I made a very small change (

Second, with the labels, images, and configs you provide, using a fresh install of gym, make dataset works for me:


My thinking is that, since the folders exist, the entire resizing operation is being skipped:

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step

However, from the next line in the output, it seems like the resize folders on your machine are empty:

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step 0 label files 1 sets of 0 image files

The easiest solution right now is for you to just delete the two folders resized_images and resized_labels. Then make_datasets will remake these folders and remake the images..

can you try this fix and report back?
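Deleting the two folders can also be scripted. The sketch below is a hedged example, not part of Gym: `remove_resized_folders` is a hypothetical helper, and it is demonstrated on a temporary directory rather than a real dataset folder.

```python
import shutil
import tempfile
from pathlib import Path

def remove_resized_folders(dataset_dir, names=("resized_images", "resized_labels")):
    """Delete the named folders (and their contents) if they exist."""
    removed = []
    for name in names:
        folder = Path(dataset_dir) / name
        if folder.is_dir():
            shutil.rmtree(folder)
            removed.append(name)
    return removed

# Demonstration on a throwaway directory with empty, nested resized folders.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "resized_images" / "resized_images").mkdir(parents=True)
    (Path(tmp) / "resized_labels" / "resized_labels").mkdir(parents=True)
    removed = remove_resized_folders(tmp)
    leftovers = [p.name for p in Path(tmp).iterdir()]

print(removed)    # ['resized_images', 'resized_labels']
print(leftovers)  # []
```

For a real dataset, `dataset_dir` would be the folder that holds the image and label folders. `shutil.rmtree` deletes permanently, so it is worth checking the path before running it.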

sbosse12 commented 2 years ago

I'll try the update. Whenever I delete them and run it, it says the folders don't exist and therefore skips the step again.

sbosse12 commented 2 years ago

After the update, and after deleting the resize folders, I got this:

(gym) C:\Users\sbosse\segmentation_gym>python C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json Using GPU C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images Found 500 image and 500 label files joblib.externals.loky.process_executor._RemoteTraceback: """ Traceback (most recent call last): File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\", line 428, in _process_worker r = call_item() File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\", line 275, in call return self.fn(*self.args, self.kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\", line 620, in call return self.func(*args, *kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 288, in call return [func(args, kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 288, in return [func(args, kwargs) File "C:\Users\sbosse\segmentation_gym\", line 104, in do_resize_image imsave(fdirout+os.sep+f.split(os.sep)[-1].replace('.jpg','.png'), result.astype('uint8'), check_contrast=False, compression=0) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\skimage\", line 143, in imsave return call_plugin('imsave', fname, arr, plugin=plugin, plugin_args) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\skimage\io\", line 207, in call_plugin return func(args, kwargs) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\", line 238, in imwrite with imopen(uri, "wi", imopen_args) as file: File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\", line 118, in imopen request = Request(uri, io_mode, format_hint=format_hint, extension=extension) File 
"C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\", line 248, in init self._parse_uri(uri) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\", line 412, in _parse_uri raise FileNotFoundError("The directory %r does not exist" % dn) FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist """

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "C:\Users\sbosse\segmentation_gym\", line 248, in w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_image)(os.path.normpath(f), TARGET_SIZE) for f in files) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 1098, in call self.retrieve() File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\", line 975, in retrieve self._output.extend(job.get(timeout=self.timeout)) File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\", line 567, in wrap_future_result return future.result(timeout=timeout) File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\", line 446, in result return self.get_result() File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\", line 391, in get_result raise self._exception FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "C:\Users\sbosse\segmentation_gym\", line 250, in w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_image)(os.path.normpath(f), TARGET_SIZE) for f in files.squeeze()) AttributeError: 'list' object has no attribute 'squeeze'

(gym) C:\Users\sbosse\segmentation_gym>

sbosse12 commented 2 years ago

Ok I got it to work. It appears you need to have a resize folder created, however, you must only create the first level (my_datasets/resized_images compared to my_datasets/resized_images/resized_images, which is what gym suggests needs to exist as shown above). Gym will create a second folder with the same name within it containing all the resized files.
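The `FileNotFoundError` in the traceback is what Python raises when a file is written into a directory that does not exist. A general way to avoid it is to create the output directory, including any missing parents, before writing. The sketch below shows that pattern on a temporary directory; it is an assumption about a possible safeguard, not a description of what Gym does.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical nested output path, mirroring the one in the error message.
    out_dir = Path(tmp) / "resized_images" / "resized_images"

    # parents=True creates both levels; exist_ok=True makes the call safe to repeat.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)

    out_file = out_dir / "example.png"
    out_file.write_bytes(b"")  # placeholder content, not a real image
    created = out_file.exists()

print(created)  # True
```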

ebgoldstein commented 2 years ago

hmm.. I'm a bit curious as to why there are duplicate 'resized_images' directories?

FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist

and for that matter, duplicate image and label directories..

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images

can you confirm that your directory structure looks exactly like this:

ebgoldstein commented 2 years ago

oh, i see your comment above...

just to clarify - you should not have any folder named resized_images (or nested folders).. and gym will work.. but if you got it working, then that is good ...

I will close this issue if it's working for you....

sbosse12 commented 2 years ago

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images

This is how my folders are organized. Nesting the image/label folders was the suggested structure for previous versions of Gym, and it is also how the sample dataset provided on GitHub is organized.

ebgoldstein commented 2 years ago

FWIW: no nesting is needed now, on any directories... the wiki link I sent has the current suggested directory structure..

sbosse12 commented 2 years ago

But yes, I have a full dataset of augmented and non-augmented npz files. Thanks, y'all! Might want to update that downloadable sample dataset zip folder.

sbosse12 commented 2 years ago


This one

dbuscombe-usgs commented 2 years ago

I'm catching up with this thread. Thanks for working through this!

I know I still need to redo the sample dataset. I'll try to get to it in the next few days.

dbuscombe-usgs commented 2 years ago

What exactly was the problem here? That @sbosse12 a) wasn't using the latest version of Gym, and b) had already created the 'resized' folders (even though there is nothing in the instructions that says to do this?)

Sorry, just catching up and want to make sure I understand .... thinking about what changes need to be made to the documentation

sbosse12 commented 2 years ago

I believe the largest issue was using nested folders. Using the nested folders was giving Gym a problem when attempting to create the resized files.