Closed: rose-jinyang closed this issue 3 years ago
According to the code, both images and masks have three channels (see the init of the dataloader class).
Now if you see the description in the section 'Data-loader and Augmentations' in the ipynb, it says: If your masks are not in raw format, then you need to convert them into sparse labels (color indexed) for training with SparseCategoricalCrossentropy loss (i.e. 0 for bg and 1 for fg).
Hi, thanks for your reply. You used a pre-processed version of AISegment in slim512.ipynb. Could you share the dataset or part of it? I want to know if your mask PNG file format is the same as the original AISegment mask (matting) image. Thanks
Of course, it's preprocessed before training and therefore it's different. Now it becomes a binary mask as I mentioned previously (0 or 1 pixel values only), in PNG format.
If you are concerned about decode_jpeg in the code, check out Stack Overflow.
What is the structure of each image in msk_uint8.npy?
This is your code that makes a numpy file for mask images.
Well, you can easily try the code on a folder with images and see for yourself. The final numpy array dataset is in NHWC format, i.e. N is the number of images, H and W are the height and width, and C, the number of channels, will be 1 in this code for mask images (as per Keras requirements). In this case the loss function is different, i.e. BCE loss, and we use sigmoid as the last activation (see portrait_segmentation.ipynb). Also, here the mask values are 0 and 255.
So, in the aforementioned code each image would be a binary mask with 1 channel dimension, say 128x128x1, if the input image is of size 128.
But we are not using this code in slim-net; we directly use the image paths of the preprocessed images in the data loader class, SparseCategoricalCrossentropy as the loss function, and mask values 0 or 1.
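The NHWC layout described above can be sketched with plain numpy (the variable names here are illustrative, not from the repo):

```python
import numpy as np

# Suppose we have three 128x128 single-channel masks loaded as 2-D arrays
masks = [np.zeros((128, 128), dtype=np.uint8) for _ in range(3)]

# Stack into NHWC: N images first, then add the trailing channel axis
# that Keras expects for single-channel masks
msk_uint8 = np.stack(masks, axis=0)[..., np.newaxis]

print(msk_uint8.shape)  # (3, 128, 128, 1) -> N, H, W, C
```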
Thanks for your kind explanation
Hi Could u provide a script to convert the original AISegment mask image? Thanks
Sorry, I think there should be a clarification. Actually, if you see the code, the mask images in the dataset should have 1 channel. The value 3 in the loader class is just a default value; we override it when we call the data loader as shown below.
# Initialize the dataloader object
train_dataset = DataLoader(image_paths=train_image_paths,
                           mask_paths=train_mask_paths,
                           image_size=512,
                           crop_percent=0.8,
                           channels=[3, 1],  # here 1 refers to the mask channel
                           seed=47)
So when you prepare alpha mask from aisegment dataset, single channel is sufficient.
Now, here is a rough idea for preprocessing the aisegment dataset masks:
import numpy as np
import cv2
import imageio

# Save the RGB input image
image = cv2.imread('jpg image file path')        # original input image (BGR order)
image = image[..., 0:3]                          # only include the colour channels
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # cv2 loads BGR; imageio saves RGB
imageio.imsave('image_1.jpg', image)

# Save the alpha mask from the matting masks
in_image = cv2.imread('png image file path', cv2.IMREAD_UNCHANGED)  # matting mask of aisegment dataset
alpha = in_image[:, :, 3]                        # get alpha channel from the 4-channel matting mask

# Convert to binary mask (zero the low values first, so the pixels
# just set to 1 are not clobbered by the second assignment)
alpha[alpha < 127] = 0
alpha[alpha >= 127] = 1

# Now save the binary mask with a single channel
imageio.imsave('alpha_1.png', alpha)
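The thresholding step can be demonstrated in isolation with plain numpy (synthetic alpha values; a single `np.where` avoids any ordering issues with in-place assignments):

```python
import numpy as np

# Synthetic alpha channel standing in for in_image[:, :, 3]
alpha = np.array([[0, 100, 127],
                  [128, 200, 255]], dtype=np.uint8)

# Binarize in one step: >= 127 becomes 1, everything else 0
binary = np.where(alpha >= 127, 1, 0).astype(np.uint8)

print(binary)        # [[0 0 1] [1 1 1]]
print(binary.shape)  # (2, 3) -- single channel, as the loader expects
```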
Thank you very much.
Hi, with your help I started to train a new model with Slim-Net on the AISegment dataset. But the training acc is 0.9688 and validation acc is 1.0 after the first epoch. After the second epoch, both the training acc and validation acc are 1.0.
How should I understand this?
First, test the model and see if it's giving correct results. Then compare the hdf5 model with the original model to see if it has the same structure. Finally, also check if the masks are correctly preprocessed.
I'm not sure what exactly is the cause...
Thanks
Hi, I found the following mistake in your slim512.ipynb.
Of course, I think this is not the main reason.
Here is a sample image and mask from the dataset. Check if there is any dissimilarity 1803151818-00000003.zip
Thanks
It should not be an issue since the two dimensions are redundant, and in the preprocessing code we remove these channels. Anyway, try with 3 channels and see if the problem persists.
It should not be an issue, I guess; load the mask as mask = cv2.imread('1803151818-00000003.png'), save the png mask as 513x513x3, and see if it works.
Will try
If you have this exact image in your previously preprocessed dataset, just see if it is exactly the same as the sample mask image, i.e. pixelwise exactly the same or not.
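That pixelwise comparison can be sketched with numpy (synthetic arrays here; in practice you would load the two mask files, e.g. with cv2.imread):

```python
import numpy as np

# Stand-ins for the two masks being compared
mask_a = np.array([[0, 1], [1, 0]], dtype=np.uint8)
mask_b = np.array([[0, 1], [1, 0]], dtype=np.uint8)
mask_c = np.array([[0, 1], [1, 1]], dtype=np.uint8)

print(np.array_equal(mask_a, mask_b))  # True  -> pixelwise identical
print(np.array_equal(mask_a, mask_c))  # False -> they differ somewhere

# Where exactly do they differ?
print(np.argwhere(mask_a != mask_c))   # [[1 1]]
```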
Sure
Hi I found that there is an issue in my pre-processing code. Your guess is correct. Thank you
Hi, may I ask one more question? Why did you use SparseCategoricalCrossentropy rather than binary_crossentropy in slim-net training?
You can use both in this case, where there are only two classes.
Also, the sparse version just helps us avoid creating a one-hot version of the labels when there are multiple classes.
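The equivalence of the two label formats can be illustrated with plain numpy (a hand-rolled cross-entropy, just for illustration, not the Keras implementation):

```python
import numpy as np

# Predicted class probabilities for 4 pixels, 2 classes (bg, fg)
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.7, 0.3],
                  [0.4, 0.6]])

sparse_labels = np.array([0, 1, 0, 1])  # integer class ids
one_hot = np.eye(2)[sparse_labels]      # same labels, one-hot encoded

# Categorical cross-entropy with one-hot labels
cce = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Sparse version: just index the probability of the true class
scce = -np.mean(np.log(probs[np.arange(4), sparse_labels]))

print(np.isclose(cce, scce))  # True -- same loss, no one-hot tensor needed
```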
Thanks. Excuse me, have you ever implemented SINet for portrait segmentation? https://github.com/clovaai/ext_portrait_segmentation If so, I want to know how Slim-Net and SINet compare in accuracy. Thanks
No, but I initially tried the eg1800 combined dataset from their repo...
After some experimentation, it seems real-world accuracy depends on having a better dataset (clean and bigger), a bigger model, a bigger input size etc., especially because nowadays there are faster processors to deal with them.
Also, sometimes it doesn't matter much in your specific use-case if there is a 2 or 3 percentage-point difference in test set accuracy (say 95 vs 97), provided you attain the required fps.
Thanks
Anyway it's interesting in the research perspective...
Hello, how are you? Thanks for contributing this project. I tried to train a model with slim-net on the AISegment dataset but met the following issue.
(long dump of raw pixel values from the error message omitted) [[{{node loss/conv2d_transpose_4_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
I think this may be related to the label value range. I used the entire AISegment dataset. A mask image in the AISegment dataset is in PNG format with 4 channels. How should I decode this mask image in the dataloader? Should I load this mask image as grayscale? I found a strange part for loading the mask image in your slim512.ipynb. Thanks
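That SparseSoftmaxCrossEntropyWithLogits error is typically raised when a label value falls outside [0, num_classes). A quick numpy check can confirm the suspicion (a sketch with a synthetic mask; in practice run it over the decoded masks):

```python
import numpy as np

num_classes = 2  # bg and fg for this model

# Stand-in for a decoded mask; a raw AISegment matting mask loaded the
# wrong way can contain values like 255 instead of 0/1
mask = np.array([[0, 1], [1, 255]], dtype=np.uint8)

# Collect any label values the sparse loss would reject
bad = np.unique(mask[mask >= num_classes])
print(bad)  # [255] -> these values would crash the sparse loss
```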