Image with 2 channels - Githubissues

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

http://pjreddie.com/darknet/

Image with 2 channels #1162

Open ycui123 opened 6 years ago

ycui123 commented 6 years ago

Can I use this network with images that contain only 2 channels? I'm working with X-ray images. The first channel is the raw image (16-bit grayscale) and the second channel is a log-transformed image. Would that work? And could you tell me which files I should modify? Thank you!

AlexeyAB commented 6 years ago

The first channel is the raw image (16-bit grayscale) and the second channel is a log-transformed image.

If you can convert these 2 channels (1st 16-bit + 2nd 8-bit) into an 8-bit, 3-channel image (24 bits in total), then just use such images for training and detection as usual.
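One way to read the 16-bit + 8-bit case is a lossless byte split: the high and low bytes of the 16-bit channel fill two 8-bit planes, and the 8-bit channel fills the third. A minimal NumPy sketch under that assumption (the function name `pack_16_plus_8` and the plane order are illustrative choices, not part of darknet):

```python
import numpy as np

def pack_16_plus_8(raw16, second8):
    """Pack one 16-bit and one 8-bit channel into an 8-bit, 3-channel image.

    Hypothetical helper: plane 0 = high byte, plane 1 = low byte of raw16,
    plane 2 = second8. The plane order is an arbitrary assumption.
    """
    if raw16.dtype != np.uint16 or second8.dtype != np.uint8:
        raise TypeError("expected a uint16 channel and a uint8 channel")
    if raw16.shape != second8.shape:
        raise ValueError("both channels must have the same height and width")
    high = (raw16 >> 8).astype(np.uint8)    # upper 8 bits
    low = (raw16 & 0xFF).astype(np.uint8)   # lower 8 bits
    return np.stack([high, low, second8], axis=-1)  # shape (H, W, 3)

raw = np.array([[0x1234, 0xFFFF]], dtype=np.uint16)
log8 = np.array([[7, 9]], dtype=np.uint8)
packed = pack_16_plus_8(raw, log8)
```

If you go this way, save the packed images in a lossless format such as PNG, since lossy compression would alter the low-byte plane.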

If you can't convert them that way, then you will need to change the source code. Look at the changes that were made to support 1-channel 8-bit images: https://github.com/AlexeyAB/darknet/pull/936/files

You should change these functions:

https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/image.c#L936-L954

Also if OpenCV is used:

If OpenCV isn't used:

ycui123 commented 6 years ago

Thanks for the quick reply. I could convert the two channels to 8-bit and zero-pad the 3rd channel. Would that work?

AlexeyAB commented 6 years ago

The first channel is the raw image (16-bit grayscale) and the second channel is a log-transformed image.

Is the 1st channel 16-bit?
How many bits does the 2nd channel (the log-transformed image) have?

ycui123 commented 6 years ago

Yes. And the 2nd channel is also 16-bit, since I transformed it from the first channel.

AlexeyAB commented 6 years ago

So you can convert it to the common 8-bit 3-channel format in any way you want, and it will work:

either convert the two 16-bit channels to two 8-bit channels and set the 3rd channel to all zeros,
or convert the first 16-bit channel to two 8-bit channels and the second 16-bit channel to one 8-bit channel.

Just make sure you use the same type of conversion for both training and detection.
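The zero-filled third channel option could be sketched like this, assuming NumPy arrays and a plain 16-bit to 8-bit reduction that keeps the top 8 bits of each value. The helper names are hypothetical, and a different scaling (for example min-max normalization) may keep more contrast for a given dataset. Whichever you pick, run the same function on both training and detection images:

```python
import numpy as np

def to_8bit(channel16):
    """Reduce a uint16 channel to uint8 by keeping its top 8 bits (assumed scaling)."""
    return (channel16 >> 8).astype(np.uint8)

def two_channels_to_3ch(raw16, log16):
    """Build an 8-bit, 3-channel image: raw, log-transformed, all zeros."""
    if raw16.shape != log16.shape:
        raise ValueError("both channels must have the same height and width")
    zeros = np.zeros(raw16.shape, dtype=np.uint8)  # empty 3rd channel
    return np.stack([to_8bit(raw16), to_8bit(log16), zeros], axis=-1)

raw = np.array([[0, 65535]], dtype=np.uint16)
log = np.array([[256, 32768]], dtype=np.uint16)
image = two_channels_to_3ch(raw, log)  # shape (1, 2, 3), dtype uint8
```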

You should also disable some types of color data augmentation, i.e. set

saturation = 1.0
exposure = 1.5 
hue=0

instead of: https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/cfg/yolov3.cfg#L14-L16

ycui123 commented 6 years ago

Thank you! I'll try and let you know!

ycui123 commented 6 years ago

Hi @AlexeyAB,

or convert two 16-bit channels to the two 8-bit channels, and set all zeros in the 3rd channel.

I used the above method and trained for 8000 iterations with only one class. I found that the model didn't overfit the data as the number of iterations increased.

Here's what I got for 8000 iterations:

for thresh = 0.25, precision = 0.86, recall = 0.65, F1-score = 0.74
for thresh = 0.25, TP = 652, FP = 103, FN = 348, average IoU = 62.57 %
mean average precision (mAP) = 0.677931, or 67.79 %
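The reported precision, recall and F1-score follow from the TP/FP/FN counts. As a quick check using the standard definitions (the function name is just for illustration):

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp)   # share of detections that are correct
    recall = tp / (tp + fn)      # share of ground-truth objects that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

precision, recall, f1 = detection_metrics(tp=652, fp=103, fn=348)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.86, recall = 0.65, F1-score = 0.74
```

With 348 false negatives against 103 false positives, recall is the weaker of the two numbers here.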

I followed all instructions you gave in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

Are there any other ways to improve my performance? I want to lower FP as well as FN as much as possible. Should I train for more iterations?

Thank you

EDITED: My objects are very small (usually within 100×100) and the images are big (around 1200×4000).

popper0912 commented 6 years ago

Can we use the route function to concat the two images? But I don't know how to write this in the .cfg file for the data layer.