loolzaaa / faster-rcnn-pytorch

A PyTorch implementation of Faster R-CNN
MIT License
17 stars 5 forks

How to train a multiple channel image #8

Open lihaolin88 opened 2 years ago

lihaolin88 commented 2 years ago

Hello, thank you for this project! Right now I want to train this network on 2-channel images, but in the code I saw that it only accepts 1- or 3-channel images. So I'm wondering: is it okay for me to add some layers in faster_rcnn.py, or is there some other way to process 2-channel images? Thank you so much!

loolzaaa commented 2 years ago

Hm... In these lines of code in the collate.py file:

    if len(im.shape) == 2:
        im = im[:, :, np.newaxis]
        im = np.concatenate((im, im, im), axis=2)

you can see a check for whether the image DATA has a third dimension. A grayscale image has only two dimensions (the pixel coordinates), but an RGB image has three (two for the pixel coordinates, a third for the channels -> R, G, B).

I suppose your 2-channel image is grayscale + alpha. So your image DATA still has three dimensions (two for the pixel coordinates, a third for the channels -> Gray, Alpha). Or am I wrong?
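
To make the dimension check concrete, here is a minimal sketch that wraps the quoted lines in a helper and runs it on three dummy arrays. The function name `expand_grayscale` and the array sizes are made up for illustration and are not part of the repository.

```python
import numpy as np

def expand_grayscale(im):
    """Hypothetical wrapper around the check quoted from collate.py."""
    if len(im.shape) == 2:
        # 2-D array (no channel axis): add one and repeat it 3 times.
        im = im[:, :, np.newaxis]
        im = np.concatenate((im, im, im), axis=2)
    return im

# Dummy images: grayscale, 2-channel, and RGB.
gray = np.zeros((4, 6), dtype=np.uint8)
two_ch = np.zeros((4, 6, 2), dtype=np.uint8)
rgb = np.zeros((4, 6, 3), dtype=np.uint8)

shapes = [expand_grayscale(a).shape for a in (gray, two_ch, rgb)]
print(shapes)  # [(4, 6, 3), (4, 6, 2), (4, 6, 3)]
```

Only the 2-D array is expanded. The 2-channel array already has three dimensions, so it passes through this check unchanged and still has 2 channels.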

lihaolin88 commented 2 years ago

Thank you for your reply! I generated this two-channel image myself, so it has exactly 2 channels. What I'm doing right now is adding some layers in faster_rcnn.py to transform 2 channels into 3 channels, but it doesn't seem to work correctly.
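
As a point of comparison with adding layers, the 2-to-3 channel conversion could also be done on the image array itself before it reaches the network. This is only a sketch under the assumption that the image is a NumPy array shaped (height, width, 2). The helper name `pad_to_three_channels` is hypothetical, and filling the third channel with zeros is just one possible choice, not something the repository prescribes.

```python
import numpy as np

def pad_to_three_channels(im):
    """Append an all-zero third channel to a (height, width, 2) array."""
    if im.ndim != 3 or im.shape[2] != 2:
        raise ValueError("expected an array of shape (height, width, 2)")
    # Zero channel with the same spatial size and dtype as the input.
    zeros = np.zeros(im.shape[:2] + (1,), dtype=im.dtype)
    return np.concatenate((im, zeros), axis=2)

# Dummy 2-channel image filled with ones.
two_ch = np.ones((4, 6, 2), dtype=np.float32)
padded = pad_to_three_channels(two_ch)
print(padded.shape)  # (4, 6, 3)
```

The original two channels are kept as they are, and the added channel carries no information.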

loolzaaa commented 2 years ago

this two-channel image is generated by myself, so it exactly just have 2 channels

I understand that, but after your image is read as a data array, it must have 3 dimensions.

BEFORE you changed any code, WHERE did you get an error? A concrete link, please.