Beckschen / TransUNet

This repository includes the official project of TransUNet, presented in our paper: TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.
Apache License 2.0
2.38k stars 497 forks

Hello, it seems that the code currently only works on grayscale images. I am interested in processing images with 3 channels (RGB). Has anyone already modified the code accordingly? What do I have to pay attention to? #89

Open lgc-china opened 2 years ago

lgc-china commented 2 years ago

Hello, it seems that the code currently only works on grayscale images. I am interested in processing images with 3 channels (RGB). Has anyone already modified the code accordingly? What do I have to pay attention to?

@andife Hello, this repo also supports RGB images with 3 channels.

The network natively supports 3-channel input (see lines 386-387 in vit_seg_modeling.py): if x.size()[1] == 1: x = x.repeat(1,3,1,1)
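To illustrate what that check does, here is a minimal sketch. The helper name ensure_three_channels is hypothetical and not part of the repository; it only wraps the two quoted lines, and it assumes the input is a 4-D batch laid out as (batch, channels, height, width).

```python
import torch


def ensure_three_channels(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical wrapper around the quoted check (not a function from the repo).

    Assumes x has shape (B, C, H, W). A single-channel batch is copied
    three times along the channel axis; any other input is returned as-is.
    """
    if x.size()[1] == 1:
        x = x.repeat(1, 3, 1, 1)
    return x


gray = torch.zeros(2, 1, 16, 16)  # grayscale batch: one channel
rgb = torch.zeros(2, 3, 16, 16)   # RGB batch: already three channels

print(ensure_three_channels(gray).shape)  # torch.Size([2, 3, 16, 16])
print(ensure_three_channels(rgb).shape)   # torch.Size([2, 3, 16, 16])
```

Under that layout assumption, an RGB batch skips the repeat entirely, so the check itself should not need to change for 3-channel data.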

@Beckschen I'm trying to use this model for RGB images. I removed the random rotations (they seemed buggy for RGB images), and now I get an error on the lines you mentioned (386-387 in vit_seg_modeling.py). The error is as follows: RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor

Originally posted by @aneeshgupta42 in https://github.com/Beckschen/TransUNet/issues/31#issuecomment-825068576
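One possible cause of this error, offered as an assumption rather than a confirmed diagnosis: if an RGB image is stored channels-last as (H, W, 3) and the data pipeline adds a leading axis meant for grayscale images, the batch becomes 5-D, (B, 1, H, W, 3). The check x.size()[1] == 1 then passes, and repeat(1,3,1,1) receives four repeat values for a five-dimensional tensor. The sketch below reproduces that situation and shows a channels-first conversion; to_channels_first is a hypothetical helper, not code from the repository.

```python
import torch


def to_channels_first(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return a (C, H, W) tensor.

    (H, W, 3) RGB input is permuted to (3, H, W);
    (H, W) grayscale input gets a leading channel axis, giving (1, H, W).
    """
    if image.dim() == 3:
        return image.permute(2, 0, 1).contiguous()
    if image.dim() == 2:
        return image.unsqueeze(0)
    raise ValueError(f"expected a 2-D or 3-D image, got {image.dim()} dimensions")


# Assumed failure mode: a channels-last RGB batch with an extra axis at dim 1.
bad_batch = torch.zeros(2, 8, 8, 3).unsqueeze(1)  # shape (2, 1, 8, 8, 3)
try:
    bad_batch.repeat(1, 3, 1, 1)  # four repeat values, five tensor dimensions
except RuntimeError as err:
    print("RuntimeError:", err)

# Converting each image to channels-first before batching gives (B, 3, H, W).
good_batch = torch.stack([to_channels_first(torch.zeros(8, 8, 3)) for _ in range(2)])
print(good_batch.shape)  # torch.Size([2, 3, 8, 8])
```

If this is the cause, any augmentation that assumes a 2-D array, such as the rotations mentioned above, would also need checking for channels-last RGB input.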

lgc-china commented 2 years ago

Can you tell me where you got the dataset? There are many things I don't understand about the data processing. I don't know which dataset to download from this website: https://www.synapse.org/#!Synapse:syn3193805/files/

PatrickWilliams44 commented 2 years ago

Hello, I had the same problem when running test.py. Did you solve it? @lgc-china

xinnvY commented 1 year ago

Hello, it seems that the code currently only works on grayscale images. I am interested in processing images with 3 channels (RGB). Has anyone already modified the code accordingly? What do I have to pay attention to?

@andife Hello, this repo also supports RGB images with 3 channels. The network natively supports 3-channel input (see lines 386-387 in vit_seg_modeling.py): if x.size()[1] == 1: x = x.repeat(1,3,1,1)

@Beckschen I'm trying to use this model for RGB images. I removed the random rotations (they seemed buggy for RGB images), and now I get an error on the lines you mentioned (386-387 in vit_seg_modeling.py). The error is as follows: RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor

Originally posted by @aneeshgupta42 in #31 (comment)

Maybe RGB images don't need to be repeated along the channel dimension, because they already have three channels?