yuhuixu1993 / PC-DARTS

PC-DARTS: Partial Channel Connections for Memory-Efficient Differentiable Architecture Search

PC-DARTS on medical image classification #41

Closed AissamDjahnine closed 4 years ago

AissamDjahnine commented 4 years ago

Hello, I would like to begin by thanking you for this work. As an intern, I am investigating PC-DARTS for my breast cancer diagnosis task (classification of different cancers), and I am trying to find the best possible architecture for it.

Since all your tests were done on natural-image datasets, do you have any comments about using PC-DARTS on medical datasets like this one? Are there hyper-parameters that we should choose carefully?

I ran some tests and I am saturating at 58% accuracy on the validation set. Would end-to-end training tell me whether the cell configuration I got is messy?

Thanks in advance for answering.

yuhuixu1993 commented 4 years ago

Hi, thanks for your interest in our work. I have never done that on medical datasets, but I am willing to help you. You said you ran some tests and got 58% accuracy. Which architecture did you use: the one searched in our paper, one you searched yourself, or is that the supernet accuracy?

AissamDjahnine commented 4 years ago

Thanks for your fast reply. I am still at the architecture search stage. I ran it for 100 epochs and got low validation accuracy over all epochs (fluctuating between 40% and 58%); the same goes for the training accuracy.

I use 180x180 breast cancer patches, with about 3996 images for training and 1330 for validation, so I would say I am in a "lack of data" situation. The other question concerns the architecture learning rate and the main model learning rate: since they are tuned for CIFAR-10 or CIFAR-100, they might not carry over well to a medical dataset.

I have been stuck with this architecture over the last epochs:

genotype = Genotype(normal=[('dil_conv_3x3', 1), ('dil_conv_5x5', 0), ('sep_conv_5x5', 1), ('dil_conv_5x5', 2), ('dil_conv_3x3', 3), ('dil_conv_3x3', 1), ('avg_pool_3x3', 1), ('dil_conv_3x3', 4)], normal_concat=range(2, 6), reduce=[('avg_pool_3x3', 1), ('skip_connect', 0), ('skip_connect', 1), ('dil_conv_5x5', 2), ('skip_connect', 1), ('dil_conv_3x3', 3), ('skip_connect', 1), ('dil_conv_5x5', 4)], reduce_concat=range(2, 6))
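
For readers who want to inspect a genotype like the one above, here is a minimal sketch that tallies the operations chosen in each cell. It assumes the `Genotype` namedtuple with the fields `normal`, `normal_concat`, `reduce`, and `reduce_concat`, as in DARTS-style code; the helper `op_counts` is a hypothetical name introduced only for this illustration.

```python
from collections import Counter, namedtuple

# Assumed to mirror the Genotype definition used in DARTS-style code.
Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

# The genotype reported in this thread, copied verbatim.
genotype = Genotype(
    normal=[('dil_conv_3x3', 1), ('dil_conv_5x5', 0),
            ('sep_conv_5x5', 1), ('dil_conv_5x5', 2),
            ('dil_conv_3x3', 3), ('dil_conv_3x3', 1),
            ('avg_pool_3x3', 1), ('dil_conv_3x3', 4)],
    normal_concat=range(2, 6),
    reduce=[('avg_pool_3x3', 1), ('skip_connect', 0),
            ('skip_connect', 1), ('dil_conv_5x5', 2),
            ('skip_connect', 1), ('dil_conv_3x3', 3),
            ('skip_connect', 1), ('dil_conv_5x5', 4)],
    reduce_concat=range(2, 6),
)


def op_counts(cell):
    """Count how often each operation name appears in a cell (hypothetical helper)."""
    return Counter(op for op, _ in cell)


normal_counts = op_counts(genotype.normal)
reduce_counts = op_counts(genotype.reduce)
print('normal:', dict(normal_counts))
print('reduce:', dict(reduce_counts))
```

For this genotype the tally shows four `dil_conv_3x3` edges among the eight in the normal cell and four `skip_connect` edges among the eight in the reduction cell.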

I would appreciate your help.

AissamDjahnine commented 4 years ago

I also have some comments about memory use. I am running my code on a Quadro RTX 6000 with a batch size of 16, the same initial channels and layers as the main script (16, 8), and 180x180 resolution images as I mentioned before. So I cannot really use a larger batch size or a deeper network.

yuhuixu1993 commented 4 years ago

Hi, first of all, I need to mention that the accuracy of the supernet during the search phase may not be that important (e.g. the validation accuracy on ImageNet is only 32%). You need to evaluate the searched architecture on the whole dataset. Second, as the image size of your dataset is much larger than CIFAR's, you need to first downsample the images to 32x32 using some downsampling layers, just like in the search on ImageNet. I think you may also shorten the number of search epochs. For the learning rate, you should follow the common settings for medical datasets.
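
The size arithmetic behind the downsampling suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes stride-2 3x3 convolutions with padding 1 as the downsampling layers (these values are not given in the thread), and the helpers `conv_out_size` and `stem_sizes` are hypothetical names.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_sizes(size, num_layers):
    """Feature-map side length after each of num_layers assumed stride-2 convs."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(conv_out_size(sizes[-1]))
    return sizes


# 180x180 patches, as reported in this thread.
sizes = stem_sizes(180, 3)
print(sizes)  # [180, 90, 45, 23]

# Pixels per image at 180x180 relative to a 32x32 CIFAR-sized input.
area_ratio = (180 * 180) / (32 * 32)
print(round(area_ratio, 1))  # 31.6
```

With these assumed settings, two layers leave a 45x45 feature map and three leave 23x23, so neither lands exactly on 32x32. Per image, a 180x180 input has roughly 31.6 times as many pixels as a 32x32 one, which is consistent with the memory limits described earlier in the thread.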

AissamDjahnine commented 4 years ago

Hello again, thanks for the reply. I guess the downsampling is a good idea. Another unclear point: you said you got 32% validation accuracy during the search phase but still had good performance after the full training. So are there any signs (besides the loss) that tell me my search phase went well?

Thanks in advance for your answer.